Sample records for vector machine lssvm

  1. A Two-Layer Least Squares Support Vector Machine Approach to Credit Risk Assessment

    NASA Astrophysics Data System (ADS)

    Liu, Jingli; Li, Jianping; Xu, Weixuan; Shi, Yong

    Least squares support vector machine (LS-SVM) is a revised version of support vector machine (SVM) and has been proved to be a useful tool for pattern recognition. LS-SVM had excellent generalization performance and low computational cost. In this paper, we propose a new method called two-layer least squares support vector machine which combines kernel principle component analysis (KPCA) and linear programming form of least square support vector machine. With this method sparseness and robustness is obtained while solving large dimensional and large scale database. A U.S. commercial credit card database is used to test the efficiency of our method and the result proved to be a satisfactory one.

  2. T-wave end detection using neural networks and Support Vector Machines.

    PubMed

    Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román

    2018-05-01

    In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both, Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Identification of Shearer Cutting Patterns Using Vibration Signals Based on a Least Squares Support Vector Machine with an Improved Fruit Fly Optimization Algorithm

    PubMed Central

    Si, Lei; Wang, Zhongbin; Liu, Xinhua; Tan, Chao; Liu, Ze; Xu, Jing

    2016-01-01

    Shearers play an important role in fully mechanized coal mining face and accurately identifying their cutting pattern is very helpful for improving the automation level of shearers and ensuring the safety of coal mining. The least squares support vector machine (LSSVM) has been proven to offer strong potential in prediction and classification issues, particularly by employing an appropriate meta-heuristic algorithm to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. In this paper, an improved fly optimization algorithm (IFOA) to optimize the parameters of LSSVM was presented and the LSSVM coupled with IFOA (IFOA-LSSVM) was used to identify the shearer cutting pattern. The vibration acceleration signals of five cutting patterns were collected and the special state features were extracted based on the ensemble empirical mode decomposition (EEMD) and the kernel function. Some examples on the IFOA-LSSVM model were further presented and the results were compared with LSSVM, PSO-LSSVM, GA-LSSVM and FOA-LSSVM models in detail. The comparison results indicate that the proposed approach was feasible, efficient and outperformed the others. Finally, an industrial application example at the coal mining face was demonstrated to specify the effect of the proposed system. PMID:26771615

  4. Semisupervised learning using Bayesian interpretation: application to LS-SVM.

    PubMed

    Adankon, Mathias M; Cheriet, Mohamed; Biem, Alain

    2011-04-01

    Bayesian reasoning provides an ideal basis for representing and manipulating uncertain knowledge, with the result that many interesting algorithms in machine learning are based on Bayesian inference. In this paper, we use the Bayesian approach with one and two levels of inference to model the semisupervised learning problem and give its application to the successful kernel classifier support vector machine (SVM) and its variant least-squares SVM (LS-SVM). Taking advantage of Bayesian interpretation of LS-SVM, we develop a semisupervised learning algorithm for Bayesian LS-SVM using our approach based on two levels of inference. Experimental results on both artificial and real pattern recognition problems show the utility of our method.

  5. A hybrid least squares support vector machines and GMDH approach for river flow forecasting

    NASA Astrophysics Data System (ADS)

    Samsudin, R.; Saad, P.; Shabri, A.

    2010-06-01

    This paper proposes a novel hybrid forecasting model, which combines the group method of data handling (GMDH) and the least squares support vector machine (LSSVM), known as GLSSVM. The GMDH is used to determine the useful input variables for LSSVM model and the LSSVM model which works as time series forecasting. In this study the application of GLSSVM for monthly river flow forecasting of Selangor and Bernam River are investigated. The results of the proposed GLSSVM approach are compared with the conventional artificial neural network (ANN) models, Autoregressive Integrated Moving Average (ARIMA) model, GMDH and LSSVM models using the long term observations of monthly river flow discharge. The standard statistical, the root mean square error (RMSE) and coefficient of correlation (R) are employed to evaluate the performance of various models developed. Experiment result indicates that the hybrid model was powerful tools to model discharge time series and can be applied successfully in complex hydrological modeling.

  6. Nonlinear temperature compensation of fluxgate magnetometers with a least-squares support vector machine

    NASA Astrophysics Data System (ADS)

    Pang, Hongfeng; Chen, Dixiang; Pan, Mengchun; Luo, Shitu; Zhang, Qi; Luo, Feilu

    2012-02-01

    Fluxgate magnetometers are widely used for magnetic field measurement. However, their accuracy is influenced by temperature. In this paper, a new method was proposed to compensate the temperature drift of fluxgate magnetometers, in which a least-squares support vector machine (LSSVM) is utilized. The compensation performance was analyzed by simulation, which shows that the LSSVM has better performance and less training time than backpropagation and radical basis function neural networks. The temperature characteristics of a DM fluxgate magnetometer were measured with a temperature experiment box. Forty-five measured data under different magnetic fields and temperatures were obtained and divided into 36 training data and nine test data. The training data were used to obtain the parameters of the LSSVM model, and the compensation performance of the LSSVM model was verified by the test data. Experimental results show that the temperature drift of magnetometer is reduced from 109.3 to 3.3 nT after compensation, which suggests that this compensation method is effective for the accuracy improvement of fluxgate magnetometers.

  7. Inline Measurement of Particle Concentrations in Multicomponent Suspensions using Ultrasonic Sensor and Least Squares Support Vector Machines.

    PubMed

    Zhan, Xiaobin; Jiang, Shulan; Yang, Yili; Liang, Jian; Shi, Tielin; Li, Xiwen

    2015-09-18

    This paper proposes an ultrasonic measurement system based on least squares support vector machines (LS-SVM) for inline measurement of particle concentrations in multicomponent suspensions. Firstly, the ultrasonic signals are analyzed and processed, and the optimal feature subset that contributes to the best model performance is selected based on the importance of features. Secondly, the LS-SVM model is tuned, trained and tested with different feature subsets to obtain the optimal model. In addition, a comparison is made between the partial least square (PLS) model and the LS-SVM model. Finally, the optimal LS-SVM model with the optimal feature subset is applied to inline measurement of particle concentrations in the mixing process. The results show that the proposed method is reliable and accurate for inline measuring the particle concentrations in multicomponent suspensions and the measurement accuracy is sufficiently high for industrial application. Furthermore, the proposed method is applicable to the modeling of the nonlinear system dynamically and provides a feasible way to monitor industrial processes.

  8. Prediction of p38 map kinase inhibitory activity of 3, 4-dihydropyrido [3, 2-d] pyrimidone derivatives using an expert system based on principal component analysis and least square support vector machine

    PubMed Central

    Shahlaei, M.; Saghaie, L.

    2014-01-01

    A quantitative structure–activity relationship (QSAR) study is suggested for the prediction of biological activity (pIC50) of 3, 4-dihydropyrido [3,2-d] pyrimidone derivatives as p38 inhibitors. Modeling of the biological activities of compounds of interest as a function of molecular structures was established by means of principal component analysis (PCA) and least square support vector machine (LS-SVM) methods. The results showed that the pIC50 values calculated by LS-SVM are in good agreement with the experimental data, and the performance of the LS-SVM regression model is superior to the PCA-based model. The developed LS-SVM model was applied for the prediction of the biological activities of pyrimidone derivatives, which were not in the modeling procedure. The resulted model showed high prediction ability with root mean square error of prediction of 0.460 for LS-SVM. The study provided a novel and effective approach for predicting biological activities of 3, 4-dihydropyrido [3,2-d] pyrimidone derivatives as p38 inhibitors and disclosed that LS-SVM can be used as a powerful chemometrics tool for QSAR studies. PMID:26339262

  9. Aircraft Engine Thrust Estimator Design Based on GSA-LSSVM

    NASA Astrophysics Data System (ADS)

    Sheng, Hanlin; Zhang, Tianhong

    2017-08-01

    In view of the necessity of highly precise and reliable thrust estimator to achieve direct thrust control of aircraft engine, based on support vector regression (SVR), as well as least square support vector machine (LSSVM) and a new optimization algorithm - gravitational search algorithm (GSA), by performing integrated modelling and parameter optimization, a GSA-LSSVM-based thrust estimator design solution is proposed. The results show that compared to particle swarm optimization (PSO) algorithm, GSA can find unknown optimization parameter better and enables the model developed with better prediction and generalization ability. The model can better predict aircraft engine thrust and thus fulfills the need of direct thrust control of aircraft engine.

  10. Statistical learning algorithms for identifying contrasting tillage practices with landsat thematic mapper data

    USDA-ARS?s Scientific Manuscript database

    Tillage management practices have direct impact on water holding capacity, evaporation, carbon sequestration, and water quality. This study examines the feasibility of two statistical learning algorithms, such as Least Square Support Vector Machine (LSSVM) and Relevance Vector Machine (RVM), for cla...

  11. Discrimination of raw and processed Dipsacus asperoides by near infrared spectroscopy combined with least squares-support vector machine and random forests

    NASA Astrophysics Data System (ADS)

    Xin, Ni; Gu, Xiao-Feng; Wu, Hao; Hu, Yu-Zhu; Yang, Zhong-Lin

    2012-04-01

    Most herbal medicines could be processed to fulfill the different requirements of therapy. The purpose of this study was to discriminate between raw and processed Dipsacus asperoides, a common traditional Chinese medicine, based on their near infrared (NIR) spectra. Least squares-support vector machine (LS-SVM) and random forests (RF) were employed for full-spectrum classification. Three types of kernels, including linear kernel, polynomial kernel and radial basis function kernel (RBF), were checked for optimization of LS-SVM model. For comparison, a linear discriminant analysis (LDA) model was performed for classification, and the successive projections algorithm (SPA) was executed prior to building an LDA model to choose an appropriate subset of wavelengths. The three methods were applied to a dataset containing 40 raw herbs and 40 corresponding processed herbs. We ran 50 runs of 10-fold cross validation to evaluate the model's efficiency. The performance of the LS-SVM with RBF kernel (RBF LS-SVM) was better than the other two kernels. The RF, RBF LS-SVM and SPA-LDA successfully classified all test samples. The mean error rates for the 50 runs of 10-fold cross validation were 1.35% for RBF LS-SVM, 2.87% for RF, and 2.50% for SPA-LDA. The best classification results were obtained by using LS-SVM with RBF kernel, while RF was fast in the training and making predictions.

  12. Prediction of biochar yield from cattle manure pyrolysis via least squares support vector machine intelligent approach.

    PubMed

    Cao, Hongliang; Xin, Ya; Yuan, Qiaoxia

    2016-02-01

    To predict conveniently the biochar yield from cattle manure pyrolysis, intelligent modeling approach was introduced in this research. A traditional artificial neural networks (ANN) model and a novel least squares support vector machine (LS-SVM) model were developed. For the identification and prediction evaluation of the models, a data set with 33 experimental data was used, which were obtained using a laboratory-scale fixed bed reaction system. The results demonstrated that the intelligent modeling approach is greatly convenient and effective for the prediction of the biochar yield. In particular, the novel LS-SVM model has a more satisfying predicting performance and its robustness is better than the traditional ANN model. The introduction and application of the LS-SVM modeling method gives a successful example, which is a good reference for the modeling study of cattle manure pyrolysis process, even other similar processes. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Customer demand prediction of service-oriented manufacturing using the least square support vector machine optimized by particle swarm optimization algorithm

    NASA Astrophysics Data System (ADS)

    Cao, Jin; Jiang, Zhibin; Wang, Kangzhou

    2017-07-01

    Many nonlinear customer satisfaction-related factors significantly influence the future customer demand for service-oriented manufacturing (SOM). To address this issue and enhance the prediction accuracy, this article develops a novel customer demand prediction approach for SOM. The approach combines the phase space reconstruction (PSR) technique with the optimized least square support vector machine (LSSVM). First, the prediction sample space is reconstructed by the PSR to enrich the time-series dynamics of the limited data sample. Then, the generalization and learning ability of the LSSVM are improved by the hybrid polynomial and radial basis function kernel. Finally, the key parameters of the LSSVM are optimized by the particle swarm optimization algorithm. In a real case study, the customer demand prediction of an air conditioner compressor is implemented. Furthermore, the effectiveness and validity of the proposed approach are demonstrated by comparison with other classical predication approaches.

  14. Modelling and Prediction of Spark-ignition Engine Power Performance Using Incremental Least Squares Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Wong, Pak-kin; Vong, Chi-man; Wong, Hang-cheong; Li, Ke

    2010-05-01

    Modern automotive spark-ignition (SI) power performance usually refers to output power and torque, and they are significantly affected by the setup of control parameters in the engine management system (EMS). EMS calibration is done empirically through tests on the dynamometer (dyno) because no exact mathematical engine model is yet available. With an emerging nonlinear function estimation technique of Least squares support vector machines (LS-SVM), the approximate power performance model of a SI engine can be determined by training the sample data acquired from the dyno. A novel incremental algorithm based on typical LS-SVM is also proposed in this paper, so the power performance models built from the incremental LS-SVM can be updated whenever new training data arrives. With updating the models, the model accuracies can be continuously increased. The predicted results using the estimated models from the incremental LS-SVM are good agreement with the actual test results and with the almost same average accuracy of retraining the models from scratch, but the incremental algorithm can significantly shorten the model construction time when new training data arrives.

  15. [Based on the LS-SVM modeling method determination of soil available N and available K by using near-infrared spectroscopy].

    PubMed

    Liu, Xue-Mei; Liu, Jian-She

    2012-11-01

    Visible infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study for measurement accuracy of soil properties,namely, available nitrogen(N) and available potassium(K). Three types of pretreatments including standard normal variate (SNV), multiplicative scattering correction (MSC) and Savitzky-Golay smoothing+first derivative were adopted to eliminate the system noises and external disturbances. Then partial least squares (PLS) and least squares-support vector machine (LS-SVM) models analysis were implemented for calibration models. Simultaneously, the performance of least squares-support vector machine (LS-SVM) models was compared with three kinds of inputs, including PCA(PCs), latent variables (LVs), and effective wavelengths (EWs). The results indicated that all LS-SVM models outperformed PLS models. The performance of the model was evaluated by the correlation coefficient (r2) and RMSEP. The optimal EWs-LS-SVM models were achieved, and the correlation coefficient (r2) and RMSEP were 0.82 and 17.2 for N and 0.72 and 15.0 for K, respectively. The results indicated that visible and short wave-near infrared spectroscopy (Vis/SW-NIRS)(325-1 075 nm) combined with LS-SVM could be utilized as a precision method for the determination of soil properties.

  16. Robust Least-Squares Support Vector Machine With Minimization of Mean and Variance of Modeling Error.

    PubMed

    Lu, Xinjiang; Liu, Wenbo; Zhou, Chuang; Huang, Minghui

    2017-06-13

    The least-squares support vector machine (LS-SVM) is a popular data-driven modeling method and has been successfully applied to a wide range of applications. However, it has some disadvantages, including being ineffective at handling non-Gaussian noise as well as being sensitive to outliers. In this paper, a robust LS-SVM method is proposed and is shown to have more reliable performance when modeling a nonlinear system under conditions where Gaussian or non-Gaussian noise is present. The construction of a new objective function allows for a reduction of the mean of the modeling error as well as the minimization of its variance, and it does not constrain the mean of the modeling error to zero. This differs from the traditional LS-SVM, which uses a worst-case scenario approach in order to minimize the modeling error and constrains the mean of the modeling error to zero. In doing so, the proposed method takes the modeling error distribution information into consideration and is thus less conservative and more robust in regards to random noise. A solving method is then developed in order to determine the optimal parameters for the proposed robust LS-SVM. An additional analysis indicates that the proposed LS-SVM gives a smaller weight to a large-error training sample and a larger weight to a small-error training sample, and is thus more robust than the traditional LS-SVM. The effectiveness of the proposed robust LS-SVM is demonstrated using both artificial and real life cases.

  17. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution

    NASA Astrophysics Data System (ADS)

    Kisi, Ozgur; Parmar, Kulwinder Singh

    2016-03-01

    This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.

  18. Data on Support Vector Machines (SVM) model to forecast photovoltaic power.

    PubMed

    Malvoni, M; De Giorgi, M G; Congedo, P M

    2016-12-01

    The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled "Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data" (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1,3,6,12, 24 ahead hours and for different data reduction sizes are provided in Supplementary material.

  19. Kennard-Stone combined with least square support vector machine method for noncontact discriminating human blood species

    NASA Astrophysics Data System (ADS)

    Zhang, Linna; Li, Gang; Sun, Meixiu; Li, Hongxiao; Wang, Zhennan; Li, Yingxin; Lin, Ling

    2017-11-01

    Identifying whole bloods to be either human or nonhuman is an important responsibility for import-export ports and inspection and quarantine departments. Analytical methods and DNA testing methods are usually destructive. Previous studies demonstrated that visible diffuse reflectance spectroscopy method can realize noncontact human and nonhuman blood discrimination. An appropriate method for calibration set selection was very important for a robust quantitative model. In this paper, Random Selection (RS) method and Kennard-Stone (KS) method was applied in selecting samples for calibration set. Moreover, proper stoichiometry method can be greatly beneficial for improving the performance of classification model or quantification model. Partial Least Square Discrimination Analysis (PLSDA) method was commonly used in identification of blood species with spectroscopy methods. Least Square Support Vector Machine (LSSVM) was proved to be perfect for discrimination analysis. In this research, PLSDA method and LSSVM method was used for human blood discrimination. Compared with the results of PLSDA method, this method could enhance the performance of identified models. The overall results convinced that LSSVM method was more feasible for identifying human and animal blood species, and sufficiently demonstrated LSSVM method was a reliable and robust method for human blood identification, and can be more effective and accurate.

  20. An improved conjugate gradient scheme to the solution of least squares SVM.

    PubMed

    Chu, Wei; Ong, Chong Jin; Keerthi, S Sathiya

    2005-03-01

    The least square support vector machines (LS-SVM) formulation corresponds to the solution of a linear system of equations. Several approaches to its numerical solutions have been proposed in the literature. In this letter, we propose an improved method to the numerical solution of LS-SVM and show that the problem can be solved using one reduced system of linear equations. Compared with the existing algorithm for LS-SVM, the approach used in this letter is about twice as efficient. Numerical results using the proposed method are provided for comparisons with other existing algorithms.

  1. The Application of Auto-Disturbance Rejection Control Optimized by Least Squares Support Vector Machines Method and Time-Frequency Representation in Voltage Source Converter-High Voltage Direct Current System.

    PubMed

    Liu, Ying-Pei; Liang, Hai-Ping; Gao, Zhong-Ke

    2015-01-01

    In order to improve the performance of voltage source converter-high voltage direct current (VSC-HVDC) system, we propose an improved auto-disturbance rejection control (ADRC) method based on least squares support vector machines (LSSVM) in the rectifier side. Firstly, we deduce the high frequency transient mathematical model of VSC-HVDC system. Then we investigate the ADRC and LSSVM principles. We ignore the tracking differentiator in the ADRC controller aiming to improve the system dynamic response speed. On this basis, we derive the mathematical model of ADRC controller optimized by LSSVM for direct current voltage loop. Finally we carry out simulations to verify the feasibility and effectiveness of our proposed control method. In addition, we employ the time-frequency representation methods, i.e., Wigner-Ville distribution (WVD) and adaptive optimal kernel (AOK) time-frequency representation, to demonstrate our proposed method performs better than the traditional method from the perspective of energy distribution in time and frequency plane.

  2. The Application of Auto-Disturbance Rejection Control Optimized by Least Squares Support Vector Machines Method and Time-Frequency Representation in Voltage Source Converter-High Voltage Direct Current System

    PubMed Central

    Gao, Zhong-Ke

    2015-01-01

    In order to improve the performance of voltage source converter-high voltage direct current (VSC-HVDC) system, we propose an improved auto-disturbance rejection control (ADRC) method based on least squares support vector machines (LSSVM) in the rectifier side. Firstly, we deduce the high frequency transient mathematical model of VSC-HVDC system. Then we investigate the ADRC and LSSVM principles. We ignore the tracking differentiator in the ADRC controller aiming to improve the system dynamic response speed. On this basis, we derive the mathematical model of ADRC controller optimized by LSSVM for direct current voltage loop. Finally we carry out simulations to verify the feasibility and effectiveness of our proposed control method. In addition, we employ the time-frequency representation methods, i.e., Wigner-Ville distribution (WVD) and adaptive optimal kernel (AOK) time-frequency representation, to demonstrate our proposed method performs better than the traditional method from the perspective of energy distribution in time and frequency plane. PMID:26098556

  3. Budget Online Learning Algorithm for Least Squares SVM.

    PubMed

    Jian, Ling; Shen, Shuqian; Li, Jundong; Liang, Xijun; Li, Lei

    2017-09-01

    Batch-mode least squares support vector machine (LSSVM) is often associated with unbounded number of support vectors (SVs'), making it unsuitable for applications involving large-scale streaming data. Limited-scale LSSVM, which allows efficient updating, seems to be a good solution to tackle this issue. In this paper, to train the limited-scale LSSVM dynamically, we present a budget online LSSVM (BOLSSVM) algorithm. Methodologically, by setting a fixed budget for SVs', we are able to update the LSSVM model according to the updated SVs' set dynamically without retraining from scratch. In particular, when a new small chunk of SVs' substitute for the old ones, the proposed algorithm employs a low rank correction technology and the Sherman-Morrison-Woodbury formula to compute the inverse of saddle point matrix derived from the LSSVM's Karush-Kuhn-Tucker (KKT) system, which, in turn, updates the LSSVM model efficiently. In this way, the proposed BOLSSVM algorithm is especially useful for online prediction tasks. Another merit of the proposed BOLSSVM is that it can be used for k -fold cross validation. Specifically, compared with batch-mode learning methods, the computational complexity of the proposed BOLSSVM method is significantly reduced from O(n 4 ) to O(n 3 ) for leave-one-out cross validation with n training samples. The experimental results of classification and regression on benchmark data sets and real-world applications show the validity and effectiveness of the proposed BOLSSVM algorithm.

  4. Degradation trend estimation of slewing bearing based on LSSVM model

    NASA Astrophysics Data System (ADS)

    Lu, Chao; Chen, Jie; Hong, Rongjing; Feng, Yang; Li, Yuanyuan

    2016-08-01

    A novel prediction method is proposed based on least squares support vector machine (LSSVM) to estimate the slewing bearing's degradation trend with small sample data. This method chooses the vibration signal which contains rich state information as the object of the study. Principal component analysis (PCA) was applied to fuse multi-feature vectors which could reflect the health state of slewing bearing, such as root mean square, kurtosis, wavelet energy entropy, and intrinsic mode function (IMF) energy. The degradation indicator fused by PCA can reflect the degradation more comprehensively and effectively. Then the degradation trend of slewing bearing was predicted by using the LSSVM model optimized by particle swarm optimization (PSO). The proposed method was demonstrated to be more accurate and effective by the whole life experiment of slewing bearing. Therefore, it can be applied in engineering practice.

  5. Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets.

    PubMed

    Nan, Shengyu; Sun, Lei; Chen, Badong; Lin, Zhiping; Toh, Kar-Ann

    2017-01-01

    Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nyström method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.

  6. Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

    PubMed

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-06-28

    A multilayer feed-forward artificial neural network (MLP-ANN) with a single, hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. So, QC + SVM methodology is an alternative to QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol(-1) (1 kcal mol(-1) = 4.184 kJ mol(-1)) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is superior by almost an order of magnitude over the ANN method in terms of the stability, generality, and robustness of the final model. The LS-SVM model needs a much smaller numbers of samples (a much smaller sample set) to make accurate prediction results. Potential energy surface (PES) approximations for molecular dynamics (MD) studies are discussed as a promising application for the LS-SVM calibration approach. This journal is © the Owner Societies 2011

  7. Working set selection using functional gain for LS-SVM.

    PubMed

    Bo, Liefeng; Jiao, Licheng; Wang, Ling

    2007-09-01

    The efficiency of sequential minimal optimization (SMO) depends strongly on the working set selection. This letter shows how the improvement of SMO in each iteration, named the functional gain (FG), is used to select the working set for least squares support vector machine (LS-SVM). We prove the convergence of the proposed method and give some theoretical support for its performance. Empirical comparisons demonstrate that our method is superior to the maximum violating pair (MVP) working set selection.

  8. Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction

    PubMed Central

    Cruz-Cano, Raul; Chew, David S.H.; Kwok-Pui, Choi; Ming-Ying, Leung

    2010-01-01

    Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications. PMID:20729987

  9. Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction.

    PubMed

    Cruz-Cano, Raul; Chew, David S H; Kwok-Pui, Choi; Ming-Ying, Leung

    2010-06-01

    Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications.

  10. A Temperature Compensation Method for Piezo-Resistive Pressure Sensor Utilizing Chaotic Ions Motion Algorithm Optimized Hybrid Kernel LSSVM.

    PubMed

    Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir

    2016-10-14

    A piezo-resistive pressure sensor is made of silicon, the nature of which is considerably influenced by ambient temperature. The effect of temperature should be eliminated during the working period in expectation of linear output. To deal with this issue, an approach consists of a hybrid kernel Least Squares Support Vector Machine (LSSVM) optimized by a chaotic ions motion algorithm presented. To achieve the learning and generalization for excellent performance, a hybrid kernel function, constructed by a local kernel as Radial Basis Function (RBF) kernel, and a global kernel as polynomial kernel is incorporated into the Least Squares Support Vector Machine. The chaotic ions motion algorithm is introduced to find the best hyper-parameters of the Least Squares Support Vector Machine. The temperature data from a calibration experiment is conducted to validate the proposed method. With attention on algorithm robustness and engineering applications, the compensation result shows the proposed scheme outperforms other compared methods on several performance measures as maximum absolute relative error, minimum absolute relative error mean and variance of the averaged value on fifty runs. Furthermore, the proposed temperature compensation approach lays a foundation for more extensive research.

  11. Eddy current characterization of small cracks using least square support vector machine

    NASA Astrophysics Data System (ADS)

    Chelabi, M.; Hacib, T.; Le Bihan, Y.; Ikhlef, N.; Boughedda, H.; Mekideche, M. R.

    2016-04-01

    Eddy current (EC) sensors are used for non-destructive testing since they are able to probe conductive materials. Despite being a conventional technique for defect detection and localization, the main weakness of this technique is that defect characterization, of the exact determination of the shape and dimension, is still a question to be answered. In this work, we demonstrate the capability of small crack sizing using signals acquired from an EC sensor. We report our effort to develop a systematic approach to estimate the size of rectangular and thin defects (length and depth) in a conductive plate. The achieved approach by the novel combination of a finite element method (FEM) with a statistical learning method is called least square support vector machines (LS-SVM). First, we use the FEM to design the forward problem. Next, an algorithm is used to find an adaptive database. Finally, the LS-SVM is used to solve the inverse problems, creating polynomial functions able to approximate the correlation between the crack dimension and the signal picked up from the EC sensor. Several methods are used to find the parameters of the LS-SVM. In this study, the particle swarm optimization (PSO) and genetic algorithm (GA) are proposed for tuning the LS-SVM. The results of the design and the inversions were compared to both simulated and experimental data, with accuracy experimentally verified. These suggested results prove the applicability of the presented approach.

  12. An improved CS-LSSVM algorithm-based fault pattern recognition of ship power equipments.

    PubMed

    Yang, Yifei; Tan, Minjia; Dai, Yuewei

    2017-01-01

    A ship power equipments' fault monitoring signal usually provides few samples and the data's feature is non-linear in practical situation. This paper adopts the method of the least squares support vector machine (LSSVM) to deal with the problem of fault pattern identification in the case of small sample data. Meanwhile, in order to avoid involving a local extremum and poor convergence precision which are induced by optimizing the kernel function parameter and penalty factor of LSSVM, an improved Cuckoo Search (CS) algorithm is proposed for the purpose of parameter optimization. Based on the dynamic adaptive strategy, the newly proposed algorithm improves the recognition probability and the searching step length, which can effectively solve the problems of slow searching speed and low calculation accuracy of the CS algorithm. A benchmark example demonstrates that the CS-LSSVM algorithm can accurately and effectively identify the fault pattern types of ship power equipments.

  13. Prediction of pH of cola beverage using Vis/NIR spectroscopy and least squares-support vector machine

    NASA Astrophysics Data System (ADS)

    Liu, Fei; He, Yong

    2008-02-01

    Visible and near infrared (Vis/NIR) transmission spectroscopy and chemometric methods were utilized to predict the pH values of cola beverages. Five varieties of cola were prepared and 225 samples (45 samples for each variety) were selected for the calibration set, while 75 samples (15 samples for each variety) for the validation set. The smoothing way of Savitzky-Golay and standard normal variate (SNV) followed by first-derivative were used as the pre-processing methods. Partial least squares (PLS) analysis was employed to extract the principal components (PCs) which were used as the inputs of least squares-support vector machine (LS-SVM) model according to their accumulative reliabilities. Then LS-SVM with radial basis function (RBF) kernel function and a two-step grid search technique were applied to build the regression model with a comparison of PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias were 0.961, 0.040 and 0.012 for PLS, while 0.975, 0.031 and 4.697x10 -3 for LS-SVM, respectively. Both methods obtained a satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be applied as an alternative way for the prediction of pH of cola beverages.

  14. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    NASA Astrophysics Data System (ADS)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T) are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected consisted of daily measured of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.

  15. Spectrophotometric determination of ternary mixtures of thiamin, riboflavin and pyridoxal in pharmaceutical and human plasma by least-squares support vector machines.

    PubMed

    Niazi, Ali; Zolgharnein, Javad; Afiuni-Zadeh, Somaie

    2007-11-01

    Ternary mixtures of thiamin, riboflavin and pyridoxal have been simultaneously determined in synthetic and real samples by applications of spectrophotometric and least-squares support vector machines. The calibration graphs were linear in the ranges of 1.0 - 20.0, 1.0 - 10.0 and 1.0 - 20.0 microg ml(-1) with detection limits of 0.6, 0.5 and 0.7 microg ml(-1) for thiamin, riboflavin and pyridoxal, respectively. The experimental calibration matrix was designed with 21 mixtures of these chemicals. The concentrations were varied between calibration graph concentrations of vitamins. The simultaneous determination of these vitamin mixtures by using spectrophotometric methods is a difficult problem, due to spectral interferences. The partial least squares (PLS) modeling and least-squares support vector machines were used for the multivariate calibration of the spectrophotometric data. An excellent model was built using LS-SVM, with low prediction errors and superior performance in relation to PLS. The root mean square errors of prediction (RMSEP) for thiamin, riboflavin and pyridoxal with PLS and LS-SVM were 0.6926, 0.3755, 0.4322 and 0.0421, 0.0318, 0.0457, respectively. The proposed method was satisfactorily applied to the rapid simultaneous determination of thiamin, riboflavin and pyridoxal in commercial pharmaceutical preparations and human plasma samples.

  16. Design of a multiple kernel learning algorithm for LS-SVM by convex programming.

    PubMed

    Jian, Ling; Xia, Zhonghang; Liang, Xijun; Gao, Chuanhou

    2011-06-01

    As a kernel based method, the performance of least squares support vector machine (LS-SVM) depends on the selection of the kernel as well as the regularization parameter (Duan, Keerthi, & Poo, 2003). Cross-validation is efficient in selecting a single kernel and the regularization parameter; however, it suffers from heavy computational cost and is not flexible to deal with multiple kernels. In this paper, we address the issue of multiple kernel learning for LS-SVM by formulating it as semidefinite programming (SDP). Furthermore, we show that the regularization parameter can be optimized in a unified framework with the kernel, which leads to an automatic process for model selection. Extensive experimental validations are performed and analyzed. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

    NASA Astrophysics Data System (ADS)

    Khawaja, Taimoor Saleem

    A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.

  18. Identification of spilled oils by NIR spectroscopy technology based on KPCA and LSSVM

    NASA Astrophysics Data System (ADS)

    Tan, Ailing; Bi, Weihong

    2011-08-01

    Oil spills on the sea surface are seen relatively often with the development of the petroleum exploitation and transportation of the sea. Oil spills are great threat to the marine environment and the ecosystem, thus the oil pollution in the ocean becomes an urgent topic in the environmental protection. To develop the oil spill accident treatment program and track the source of the spilled oils, a novel qualitative identification method combined Kernel Principal Component Analysis (KPCA) and Least Square Support Vector Machine (LSSVM) was proposed. The proposed method adapt Fourier transform NIR spectrophotometer to collect the NIR spectral data of simulated gasoline, diesel fuel and kerosene oil spills samples and do some pretreatments to the original spectrum. We use the KPCA algorithm which is an extension of Principal Component Analysis (PCA) using techniques of kernel methods to extract nonlinear features of the preprocessed spectrum. Support Vector Machines (SVM) is a powerful methodology for solving spectral classification tasks in chemometrics. LSSVM are reformulations to the standard SVMs which lead to solving a system of linear equations. So a LSSVM multiclass classification model was designed which using Error Correcting Output Code (ECOC) method borrowing the idea of error correcting codes used for correcting bit errors in transmission channels. The most common and reliable approach to parameter selection is to decide on parameter ranges, and to then do a grid search over the parameter space to find the optimal model parameters. To test the proposed method, 375 spilled oil samples of unknown type were selected to study. The optimal model has the best identification capabilities with the accuracy of 97.8%. Experimental results show that the proposed KPCA plus LSSVM qualitative analysis method of near infrared spectroscopy has good recognition result, which could work as a new method for rapid identification of spilled oils.

  19. Novel Hybrid of LS-SVM and Kalman Filter for GPS/INS Integration

    NASA Astrophysics Data System (ADS)

    Xu, Zhenkai; Li, Yong; Rizos, Chris; Xu, Xiaosu

    Integration of Global Positioning System (GPS) and Inertial Navigation System (INS) technologies can overcome the drawbacks of the individual systems. One of the advantages is that the integrated solution can provide continuous navigation capability even during GPS outages. However, bridging the GPS outages is still a challenge when Micro-Electro-Mechanical System (MEMS) inertial sensors are used. Methods being currently explored by the research community include applying vehicle motion constraints, optimal smoother, and artificial intelligence (AI) techniques. In the research area of AI, the neural network (NN) approach has been extensively utilised up to the present. In an NN-based integrated system, a Kalman filter (KF) estimates position, velocity and attitude errors, as well as the inertial sensor errors, to output navigation solutions while GPS signals are available. At the same time, an NN is trained to map the vehicle dynamics with corresponding KF states, and to correct INS measurements when GPS measurements are unavailable. To achieve good performance it is critical to select suitable quality and an optimal number of samples for the NN. This is sometimes too rigorous a requirement which limits real world application of NN-based methods.The support vector machine (SVM) approach is based on the structural risk minimisation principle, instead of the minimised empirical error principle that is commonly implemented in an NN. The SVM can avoid local minimisation and over-fitting problems in an NN, and therefore potentially can achieve a higher level of global performance. This paper focuses on the least squares support vector machine (LS-SVM), which can solve highly nonlinear and noisy black-box modelling problems. This paper explores the application of the LS-SVM to aid the GPS/INS integrated system, especially during GPS outages. The paper describes the principles of the LS-SVM and of the KF hybrid method, and introduces the LS-SVM regression algorithm. Field test data is processed to evaluate the performance of the proposed approach.

  20. Comparative Analysis of River Flow Modelling by Using Supervised Learning Technique

    NASA Astrophysics Data System (ADS)

    Ismail, Shuhaida; Mohamad Pandiahi, Siraj; Shabri, Ani; Mustapha, Aida

    2018-04-01

    The goal of this research is to investigate the efficiency of three supervised learning algorithms for forecasting monthly river flow of the Indus River in Pakistan, spread over 550 square miles or 1800 square kilometres. The algorithms include the Least Square Support Vector Machine (LSSVM), Artificial Neural Network (ANN) and Wavelet Regression (WR). The forecasting models predict the monthly river flow obtained from the three models individually for river flow data and the accuracy of the all models were then compared against each other. The monthly river flow of the said river has been forecasted using these three models. The obtained results were compared and statistically analysed. Then, the results of this analytical comparison showed that LSSVM model is more precise in the monthly river flow forecasting. It was found that LSSVM has he higher r with the value of 0.934 compared to other models. This indicate that LSSVM is more accurate and efficient as compared to the ANN and WR model.

  1. The application of continuous wavelet transform and least squares support vector machine for the simultaneous quantitative spectrophotometric determination of Myricetin, Kaempferol and Quercetin as flavonoids in pharmaceutical plants

    NASA Astrophysics Data System (ADS)

    Sohrabi, Mahmoud Reza; Darabi, Golnaz

    2016-01-01

    Flavonoids are γ-benzopyrone derivatives, which are highly regarded in these researchers for their antioxidant property. In this study, two new signals processing methods been coupled with UV spectroscopy for spectral resolution and simultaneous quantitative determination of Myricetin, Kaempferol and Quercetin as flavonoids in Laurel, St. John's Wort and Green Tea without the need for any previous separation procedure. The developed methods are continuous wavelet transform (CWT) and least squares support vector machine (LS-SVM) methods integrated with UV spectroscopy individually. Different wavelet families were tested by CWT method and finally the Daubechies wavelet family (Db4) for Myricetin and the Gaussian wavelet families for Kaempferol (Gaus3) and Quercetin (Gaus7) were selected and applied for simultaneous analysis under the optimal conditions. The LS-SVM was applied to build the flavonoids prediction model based on absorption spectra. The root mean square errors for prediction (RMSEP) of Myricetin, Kaempferol and Quercetin were 0.0552, 0.0275 and 0.0374, respectively. The developed methods were validated by the analysis of the various synthetic mixtures associated with a well- known flavonoid contents. Mean recovery values of Myricetin, Kaempferol and Quercetin, in CWT method were 100.123, 100.253, 100.439 and in LS-SVM method were 99.94, 99.81 and 99.682, respectively. The results achieved by analyzing the real samples from the CWT and LS-SVM methods were compared to the HPLC reference method and the results were very close to the reference method. Meanwhile, the obtained results of the one-way ANOVA (analysis of variance) test revealed that there was no significant difference between the suggested methods.

  2. The application of continuous wavelet transform and least squares support vector machine for the simultaneous quantitative spectrophotometric determination of Myricetin, Kaempferol and Quercetin as flavonoids in pharmaceutical plants.

    PubMed

    Sohrabi, Mahmoud Reza; Darabi, Golnaz

    2016-01-05

    Flavonoids are γ-benzopyrone derivatives, which are highly regarded in these researchers for their antioxidant property. In this study, two new signals processing methods been coupled with UV spectroscopy for spectral resolution and simultaneous quantitative determination of Myricetin, Kaempferol and Quercetin as flavonoids in Laurel, St. John's Wort and Green Tea without the need for any previous separation procedure. The developed methods are continuous wavelet transform (CWT) and least squares support vector machine (LS-SVM) methods integrated with UV spectroscopy individually. Different wavelet families were tested by CWT method and finally the Daubechies wavelet family (Db4) for Myricetin and the Gaussian wavelet families for Kaempferol (Gaus3) and Quercetin (Gaus7) were selected and applied for simultaneous analysis under the optimal conditions. The LS-SVM was applied to build the flavonoids prediction model based on absorption spectra. The root mean square errors for prediction (RMSEP) of Myricetin, Kaempferol and Quercetin were 0.0552, 0.0275 and 0.0374, respectively. The developed methods were validated by the analysis of the various synthetic mixtures associated with a well- known flavonoid contents. Mean recovery values of Myricetin, Kaempferol and Quercetin, in CWT method were 100.123, 100.253, 100.439 and in LS-SVM method were 99.94, 99.81 and 99.682, respectively. The results achieved by analyzing the real samples from the CWT and LS-SVM methods were compared to the HPLC reference method and the results were very close to the reference method. Meanwhile, the obtained results of the one-way ANOVA (analysis of variance) test revealed that there was no significant difference between the suggested methods. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Computer-aided classification of optical images for diagnosis of osteoarthritis in the finger joints.

    PubMed

    Zhang, Jiang; Wang, James Z; Yuan, Zhen; Sobel, Eric S; Jiang, Huabei

    2011-01-01

    This study presents a computer-aided classification method to distinguish osteoarthritis finger joints from healthy ones based on the functional images captured by x-ray guided diffuse optical tomography. Three imaging features, joint space width, optical absorption, and scattering coefficients, are employed to train a Least Squares Support Vector Machine (LS-SVM) classifier for osteoarthritis classification. The 10-fold validation results show that all osteoarthritis joints are clearly identified and all healthy joints are ruled out by the LS-SVM classifier. The best sensitivity, specificity, and overall accuracy of the classification by experienced technicians based on manual calculation of optical properties and visual examination of optical images are only 85%, 93%, and 90%, respectively. Therefore, our LS-SVM based computer-aided classification is a considerably improved method for osteoarthritis diagnosis.

  4. Study on for soluble solids contents measurement of grape juice beverage based on Vis/NIRS and chemomtrics

    NASA Astrophysics Data System (ADS)

    Wu, Di; He, Yong

    2007-11-01

    The aim of this study is to investigate the potential of the visible and near infrared spectroscopy (Vis/NIRS) technique for non-destructive measurement of soluble solids contents (SSC) in grape juice beverage. 380 samples were studied in this paper. Smoothing way of Savitzky-Golay and standard normal variate were applied for the pre-processing of spectral data. Least-squares support vector machines (LS-SVM) with RBF kernel function was applied to developing the SSC prediction model based on the Vis/NIRS absorbance data. The determination coefficient for prediction (Rp2) of the results predicted by LS-SVM model was 0. 962 and root mean square error (RMSEP) was 0. 434137. It is concluded that Vis/NIRS technique can quantify the SSC of grape juice beverage fast and non-destructively.. At the same time, LS-SVM model was compared with PLS and back propagation neural network (BP-NN) methods. The results showed that LS-SVM was superior to the conventional linear and non-linear methods in predicting SSC of grape juice beverage. In this study, the generation ability of LS-SVM, PLS and BP-NN models were also investigated. It is concluded that LS-SVM regression method is a promising technique for chemometrics in quantitative prediction.

  5. Dissolved oxygen content prediction in crab culture using a hybrid intelligent method

    PubMed Central

    Yu, Huihui; Chen, Yingyi; Hassan, ShahbazGul; Li, Daoliang

    2016-01-01

    A precise predictive model is needed to obtain a clear understanding of the changing dissolved oxygen content in outdoor crab ponds, to assess how to reduce risk and to optimize water quality management. The uncertainties in the data from multiple sensors are a significant factor when building a dissolved oxygen content prediction model. To increase prediction accuracy, a new hybrid dissolved oxygen content forecasting model based on the radial basis function neural networks (RBFNN) data fusion method and a least squares support vector machine (LSSVM) with an optimal improved particle swarm optimization(IPSO) is developed. In the modelling process, the RBFNN data fusion method is used to improve information accuracy and provide more trustworthy training samples for the IPSO-LSSVM prediction model. The LSSVM is a powerful tool for achieving nonlinear dissolved oxygen content forecasting. In addition, an improved particle swarm optimization algorithm is developed to determine the optimal parameters for the LSSVM with high accuracy and generalizability. In this study, the comparison of the prediction results of different traditional models validates the effectiveness and accuracy of the proposed hybrid RBFNN-IPSO-LSSVM model for dissolved oxygen content prediction in outdoor crab ponds. PMID:27270206

  6. Dissolved oxygen content prediction in crab culture using a hybrid intelligent method.

    PubMed

    Yu, Huihui; Chen, Yingyi; Hassan, ShahbazGul; Li, Daoliang

    2016-06-08

    A precise predictive model is needed to obtain a clear understanding of the changing dissolved oxygen content in outdoor crab ponds, to assess how to reduce risk and to optimize water quality management. The uncertainties in the data from multiple sensors are a significant factor when building a dissolved oxygen content prediction model. To increase prediction accuracy, a new hybrid dissolved oxygen content forecasting model based on the radial basis function neural networks (RBFNN) data fusion method and a least squares support vector machine (LSSVM) with an optimal improved particle swarm optimization(IPSO) is developed. In the modelling process, the RBFNN data fusion method is used to improve information accuracy and provide more trustworthy training samples for the IPSO-LSSVM prediction model. The LSSVM is a powerful tool for achieving nonlinear dissolved oxygen content forecasting. In addition, an improved particle swarm optimization algorithm is developed to determine the optimal parameters for the LSSVM with high accuracy and generalizability. In this study, the comparison of the prediction results of different traditional models validates the effectiveness and accuracy of the proposed hybrid RBFNN-IPSO-LSSVM model for dissolved oxygen content prediction in outdoor crab ponds.

  7. [Measurement of soil organic matter and available K based on SPA-LS-SVM].

    PubMed

    Zhang, Hai-Liang; Liu, Xue-Mei; He, Yong

    2014-05-01

    Visible and short wave infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study for measurement of soil organic matter (OM) and available potassium (K). Four types of pretreatments including smoothing, SNV, MSC and SG smoothing+first derivative were adopted to eliminate the system noises and external disturbances. Then partial least squares regression (PLSR) and least squares-support vector machine (LS-SVM) models were implemented for calibration models. The LS-SVM model was built by using characteristic wavelength based on successive projections algorithm (SPA). Simultaneously, the performance of LSSVM models was compared with PLSR models. The results indicated that LS-SVM models using characteristic wavelength as inputs based on SPA outperformed PLSR models. The optimal SPA-LS-SVM models were achieved, and the correlation coefficient (r), and RMSEP were 0. 860 2 and 2. 98 for OM and 0. 730 5 and 15. 78 for K, respectively. The results indicated that visible and short wave near infrared spectroscopy (Vis/SW-NIRS) (325 approximately 1 075 nm) combined with LS-SVM based on SPA could be utilized as a precision method for the determination of soil properties.

  8. An effective parameter optimization technique for vibration flow field characterization of PP melts via LS-SVM combined with SALS in an electromagnetism dynamic extruder

    NASA Astrophysics Data System (ADS)

    Xian, Guangming

    2018-03-01

    A method for predicting the optimal vibration field parameters by least square support vector machine (LS-SVM) is presented in this paper. One convenient and commonly used technique for characterizing the the vibration flow field of polymer melts films is small angle light scattering (SALS) in a visualized slit die of the electromagnetism dynamic extruder. The optimal value of vibration vibration frequency, vibration amplitude, and the maximum light intensity projection area can be obtained by using LS-SVM for prediction. For illustrating this method and show its validity, the flowing material is used with polypropylene (PP) and fifteen samples are tested at the rotation speed of screw at 36rpm. This paper first describes the apparatus of SALS to perform the experiments, then gives the theoretical basis of this new method, and detail the experimental results for parameter prediction of vibration flow field. It is demonstrated that it is possible to use the method of SALS and obtain detailed information on optimal parameter of vibration flow field of PP melts by LS-SVM.

  9. Lamb Wave Damage Quantification Using GA-Based LS-SVM.

    PubMed

    Sun, Fuqiang; Wang, Ning; He, Jingjing; Guan, Xuefei; Yang, Jinsong

    2017-06-12

    Lamb waves have been reported to be an efficient tool for non-destructive evaluations (NDE) for various application scenarios. However, accurate and reliable damage quantification using the Lamb wave method is still a practical challenge, due to the complex underlying mechanism of Lamb wave propagation and damage detection. This paper presents a Lamb wave damage quantification method using a least square support vector machine (LS-SVM) and a genetic algorithm (GA). Three damage sensitive features, namely, normalized amplitude, phase change, and correlation coefficient, were proposed to describe changes of Lamb wave characteristics caused by damage. In view of commonly used data-driven methods, the GA-based LS-SVM model using the proposed three damage sensitive features was implemented to evaluate the crack size. The GA method was adopted to optimize the model parameters. The results of GA-based LS-SVM were validated using coupon test data and lap joint component test data with naturally developed fatigue cracks. Cases of different loading and manufacturer were also included to further verify the robustness of the proposed method for crack quantification.

  10. Lamb Wave Damage Quantification Using GA-Based LS-SVM

    PubMed Central

    Sun, Fuqiang; Wang, Ning; He, Jingjing; Guan, Xuefei; Yang, Jinsong

    2017-01-01

    Lamb waves have been reported to be an efficient tool for non-destructive evaluations (NDE) for various application scenarios. However, accurate and reliable damage quantification using the Lamb wave method is still a practical challenge, due to the complex underlying mechanism of Lamb wave propagation and damage detection. This paper presents a Lamb wave damage quantification method using a least square support vector machine (LS-SVM) and a genetic algorithm (GA). Three damage sensitive features, namely, normalized amplitude, phase change, and correlation coefficient, were proposed to describe changes of Lamb wave characteristics caused by damage. In view of commonly used data-driven methods, the GA-based LS-SVM model using the proposed three damage sensitive features was implemented to evaluate the crack size. The GA method was adopted to optimize the model parameters. The results of GA-based LS-SVM were validated using coupon test data and lap joint component test data with naturally developed fatigue cracks. Cases of different loading and manufacturer were also included to further verify the robustness of the proposed method for crack quantification. PMID:28773003

  11. [Determination of calcium and magnesium in tobacco by near-infrared spectroscopy and least squares-support vector machine].

    PubMed

    Tian, Kuang-da; Qiu, Kai-xian; Li, Zu-hong; Lü, Ya-qiong; Zhang, Qiu-ju; Xiong, Yan-mei; Min, Shun-geng

    2014-12-01

    The purpose of the present paper is to determine calcium and magnesium in tobacco using NIR combined with least squares-support vector machine (LS-SVM). Five hundred ground and dried tobacco samples from Qujing city, Yunnan province, China, were surveyed by a MATRIX-I spectrometer (Bruker Optics, Bremen, Germany). At the beginning of data processing, outliers of samples were eliminated for stability of the model. The rest 487 samples were divided into several calibration sets and validation sets according to a hybrid modeling strategy. Monte-Carlo cross validation was used to choose the best spectral preprocess method from multiplicative scatter correction (MSC), standard normal variate transformation (SNV), S-G smoothing, 1st derivative, etc., and their combinations. To optimize parameters of LS-SVM model, the multilayer grid search and 10-fold cross validation were applied. The final LS-SVM models with the optimizing parameters were trained by the calibration set and accessed by 287 validation samples picked by Kennard-Stone method. For the quantitative model of calcium in tobacco, Savitzky-Golay FIR smoothing with frame size 21 showed the best performance. The regularization parameter λ of LS-SVM was e16.11, while the bandwidth of the RBF kernel σ2 was e8.42. The determination coefficient for prediction (Rc(2)) was 0.9755 and the determination coefficient for prediction (Rp(2)) was 0.9422, better than the performance of PLS model (Rc(2)=0.9593, Rp(2)=0.9344). For the quantitative analysis of magnesium, SNV made the regression model more precise than other preprocess. The optimized λ was e15.25 and σ2 was e6.32. Rc(2) and Rp(2) were 0.9961 and 0.9301, respectively, better than PLS model (Rc(2)=0.9716, Rp(2)=0.8924). After modeling, the whole progress of NIR scan and data analysis for one sample was within tens of seconds. The overall results show that NIR spectroscopy combined with LS-SVM can be efficiently utilized for rapid and accurate analysis of calcium and magnesium in tobacco.

  12. A Hybrid Hierarchical Approach for Brain Tissue Segmentation by Combining Brain Atlas and Least Square Support Vector Machine

    PubMed Central

    Kasiri, Keyvan; Kazemi, Kamran; Dehghani, Mohammad Javad; Helfroush, Mohammad Sadegh

    2013-01-01

    In this paper, we present a new semi-automatic brain tissue segmentation method based on a hybrid hierarchical approach that combines a brain atlas as a priori information and a least-square support vector machine (LS-SVM). The method consists of three steps. In the first two steps, the skull is removed and the cerebrospinal fluid (CSF) is extracted. These two steps are performed using the toolbox FMRIB's automated segmentation tool integrated in the FSL software (FSL-FAST) developed in Oxford Centre for functional MRI of the brain (FMRIB). Then, in the third step, the LS-SVM is used to segment grey matter (GM) and white matter (WM). The training samples for LS-SVM are selected from the registered brain atlas. The voxel intensities and spatial positions are selected as the two feature groups for training and test. SVM as a powerful discriminator is able to handle nonlinear classification problems; however, it cannot provide posterior probability. Thus, we use a sigmoid function to map the SVM output into probabilities. The proposed method is used to segment CSF, GM and WM from the simulated magnetic resonance imaging (MRI) using Brainweb MRI simulator and real data provided by Internet Brain Segmentation Repository. The semi-automatically segmented brain tissues were evaluated by comparing to the corresponding ground truth. The Dice and Jaccard similarity coefficients, sensitivity and specificity were calculated for the quantitative validation of the results. The quantitative results show that the proposed method segments brain tissues accurately with respect to corresponding ground truth. PMID:24696800

  13. L2-norm multiple kernel learning and its application to biomedical data fusion

    PubMed Central

    2010-01-01

    Background This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields different extensions of multiple kernel learning (MKL) such as L∞, L1, and L2 MKL. In particular, L2 MKL is a novel method that leads to non-sparse optimal kernel coefficients, which is different from the sparse kernel coefficients optimized by the existing L∞ MKL method. In real biomedical applications, L2 MKL may have more advantages over sparse integration method for thoroughly combining complementary information in heterogeneous data sources. Results We provide a theoretical analysis of the relationship between the L2 optimization of kernels in the dual problem with the L2 coefficient regularization in the primal problem. Understanding the dual L2 problem grants a unified view on MKL and enables us to extend the L2 method to a wide range of machine learning problems. We implement L2 MKL for ranking and classification problems and compare its performance with the sparse L∞ and the averaging L1 MKL methods. The experiments are carried out on six real biomedical data sets and two large scale UCI data sets. L2 MKL yields better performance on most of the benchmark data sets. In particular, we propose a novel L2 MKL least squares support vector machine (LSSVM) algorithm, which is shown to be an efficient and promising classifier for large scale data sets processing. Conclusions This paper extends the statistical framework of genomic data fusion based on MKL. Allowing non-sparse weights on the data sources is an attractive option in settings where we believe most data sources to be relevant to the problem at hand and want to avoid a "winner-takes-all" effect seen in L∞ MKL, which can be detrimental to the performance in prospective studies. The notion of optimizing L2 kernels can be straightforwardly extended to ranking, classification, regression, and clustering algorithms. To tackle the computational burden of MKL, this paper proposes several novel LSSVM based MKL algorithms. Systematic comparison on real data sets shows that LSSVM MKL has comparable performance as the conventional SVM MKL algorithms. Moreover, large scale numerical experiments indicate that when cast as semi-infinite programming, LSSVM MKL can be solved more efficiently than SVM MKL. Availability The MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/l2lssvm.html. PMID:20529363

  14. Design of an Adaptive Human-Machine System Based on Dynamical Pattern Recognition of Cognitive Task-Load.

    PubMed

    Zhang, Jianhua; Yin, Zhong; Wang, Rubin

    2017-01-01

    This paper developed a cognitive task-load (CTL) classification algorithm and allocation strategy to sustain the optimal operator CTL levels over time in safety-critical human-machine integrated systems. An adaptive human-machine system is designed based on a non-linear dynamic CTL classifier, which maps a set of electroencephalogram (EEG) and electrocardiogram (ECG) related features to a few CTL classes. The least-squares support vector machine (LSSVM) is used as dynamic pattern classifier. A series of electrophysiological and performance data acquisition experiments were performed on seven volunteer participants under a simulated process control task environment. The participant-specific dynamic LSSVM model is constructed to classify the instantaneous CTL into five classes at each time instant. The initial feature set, comprising 56 EEG and ECG related features, is reduced to a set of 12 salient features (including 11 EEG-related features) by using the locality preserving projection (LPP) technique. An overall correct classification rate of about 80% is achieved for the 5-class CTL classification problem. Then the predicted CTL is used to adaptively allocate the number of process control tasks between operator and computer-based controller. Simulation results showed that the overall performance of the human-machine system can be improved by using the adaptive automation strategy proposed.

  15. An Automated and Intelligent Medical Decision Support System for Brain MRI Scans Classification.

    PubMed

    Siddiqui, Muhammad Faisal; Reza, Ahmed Wasif; Kanesan, Jeevan

    2015-01-01

    A wide interest has been observed in the medical health care applications that interpret neuroimaging scans by machine learning systems. This research proposes an intelligent, automatic, accurate, and robust classification technique to classify the human brain magnetic resonance image (MRI) as normal or abnormal, to cater down the human error during identifying the diseases in brain MRIs. In this study, fast discrete wavelet transform (DWT), principal component analysis (PCA), and least squares support vector machine (LS-SVM) are used as basic components. Firstly, fast DWT is employed to extract the salient features of brain MRI, followed by PCA, which reduces the dimensions of the features. These reduced feature vectors also shrink the memory storage consumption by 99.5%. At last, an advanced classification technique based on LS-SVM is applied to brain MR image classification using reduced features. For improving the efficiency, LS-SVM is used with non-linear radial basis function (RBF) kernel. The proposed algorithm intelligently determines the optimized values of the hyper-parameters of the RBF kernel and also applied k-fold stratified cross validation to enhance the generalization of the system. The method was tested by 340 patients' benchmark datasets of T1-weighted and T2-weighted scans. From the analysis of experimental results and performance comparisons, it is observed that the proposed medical decision support system outperformed all other modern classifiers and achieves 100% accuracy rate (specificity/sensitivity 100%/100%). Furthermore, in terms of computation time, the proposed technique is significantly faster than the recent well-known methods, and it improves the efficiency by 71%, 3%, and 4% on feature extraction stage, feature reduction stage, and classification stage, respectively. These results indicate that the proposed well-trained machine learning system has the potential to make accurate predictions about brain abnormalities from the individual subjects, therefore, it can be used as a significant tool in clinical practice.

  16. A Fast Reduced Kernel Extreme Learning Machine.

    PubMed

    Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua

    2016-04-01

    In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on Big datasets. RKELM is established based on the rigorous proof of universal learning involving reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear functions accurately under the condition of support vectors sufficiency. Experimental results on a wide variety of real world small instance size and large instance size applications in the context of binary classification, multi-class problem and regression are then reported to show that RKELM can perform at competitive level of generalized performance as the SVM/LS-SVM at only a fraction of the computational effort incurred. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model

    NASA Astrophysics Data System (ADS)

    Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.

    2017-02-01

    Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for different time-scales at which the drought occurs, and due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. MARS model incorporated rainfall as mandatory predictor with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r2). Best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r2 value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS superseded the performance of the other counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded more accurate result. For droughts identified by SPI ≤ - 0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the importance of periodicity in drought forecasting and also ascertains that model accuracy scales with geographic/seasonal factors due to complexity of drought and its relationship with inputs and data attributes that can affect the evolution of drought events.

  18. [Rapid determination of COD in aquaculture water based on LS-SVM with ultraviolet/visible spectroscopy].

    PubMed

    Liu, Xue-Mei; Zhang, Hai-Liang

    2014-10-01

    Ultraviolet/visible (UV/Vis) spectroscopy was studied for the rapid determination of chemical oxygen demand (COD), which was an indicator to measure the concentration of organic matter in aquaculture water. In order to reduce the influence of the absolute noises of the spectra, the extracted 135 absorbance spectra were preprocessed by Savitzky-Golay smoothing (SG), EMD, and wavelet transform (WT) methods. The preprocessed spectra were then used to select latent variables (LVs) by partial least squares (PLS) methods. Partial least squares (PLS) was used to build models with the full spectra, and back- propagation neural network (BPNN) and least square support vector machine (LS-SVM) were applied to build models with the selected LVs. The overall results showed that BPNN and LS-SVM models performed better than PLS models, and the LS-SVM models with LVs based on WT preprocessed spectra obtained the best results with the determination coefficient (r2) and RMSE being 0. 83 and 14. 78 mg · L(-1) for calibration set, and 0.82 and 14.82 mg · L(-1) for the prediction set respectively. The method showed the best performance in LS-SVM model. The results indicated that it was feasible to use UV/Vis with LVs which were obtained by PLS method, combined with LS-SVM calibration could be applied to the rapid and accurate determination of COD in aquaculture water. Moreover, this study laid the foundation for further implementation of online analysis of aquaculture water and rapid determination of other water quality parameters.

  19. Detection of Glutamic Acid in Oilseed Rape Leaves Using Near Infrared Spectroscopy and the Least Squares-Support Vector Machine

    PubMed Central

    Bao, Yidan; Kong, Wenwen; Liu, Fei; Qiu, Zhengjun; He, Yong

    2012-01-01

    Amino acids are quite important indices to indicate the growth status of oilseed rape under herbicide stress. Near infrared (NIR) spectroscopy combined with chemometrics was applied for fast determination of glutamic acid in oilseed rape leaves. The optimal spectral preprocessing method was obtained after comparing Savitzky-Golay smoothing, standard normal variate, multiplicative scatter correction, first and second derivatives, detrending and direct orthogonal signal correction. Linear and nonlinear calibration methods were developed, including partial least squares (PLS) and least squares-support vector machine (LS-SVM). The most effective wavelengths (EWs) were determined by the successive projections algorithm (SPA), and these wavelengths were used as the inputs of PLS and LS-SVM model. The best prediction results were achieved by SPA-LS-SVM (Raw) model with correlation coefficient r = 0.9943 and root mean squares error of prediction (RMSEP) = 0.0569 for prediction set. These results indicated that NIR spectroscopy combined with SPA-LS-SVM was feasible for the fast and effective detection of glutamic acid in oilseed rape leaves. The selected EWs could be used to develop spectral sensors, and the important and basic amino acid data were helpful to study the function mechanism of herbicide. PMID:23203052

  20. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages

    PubMed Central

    Yao, Yiqing; Xu, Xiaosu

    2017-01-01

    In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics. PMID:28245549

  1. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages.

    PubMed

    Yao, Yiqing; Xu, Xiaosu

    2017-02-24

    In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.

  2. Modeling when and where a secondary accident occurs.

    PubMed

    Wang, Junhua; Liu, Boya; Fu, Ting; Liu, Shuo; Stipancic, Joshua

    2018-01-31

    The occurrence of secondary accidents leads to traffic congestion and road safety issues. Secondary accident prevention has become a major consideration in traffic incident management. This paper investigates the location and time of a potential secondary accident after the occurrence of an initial traffic accident. With accident data and traffic loop data collected over three years from California interstate freeways, a shock wave-based method was introduced to identify secondary accidents. A linear regression model and two machine learning algorithms, including a back-propagation neural network (BPNN) and a least squares support vector machine (LSSVM), were implemented to explore the distance and time gap between the initial and secondary accidents using inputs of crash severity, violation category, weather condition, tow away, road surface condition, lighting, parties involved, traffic volume, duration, and shock wave speed generated by the primary accident. From the results, the linear regression model was inadequate in describing the effect of most variables and its goodness-of-fit and accuracy in prediction was relatively poor. In the training programs, the BPNN and LSSVM demonstrated adequate goodness-of-fit, though the BPNN was superior with a higher CORR and lower MSE. The BPNN model also outperformed the LSSVM in time prediction, while both failed to provide adequate distance prediction. Therefore, the BPNN model could be used to forecast the time gap between initial and secondary accidents, which could be used by decision makers and incident management agencies to prevent or reduce secondary collisions. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Prediction of BP reactivity to talking using hybrid soft computing approaches.

    PubMed

    Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

    2014-01-01

    High blood pressure (BP) is associated with an increased risk of cardiovascular diseases. Therefore, optimal precision in measurement of BP is appropriate in clinical and research studies. In this work, anthropometric characteristics including age, height, weight, body mass index (BMI), and arm circumference (AC) were used as independent predictor variables for the prediction of BP reactivity to talking. Principal component analysis (PCA) was fused with artificial neural network (ANN), adaptive neurofuzzy inference system (ANFIS), and least square-support vector machine (LS-SVM) model to remove the multicollinearity effect among anthropometric predictor variables. The statistical tests in terms of coefficient of determination (R (2)), root mean square error (RMSE), and mean absolute percentage error (MAPE) revealed that PCA based LS-SVM (PCA-LS-SVM) model produced a more efficient prediction of BP reactivity as compared to other models. This assessment presents the importance and advantages posed by PCA fused prediction models for prediction of biological variables.

  4. Discrimination of transgenic soybean seeds by terahertz spectroscopy

    NASA Astrophysics Data System (ADS)

    Liu, Wei; Liu, Changhong; Chen, Feng; Yang, Jianbo; Zheng, Lei

    2016-10-01

    Discrimination of genetically modified organisms is increasingly demanded by legislation and consumers worldwide. The feasibility of a non-destructive discrimination of glyphosate-resistant and conventional soybean seeds and their hybrid descendants was examined by terahertz time-domain spectroscopy system combined with chemometrics. Principal component analysis (PCA), least squares-support vector machines (LS-SVM) and PCA-back propagation neural network (PCA-BPNN) models with the first and second derivative and standard normal variate (SNV) transformation pre-treatments were applied to classify soybean seeds based on genotype. Results demonstrated clear differences among glyphosate-resistant, hybrid descendants and conventional non-transformed soybean seeds could easily be visualized with an excellent classification (accuracy was 88.33% in validation set) using the LS-SVM and the spectra with SNV pre-treatment. The results indicated that THz spectroscopy techniques together with chemometrics would be a promising technique to distinguish transgenic soybean seeds from non-transformed seeds with high efficiency and without any major sample preparation.

  5. Detection of Fungus Infection on Petals of Rapeseed (Brassica napus L.) Using NIR Hyperspectral Imaging

    NASA Astrophysics Data System (ADS)

    Zhao, Yan-Ru; Yu, Ke-Qiang; Li, Xiaoli; He, Yong

    2016-12-01

    Infected petals are often regarded as the source for the spread of fungi Sclerotinia sclerotiorum in all growing process of rapeseed (Brassica napus L.) plants. This research aimed to detect fungal infection of rapeseed petals by applying hyperspectral imaging in the spectral region of 874-1734 nm coupled with chemometrics. Reflectance was extracted from regions of interest (ROIs) in the hyperspectral image of each sample. Firstly, principal component analysis (PCA) was applied to conduct a cluster analysis with the first several principal components (PCs). Then, two methods including X-loadings of PCA and random frog (RF) algorithm were used and compared for optimizing wavebands selection. Least squares-support vector machine (LS-SVM) methodology was employed to establish discriminative models based on the optimal and full wavebands. Finally, area under the receiver operating characteristics curve (AUC) was utilized to evaluate classification performance of these LS-SVM models. It was found that LS-SVM based on the combination of all optimal wavebands had the best performance with AUC of 0.929. These results were promising and demonstrated the potential of applying hyperspectral imaging in fungus infection detection on rapeseed petals.

  6. A comparative study of clonal selection algorithm for effluent removal forecasting in septic sludge treatment plant.

    PubMed

    Chun, Ting Sie; Malek, M A; Ismail, Amelia Ritahani

    2015-01-01

    The development of effluent removal prediction is crucial in providing a planning tool necessary for the future development and the construction of a septic sludge treatment plant (SSTP), especially in the developing countries. In order to investigate the expected functionality of the required standard, the prediction of the effluent quality, namely biological oxygen demand, chemical oxygen demand and total suspended solid of an SSTP was modelled using an artificial intelligence approach. In this paper, we adopt the clonal selection algorithm (CSA) to set up a prediction model, with a well-established method - namely the least-square support vector machine (LS-SVM) as a baseline model. The test results of the case study showed that the prediction of the CSA-based SSTP model worked well and provided model performance as satisfactory as the LS-SVM model. The CSA approach shows that fewer control and training parameters are required for model simulation as compared with the LS-SVM approach. The ability of a CSA approach in resolving limited data samples, non-linear sample function and multidimensional pattern recognition makes it a powerful tool in modelling the prediction of effluent removals in an SSTP.

  7. The prediction of the building precision in the Laser Engineered Net Shaping process using advanced networks

    NASA Astrophysics Data System (ADS)

    Lu, Z. L.; Li, D. C.; Lu, B. H.; Zhang, A. F.; Zhu, G. X.; Pi, G.

    2010-05-01

    Laser Engineered Net Shaping (LENS) is an advanced manufacturing technology, but it is difficult to control the depositing height (DH) of the prototype because there are many technology parameters influencing the forming process. The effect of main parameters (laser power, scanning speed and powder feeding rate) on the DH of single track is firstly analyzed, and then it shows that there is the complex nonlinear intrinsic relationship between them. In order to predict the DH, the back propagation (BP) based network improved with Adaptive learning rate and Momentum coefficient (AM) algorithm, and the least square support vector machine (LS-SVM) network are both adopted. The mapping relationship between above parameters and the DH is constructed according to training samples collected by LENS experiments, and then their generalization ability, function-approximating ability and real-time are contrastively investigated. The results show that although the predicted result by the BP-AM approximates the experimental result, above performance index of the LS-SVM are better than those of the BP-AM. Finally, high-definition thin-walled parts of AISI316L are successfully fabricated. Hence, the LS-SVM network is more suitable for the prediction of the DH.

  8. Objective estimation of tropical cyclone innercore surface wind structure using infrared satellite images

    NASA Astrophysics Data System (ADS)

    Zhang, Changjiang; Dai, Lijie; Ma, Leiming; Qian, Jinfang; Yang, Bo

    2017-10-01

    An objective technique is presented for estimating tropical cyclone (TC) innercore two-dimensional (2-D) surface wind field structure using infrared satellite imagery and machine learning. For a TC with eye, the eye contour is first segmented by a geodesic active contour model, based on which the eye circumference is obtained as the TC eye size. A mathematical model is then established between the eye size and the radius of maximum wind obtained from the past official TC report to derive the 2-D surface wind field within the TC eye. Meanwhile, the composite information about the latitude of TC center, surface maximum wind speed, TC age, and critical wind radii of 34- and 50-kt winds can be combined to build another mathematical model for deriving the innercore wind structure. After that, least squares support vector machine (LSSVM), radial basis function neural network (RBFNN), and linear regression are introduced, respectively, in the two mathematical models, which are then tested with sensitivity experiments on real TC cases. Verification shows that the innercore 2-D surface wind field structure estimated by LSSVM is better than that of RBFNN and linear regression.

  9. Laser-Induced Breakdown Spectroscopy Coupled with Multivariate Chemometrics for Variety Discrimination of Soil

    PubMed Central

    Yu, Ke-Qiang; Zhao, Yan-Ru; Liu, Fei; He, Yong

    2016-01-01

    The aim of this work was to analyze the variety of soil by laser-induced breakdown spectroscopy (LIBS) coupled with chemometrics methods. 6 certified reference materials (CRMs) of soil samples were selected and their LIBS spectra were captured. Characteristic emission lines of main elements were identified based on the LIBS curves and corresponding contents. From the identified emission lines, LIBS spectra in 7 lines with high signal-to-noise ratio (SNR) were chosen for further analysis. Principal component analysis (PCA) was carried out using the LIBS spectra at 7 selected lines and an obvious cluster of 6 soils was observed. Soft independent modeling of class analogy (SIMCA) and least-squares support vector machine (LS-SVM) were introduced to establish discriminant models for classifying the 6 types of soils, and they offered the correct discrimination rates of 90% and 100%, respectively. Receiver operating characteristic (ROC) curve was used to evaluate the performance of models and the results demonstrated that the LS-SVM model was promising. Lastly, 8 types of soils from different places were gathered to conduct the same experiments for verifying the selected 7 emission lines and LS-SVM model. The research revealed that LIBS technology coupled with chemometrics could conduct the variety discrimination of soil. PMID:27279284

  10. Firmness prediction in Prunus persica 'Calrico' peaches by visible/short-wave near infrared spectroscopy and acoustic measurements using optimised linear and non-linear chemometric models.

    PubMed

    Lafuente, Victoria; Herrera, Luis J; Pérez, María del Mar; Val, Jesús; Negueruela, Ignacio

    2015-08-15

    In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA) (two non-destructive methods) were applied in Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. Also, a mutual-information-based variable selection method, seeking to find the most significant variables to produce optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and AWETA measure). The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R(2)) for PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R(2) values 0.76 and 0.77, PLS and LS-SVM. These results indicated that the proposed mutual-information-based variable selection algorithm was a powerful tool for the selection of the most relevant variables. © 2014 Society of Chemical Industry.

  11. [Characteristic wavelengths selection of soluble solids content of pear based on NIR spectral and LS-SVM].

    PubMed

    Fan, Shu-xiang; Huang, Wen-qian; Li, Jiang-bo; Zhao, Chun-jiang; Zhang, Bao-hua

    2014-08-01

    To improve the precision and robustness of the NIR model of the soluble solid content (SSC) on pear. The total number of 160 pears was for the calibration (n=120) and prediction (n=40). Different spectral pretreatment methods, including standard normal variate (SNV) and multiplicative scatter correction (MSC) were used before further analysis. A combination of genetic algorithm (GA) and successive projections algorithm (SPA) was proposed to select most effective wavelengths after uninformative variable elimination (UVE) from original spectra, SNV pretreated spectra and MSC pretreated spectra respectively. The selected variables were used as the inputs of least squares-support vector machine (LS-SVM) model to build models for de- termining the SSC of pear. The results indicated that LS-SVM model built using SNVE-UVE-GA-SPA on 30 characteristic wavelengths selected from full-spectrum which had 3112 wavelengths achieved the optimal performance. The correlation coefficient (Rp) and root mean square error of prediction (RMSEP) for prediction sets were 0.956, 0.271 for SSC. The model is reliable and the predicted result is effective. The method can meet the requirement of quick measuring SSC of pear and might be important for the development of portable instruments and online monitoring.

  12. Determination of Hemicellulose, Cellulose and Lignin in Moso Bamboo by Near Infrared Spectroscopy

    PubMed Central

    Li, Xiaoli; Sun, Chanjun; Zhou, Binxiong; He, Yong

    2015-01-01

    The contents of hemicellulose, cellulose and lignin are important for moso bamboo processing in biomass energy industry. The feasibility of using near infrared (NIR) spectroscopy for rapid determination of hemicellulose, cellulose and lignin was investigated in this study. Initially, the linear relationship between bamboo components and their NIR spectroscopy was established. Subsequently, successive projections algorithm (SPA) was used to detect characteristic wavelengths for establishing the convenient models. For hemicellulose, cellulose and lignin, 22, 22 and 20 characteristic wavelengths were obtained, respectively. Nonlinear determination models were subsequently built by an artificial neural network (ANN) and a least-squares support vector machine (LS-SVM) based on characteristic wavelengths. The LS-SVM models for predicting hemicellulose, cellulose and lignin all obtained excellent results with high determination coefficients of 0.921, 0.909 and 0.892 respectively. These results demonstrated that NIR spectroscopy combined with SPA-LS-SVM is a useful, nondestructive tool for the determinations of hemicellulose, cellulose and lignin in moso bamboo. PMID:26601657

  13. Classification of epileptic EEG signals based on simple random sampling and sequential feature selection.

    PubMed

    Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui

    2016-06-01

    Electroencephalogram (EEG) signals are used broadly in the medical fields. The main applications of EEG signals are the diagnosis and treatment of diseases such as epilepsy, Alzheimer, sleep problems and so on. This paper presents a new method which extracts and selects features from multi-channel EEG signals. This research focuses on three main points. Firstly, simple random sampling (SRS) technique is used to extract features from the time domain of EEG signals. Secondly, the sequential feature selection (SFS) algorithm is applied to select the key features and to reduce the dimensionality of the data. Finally, the selected features are forwarded to a least square support vector machine (LS_SVM) classifier to classify the EEG signals. The LS_SVM classifier classified the features which are extracted and selected from the SRS and the SFS. The experimental results show that the method achieves 99.90, 99.80 and 100 % for classification accuracy, sensitivity and specificity, respectively.

  14. Nondestructive determination of transgenic Bacillus thuringiensis rice seeds (Oryza sativa L.) using multispectral imaging and chemometric methods.

    PubMed

    Liu, Changhong; Liu, Wei; Lu, Xuzhong; Chen, Wei; Yang, Jianbo; Zheng, Lei

    2014-06-15

    Crop-to-crop transgene flow may affect the seed purity of non-transgenic rice varieties, resulting in unwanted biosafety consequences. The feasibility of a rapid and nondestructive determination of transgenic rice seeds from its non-transgenic counterparts was examined by using multispectral imaging system combined with chemometric data analysis. Principal component analysis (PCA), partial least squares discriminant analysis (PLSDA), least squares-support vector machines (LS-SVM), and PCA-back propagation neural network (PCA-BPNN) methods were applied to classify rice seeds according to their genetic origins. The results demonstrated that clear differences between non-transgenic and transgenic rice seeds could be easily visualized with the nondestructive determination method developed through this study and an excellent classification (up to 100% with LS-SVM model) can be achieved. It is concluded that multispectral imaging together with chemometric data analysis is a promising technique to identify transgenic rice seeds with high efficiency, providing bright prospects for future applications. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Feasibility in multispectral imaging for predicting the content of bioactive compounds in intact tomato fruit.

    PubMed

    Liu, Changhong; Liu, Wei; Chen, Wei; Yang, Jianbo; Zheng, Lei

    2015-04-15

    Tomato is an important health-stimulating fruit because of the antioxidant properties of its main bioactive compounds, dominantly lycopene and phenolic compounds. Nowadays, product differentiation in the fruit market requires an accurate evaluation of these value-added compounds. An experiment was conducted to simultaneously and non-destructively measure lycopene and phenolic compounds content in intact tomatoes using multispectral imaging combined with chemometric methods. Partial least squares (PLS), least squares-support vector machines (LS-SVM) and back propagation neural network (BPNN) were applied to develop quantitative models. Compared with PLS and LS-SVM, BPNN model considerably improved the performance with coefficient of determination in prediction (RP(2))=0.938 and 0.965, residual predictive deviation (RPD)=4.590 and 9.335 for lycopene and total phenolics content prediction, respectively. It is concluded that multispectral imaging is an attractive alternative to the standard methods for determination of bioactive compounds content in intact tomatoes, providing a useful platform for infield fruit sorting/grading. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Multi-sensor information fusion method for vibration fault diagnosis of rolling bearing

    NASA Astrophysics Data System (ADS)

    Jiao, Jing; Yue, Jianhai; Pei, Di

    2017-10-01

    Bearing is a key element in high-speed electric multiple unit (EMU) and any defect of it can cause huge malfunctioning of EMU under high operation speed. This paper presents a new method for bearing fault diagnosis based on least square support vector machine (LS-SVM) in feature-level fusion and Dempster-Shafer (D-S) evidence theory in decision-level fusion which were used to solve the problems about low detection accuracy, difficulty in extracting sensitive characteristics and unstable diagnosis system of single-sensor in rolling bearing fault diagnosis. Wavelet de-nosing technique was used for removing the signal noises. LS-SVM was used to make pattern recognition of the bearing vibration signal, and then fusion process was made according to the D-S evidence theory, so as to realize recognition of bearing fault. The results indicated that the data fusion method improved the performance of the intelligent approach in rolling bearing fault detection significantly. Moreover, the results showed that this method can efficiently improve the accuracy of fault diagnosis.

  17. In Situ Measurement of Some Soil Properties in Paddy Soil Using Visible and Near-Infrared Spectroscopy

    PubMed Central

    Wenjun, Ji; Zhou, Shi; Jingyi, Huang; Shuo, Li

    2014-01-01

    In situ measurements with visible and near-infrared spectroscopy (vis-NIR) provide an efficient way for acquiring soil information of paddy soils in the short time gap between the harvest and following rotation. The aim of this study was to evaluate its feasibility to predict a series of soil properties including organic matter (OM), organic carbon (OC), total nitrogen (TN), available nitrogen (AN), available phosphorus (AP), available potassium (AK) and pH of paddy soils in Zhejiang province, China. Firstly, the linear partial least squares regression (PLSR) was performed on the in situ spectra and the predictions were compared to those with laboratory-based recorded spectra. Then, the non-linear least-square support vector machine (LS-SVM) algorithm was carried out aiming to extract more useful information from the in situ spectra and improve predictions. Results show that in terms of OC, OM, TN, AN and pH, (i) the predictions were worse using in situ spectra compared to laboratory-based spectra with PLSR algorithm (ii) the prediction accuracy using LS-SVM (R2>0.75, RPD>1.90) was obviously improved with in situ vis-NIR spectra compared to PLSR algorithm, and comparable or even better than results generated using laboratory-based spectra with PLSR; (iii) in terms of AP and AK, poor predictions were obtained with in situ spectra (R2<0.5, RPD<1.50) either using PLSR or LS-SVM. The results highlight the use of LS-SVM for in situ vis-NIR spectroscopic estimation of soil properties of paddy soils. PMID:25153132

  18. Trajectory Correction and Locomotion Analysis of a Hexapod Walking Robot with Semi-Round Rigid Feet

    PubMed Central

    Zhu, Yaguang; Jin, Bo; Wu, Yongsheng; Guo, Tong; Zhao, Xiangmo

    2016-01-01

    Aimed at solving the misplaced body trajectory problem caused by the rolling of semi-round rigid feet when a robot is walking, a legged kinematic trajectory correction methodology based on the Least Squares Support Vector Machine (LS-SVM) is proposed. The concept of ideal foothold is put forward for the three-dimensional kinematic model modification of a robot leg, and the deviation value between the ideal foothold and real foothold is analyzed. The forward/inverse kinematic solutions between the ideal foothold and joint angular vectors are formulated and the problem of direct/inverse kinematic nonlinear mapping is solved by using the LS-SVM. Compared with the previous approximation method, this correction methodology has better accuracy and faster calculation speed with regards to inverse kinematics solutions. Experiments on a leg platform and a hexapod walking robot are conducted with multi-sensors for the analysis of foot tip trajectory, base joint vibration, contact force impact, direction deviation, and power consumption, respectively. The comparative analysis shows that the trajectory correction methodology can effectively correct the joint trajectory, thus eliminating the contact force influence of semi-round rigid feet, significantly improving the locomotion of the walking robot and reducing the total power consumption of the system. PMID:27589766

  19. Comparative Analysis of Hybrid Models for Prediction of BP Reactivity to Crossed Legs.

    PubMed

    Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

    2017-01-01

    Crossing the legs at the knees, during BP measurement, is one of the several physiological stimuli that considerably influence the accuracy of BP measurements. Therefore, it is paramount to develop an appropriate prediction model for interpreting influence of crossed legs on BP. This research work described the use of principal component analysis- (PCA-) fused forward stepwise regression (FSWR), artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), and least squares support vector machine (LS-SVM) models for prediction of BP reactivity to crossed legs among the normotensive and hypertensive participants. The evaluation of the performance of the proposed prediction models using appropriate statistical indices showed that the PCA-based LS-SVM (PCA-LS-SVM) model has the highest prediction accuracy with coefficient of determination ( R 2 ) = 93.16%, root mean square error (RMSE) = 0.27, and mean absolute percentage error (MAPE) = 5.71 for SBP prediction in normotensive subjects. Furthermore, R 2  = 96.46%, RMSE = 0.19, and MAPE = 1.76 for SBP prediction and R 2  = 95.44%, RMSE = 0.21, and MAPE = 2.78 for DBP prediction in hypertensive subjects using the PCA-LSSVM model. This assessment presents the importance and advantages posed by hybrid computing models for the prediction of variables in biomedical research studies.

  20. Sonomyography Analysis on Thickness of Skeletal Muscle During Dynamic Contraction Induced by Neuromuscular Electrical Stimulation: A Pilot Study.

    PubMed

    Qiu, Shuang; Feng, Jing; Xu, Jiapeng; Xu, Rui; Zhao, Xin; Zhou, Peng; Qi, Hongzhi; Zhang, Lixin; Ming, Dong

    2017-01-01

    Neuromuscular electrical stimulation (NMES) that stimulates skeletal muscles to induce contractions has been widely applied to restore functions of paralyzed muscles. However, the architectural changes of stimulated muscles induced by NMES are still not well understood. The present study applies sonomyography (SMG) to evaluate muscle architecture under NMES-induced and voluntary movements. The quadriceps muscles of seven healthy subjects were tested for eight cycles during an extension exercise of the knee joint with/without NMES, and SMG and the knee joint angle were recorded during the process of knee extension. A least squares support vector machine (LS-SVM) LS-SVM model was developed and trained using the data sets of six cycles collected under NMES, while the remaining data was used to test. Muscle thickness changes were extracted from ultrasound images and compared between NMES-induced and voluntary contractions, and LS-SVM was used to model a relationship between dynamical knee joint angles and SMG signals. Muscle thickness showed to be significantly correlated with joint angle in NMES-induced contractions, and a significant negative correlation was observed between Vastus intermedius (VI) thickness and rectus femoris (RF) thickness. In addition, there was a significant difference between voluntary and NMES-induced contractions . The LS-SVM model based on RF thickness and knee joint angle provided superior performance compared with the model based on VI thickness and knee joint angle or total thickness and knee joint angle. This suggests that a strong relation exists between the RF thickness and knee joint angle. These results provided direct evidence for the potential application of RF thickness in optimizing NMES system as well as measuring muscle state under NMES.

  1. Automated Diagnosis of Glaucoma Using Empirical Wavelet Transform and Correntropy Features Extracted From Fundus Images.

    PubMed

    Maheshwari, Shishir; Pachori, Ram Bilas; Acharya, U Rajendra

    2017-05-01

    Glaucoma is an ocular disorder caused due to increased fluid pressure in the optic nerve. It damages the optic nerve and subsequently causes loss of vision. The available scanning methods are Heidelberg retinal tomography, scanning laser polarimetry, and optical coherence tomography. These methods are expensive and require experienced clinicians to use them. So, there is a need to diagnose glaucoma accurately with low cost. Hence, in this paper, we have presented a new methodology for an automated diagnosis of glaucoma using digital fundus images based on empirical wavelet transform (EWT). The EWT is used to decompose the image, and correntropy features are obtained from decomposed EWT components. These extracted features are ranked based on t value feature selection algorithm. Then, these features are used for the classification of normal and glaucoma images using least-squares support vector machine (LS-SVM) classifier. The LS-SVM is employed for classification with radial basis function, Morlet wavelet, and Mexican-hat wavelet kernels. The classification accuracy of the proposed method is 98.33% and 96.67% using threefold and tenfold cross validation, respectively.

  2. Rapid discrimination of pork in Halal and non-Halal Chinese ham sausages by Fourier transform infrared (FTIR) spectroscopy and chemometrics.

    PubMed

    Xu, L; Cai, C B; Cui, H F; Ye, Z H; Yu, X P

    2012-12-01

    Rapid discrimination of pork in Halal and non-Halal Chinese ham sausages was developed by Fourier transform infrared (FTIR) spectrometry combined with chemometrics. Transmittance spectra ranging from 400 to 4000 cm⁻¹ of 73 Halal and 78 non-Halal Chinese ham sausages were measured. Sample preparation involved finely grinding of samples and formation of KBr disks (under 10 MPa for 5 min). The influence of data preprocessing methods including smoothing, taking derivatives and standard normal variate (SNV) on partial least squares discriminant analysis (PLSDA) and least squares support vector machine (LS-SVM) was investigated. The results indicate removal of spectral background and baseline plays an important role in discrimination. Taking derivatives, SNV can improve classification accuracy and reduce the complexity of PLSDA. Possibly due to the loss of detailed high-frequency spectral information, smoothing degrades the model performance. For the best models, the sensitivity and specificity was 0.913 and 0.929 for PLSDA with SNV spectra, 0.957 and 0.929 for LS-SVM with second derivative spectra, respectively. Copyright © 2012 Elsevier Ltd. All rights reserved.

  3. Variety identification of brown sugar using short-wave near infrared spectroscopy and multivariate calibration

    NASA Astrophysics Data System (ADS)

    Yang, Haiqing; Wu, Di; He, Yong

    2007-11-01

    Near-infrared spectroscopy (NIRS) with the characteristics of high speed, non-destructiveness, high precision and reliable detection data, etc. is a pollution-free, rapid, quantitative and qualitative analysis method. A new approach for variety discrimination of brown sugars using short-wave NIR spectroscopy (800-1050nm) was developed in this work. The relationship between the absorbance spectra and brown sugar varieties was established. The spectral data were compressed by the principal component analysis (PCA). The resulting features can be visualized in principal component (PC) space, which can lead to discovery of structures correlative with the different class of spectral samples. It appears to provide a reasonable variety clustering of brown sugars. The 2-D PCs plot obtained using the first two PCs can be used for the pattern recognition. Least-squares support vector machines (LS-SVM) was applied to solve the multivariate calibration problems in a relatively fast way. The work has shown that short-wave NIR spectroscopy technique is available for the brand identification of brown sugar, and LS-SVM has the better identification ability than PLS when the calibration set is small.

  4. A measurement fusion method for nonlinear system identification using a cooperative learning algorithm.

    PubMed

    Xia, Youshen; Kamel, Mohamed S

    2007-06-01

    Identification of a general nonlinear noisy system viewed as an estimation of a predictor function is studied in this article. A measurement fusion method for the predictor function estimate is proposed. In the proposed scheme, observed data are first fused by using an optimal fusion technique, and then the optimal fused data are incorporated in a nonlinear function estimator based on a robust least squares support vector machine (LS-SVM). A cooperative learning algorithm is proposed to implement the proposed measurement fusion method. Compared with related identification methods, the proposed method can minimize both the approximation error and the noise error. The performance analysis shows that the proposed optimal measurement fusion function estimate has a smaller mean square error than the LS-SVM function estimate. Moreover, the proposed cooperative learning algorithm can converge globally to the optimal measurement fusion function estimate. Finally, the proposed measurement fusion method is applied to ARMA signal and spatial temporal signal modeling. Experimental results show that the proposed measurement fusion method can provide a more accurate model.

  5. Estimating Inflows to Lake Okeechobee Using Climate Indices: A Machine Learning Modeling Approach

    NASA Astrophysics Data System (ADS)

    Kalra, A.; Ahmad, S.

    2008-12-01

    The operation of regional water management systems that include lakes and storage reservoirs for flood control and water supply can be significantly improved by using climate indices. This research is focused on forecasting Lag 1 annual inflow to Lake Okeechobee, located in South Florida, using annual oceanic- atmospheric indices of Pacific Decadal Oscillation (PDO), North Atlantic Oscillation (NAO), Atlantic Multidecadal Oscillation (AMO), and El Nino-Southern Oscillations (ENSO). Support Vector Machine (SVM) and Least Square Support Vector Machine (LSSVM), belonging to the class of data driven models, are developed to forecast annual lake inflow using annual oceanic-atmospheric indices data from 1914 to 2003. The models were trained with 80 years of data and tested for 10 years of data. Based on Correlation Coefficient, Root Means Square Error, and Mean Absolute Error model predictions were in good agreement with measured inflow volumes. Sensitivity analysis, performed to evaluate the effect of individual and coupled oscillations, revealed a strong signal for AMO and ENSO indices compared to PDO and NAO indices for one year lead-time inflow forecast. Inflow predictions from the SVM models were better when compared with the predictions obtained from feed forward back propagation Artificial Neural Network (ANN) models.

  6. Fingerprint recognition of alien invasive weeds based on the texture character and machine learning

    NASA Astrophysics Data System (ADS)

    Yu, Jia-Jia; Li, Xiao-Li; He, Yong; Xu, Zheng-Hao

    2008-11-01

    Multi-spectral imaging technique based on texture analysis and machine learning was proposed to discriminate alien invasive weeds with similar outline but different categories. The objectives of this study were to investigate the feasibility of using Multi-spectral imaging, especially the near-infrared (NIR) channel (800 nm+/-10 nm) to find the weeds' fingerprints, and validate the performance with specific eigenvalues by co-occurrence matrix. Veronica polita Pries, Veronica persica Poir, longtube ground ivy, Laminum amplexicaule Linn. were selected in this study, which perform different effect in field, and are alien invasive species in China. 307 weed leaves' images were randomly selected for the calibration set, while the remaining 207 samples for the prediction set. All images were pretreated by Wallis filter to adjust the noise by uneven lighting. Gray level co-occurrence matrix was applied to extract the texture character, which shows density, randomness correlation, contrast and homogeneity of texture with different algorithms. Three channels (green channel by 550 nm+/-10 nm, red channel by 650 nm+/-10 nm and NIR channel by 800 nm+/-10 nm) were respectively calculated to get the eigenvalues.Least-squares support vector machines (LS-SVM) was applied to discriminate the categories of weeds by the eigenvalues from co-occurrence matrix. Finally, recognition ratio of 83.35% by NIR channel was obtained, better than the results by green channel (76.67%) and red channel (69.46%). The prediction results of 81.35% indicated that the selected eigenvalues reflected the main characteristics of weeds' fingerprint based on multi-spectral (especially by NIR channel) and LS-SVM model.

  7. Use of Vis/NIRS for the determination of sugar content of cola soft drinks based on chemometric methods

    NASA Astrophysics Data System (ADS)

    Liu, Fei; He, Yong

    2008-03-01

    Three different chemometric methods were performed for the determination of sugar content of cola soft drinks using visible and near infrared spectroscopy (Vis/NIRS). Four varieties of colas were prepared and 180 samples (45 samples for each variety) were selected for the calibration set, while 60 samples (15 samples for each variety) for the validation set. The smoothing way of Savitzky-Golay, standard normal variate (SNV) and Savitzky-Golay first derivative transformation were applied for the pre-processing of spectral data. The first eleven principal components (PCs) extracted by partial least squares (PLS) analysis were employed as the inputs of BP neural network (BPNN) and least squares-support vector machine (LS-SVM) model. Then the BPNN model with the optimal structural parameters and LS-SVM model with radial basis function (RBF) kernel were applied to build the regression model with a comparison of PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias for prediction were 0.971, 1.259 and -0.335 for PLS, 0.986, 0.763, and -0.042 for BPNN, while 0.978, 0.995 and -0.227 for LS-SVM, respectively. All the three methods supplied a high and satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be utilized as a high precision way for the determination of sugar content of cola soft drinks.

  8. Study on Temperature and Synthetic Compensation of Piezo-Resistive Differential Pressure Sensors by Coupled Simulated Annealing and Simplex Optimized Kernel Extreme Learning Machine

    PubMed Central

    Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam SM, Jahangir

    2017-01-01

    As a high performance-cost ratio solution for differential pressure measurement, piezo-resistive differential pressure sensors are widely used in engineering processes. However, their performance is severely affected by the environmental temperature and the static pressure applied to them. In order to modify the non-linear measuring characteristics of the piezo-resistive differential pressure sensor, compensation actions should synthetically consider these two aspects. Advantages such as nonlinear approximation capability, highly desirable generalization ability and computational efficiency make the kernel extreme learning machine (KELM) a practical approach for this critical task. Since the KELM model is intrinsically sensitive to the regularization parameter and the kernel parameter, a searching scheme combining the coupled simulated annealing (CSA) algorithm and the Nelder-Mead simplex algorithm is adopted to find an optimal KLEM parameter set. A calibration experiment at different working pressure levels was conducted within the temperature range to assess the proposed method. In comparison with other compensation models such as the back-propagation neural network (BP), radius basis neural network (RBF), particle swarm optimization optimized support vector machine (PSO-SVM), particle swarm optimization optimized least squares support vector machine (PSO-LSSVM) and extreme learning machine (ELM), the compensation results show that the presented compensation algorithm exhibits a more satisfactory performance with respect to temperature compensation and synthetic compensation problems. PMID:28422080

  9. Study on Temperature and Synthetic Compensation of Piezo-Resistive Differential Pressure Sensors by Coupled Simulated Annealing and Simplex Optimized Kernel Extreme Learning Machine.

    PubMed

    Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir

    2017-04-19

    As a high performance-cost ratio solution for differential pressure measurement, piezo-resistive differential pressure sensors are widely used in engineering processes. However, their performance is severely affected by the environmental temperature and the static pressure applied to them. In order to modify the non-linear measuring characteristics of the piezo-resistive differential pressure sensor, compensation actions should synthetically consider these two aspects. Advantages such as nonlinear approximation capability, highly desirable generalization ability and computational efficiency make the kernel extreme learning machine (KELM) a practical approach for this critical task. Since the KELM model is intrinsically sensitive to the regularization parameter and the kernel parameter, a searching scheme combining the coupled simulated annealing (CSA) algorithm and the Nelder-Mead simplex algorithm is adopted to find an optimal KLEM parameter set. A calibration experiment at different working pressure levels was conducted within the temperature range to assess the proposed method. In comparison with other compensation models such as the back-propagation neural network (BP), radius basis neural network (RBF), particle swarm optimization optimized support vector machine (PSO-SVM), particle swarm optimization optimized least squares support vector machine (PSO-LSSVM) and extreme learning machine (ELM), the compensation results show that the presented compensation algorithm exhibits a more satisfactory performance with respect to temperature compensation and synthetic compensation problems.

  10. Multi-phase classification by a least-squares support vector machine approach in tomography images of geological samples

    NASA Astrophysics Data System (ADS)

    Khan, Faisal; Enzmann, Frieder; Kersten, Michael

    2016-03-01

    Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data or from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squares support vector machine (LS-SVM, an algorithm for pixel-based multi-phase classification) approach. A receiver operating characteristic (ROC) analysis was performed on BH-corrected and uncorrected samples to show that BH correction is in fact an important prerequisite for accurate multi-phase classification. The combination of the two approaches was thus used to classify successfully three different more or less complex multi-phase rock core samples.

  11. A Novel Degradation Identification Method for Wind Turbine Pitch System

    NASA Astrophysics Data System (ADS)

    Guo, Hui-Dong

    2018-04-01

    It’s difficult for traditional threshold value method to identify degradation of operating equipment accurately. An novel degradation evaluation method suitable for wind turbine condition maintenance strategy implementation was proposed in this paper. Based on the analysis of typical variable-speed pitch-to-feather control principle and monitoring parameters for pitch system, a multi input multi output (MIMO) regression model was applied to pitch system, where wind speed, power generation regarding as input parameters, wheel rotation speed, pitch angle and motor driving currency for three blades as output parameters. Then, the difference between the on-line measurement and the calculated value from the MIMO regression model applying least square support vector machines (LSSVM) method was defined as the Observed Vector of the system. The Gaussian mixture model (GMM) was applied to fitting the distribution of the multi dimension Observed Vectors. Applying the model established, the Degradation Index was calculated using the SCADA data of a wind turbine damaged its pitch bearing retainer and rolling body, which illustrated the feasibility of the provided method.

  12. Discrimination and Measurements of Three Flavonols with Similar Structure Using Terahertz Spectroscopy and Chemometrics

    NASA Astrophysics Data System (ADS)

    Yan, Ling; Liu, Changhong; Qu, Hao; Liu, Wei; Zhang, Yan; Yang, Jianbo; Zheng, Lei

    2018-03-01

    Terahertz (THz) technique, a recently developed spectral method, has been researched and used for the rapid discrimination and measurements of food compositions due to its low-energy and non-ionizing characteristics. In this study, THz spectroscopy combined with chemometrics has been utilized for qualitative and quantitative analysis of myricetin, quercetin, and kaempferol with concentrations of 0.025, 0.05, and 0.1 mg/mL. The qualitative discrimination was achieved by KNN, ELM, and RF models with the spectra pre-treatments. An excellent discrimination (100% CCR in the prediction set) could be achieved using the RF model. Furthermore, the quantitative analyses were performed by partial least square regression (PLSR) and least squares support vector machine (LS-SVM). Comparing to the PLSR models, the LS-SVM yielded better results with low RMSEP (0.0044, 0.0039, and 0.0048), higher Rp (0.9601, 0.9688, and 0.9359), and higher RPD (8.6272, 9.6333, and 7.9083) for myricetin, quercetin, and kaempferol, respectively. Our results demonstrate that THz spectroscopy technique is a powerful tool for identification of three flavonols with similar chemical structures and quantitative determination of their concentrations.

  13. [Study on the early detection of Sclerotinia of Brassica napus based on combinational-stimulated bands].

    PubMed

    Liu, Fei; Feng, Lei; Lou, Bing-gan; Sun, Guang-ming; Wang, Lian-ping; He, Yong

    2010-07-01

    The combinational-stimulated bands were used to develop linear and nonlinear calibrations for the early detection of sclerotinia of oilseed rape (Brassica napus L.). Eighty healthy and 100 Sclerotinia leaf samples were scanned, and different preprocessing methods combined with successive projections algorithm (SPA) were applied to develop partial least squares (PLS) discriminant models, multiple linear regression (MLR) and least squares-support vector machine (LS-SVM) models. The results indicated that the optimal full-spectrum PLS model was achieved by direct orthogonal signal correction (DOSC), then De-trending and Raw spectra with correct recognition ratio of 100%, 95.7% and 95.7%, respectively. When using combinational-stimulated bands, the optimal linear models were SPA-MLR (DOSC) and SPA-PLS (DOSC) with correct recognition ratio of 100%. All SPA-LSSVM models using DOSC, De-trending and Raw spectra achieved perfect results with recognition of 100%. The overall results demonstrated that it was feasible to use combinational-stimulated bands for the early detection of Sclerotinia of oilseed rape, and DOSC-SPA was a powerful way for informative wavelength selection. This method supplied a new approach to the early detection and portable monitoring instrument of sclerotinia.

  14. Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edwards, Richard E; Zhang, Hao; Parker, Lynne Edwards

    2013-01-01

    Kernel methods have difficulties scaling to large modern data sets. The scalability issues are based on computational and memory requirements for working with a large matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers scalability. However, Least Squares Support VectorMachines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity for solving a single model, and the overall computational complexity associated with tuning hyperparameters are still major problems. We address these problems by introducing an O(n log n) approximate l-foldmore » cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm s computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method s effectiveness at selecting hyperparameters on real world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.« less

  15. [Multi-Target Recognition of Internal and External Defects of Potato by Semi-Transmission Hyperspectral Imaging and Manifold Learning Algorithm].

    PubMed

    Huang, Tao; Li, Xiao-yu; Jin, Rui; Ku, Jing; Xu, Sen-miao; Xu, Meng-ling; Wu, Zhen-zhong; Kong, De-guo

    2015-04-01

    The present paper put forward a non-destructive detection method which combines semi-transmission hyperspectral imaging technology with manifold learning dimension reduction algorithm and least squares support vector machine (LSSVM) to recognize internal and external defects in potatoes simultaneously. Three hundred fifteen potatoes were bought in farmers market as research object, and semi-transmission hyperspectral image acquisition system was constructed to acquire the hyperspectral images of normal external defects (bud and green rind) and internal defect (hollow heart) potatoes. In order to conform to the actual production, defect part is randomly put right, side and back to the acquisition probe when the hyperspectral images of external defects potatoes are acquired. The average spectrums (390-1,040 nm) were extracted from the region of interests for spectral preprocessing. Then three kinds of manifold learning algorithm were respectively utilized to reduce the dimension of spectrum data, including supervised locally linear embedding (SLLE), locally linear embedding (LLE) and isometric mapping (ISOMAP), the low-dimensional data gotten by manifold learning algorithms is used as model input, Error Correcting Output Code (ECOC) and LSSVM were combined to develop the multi-target classification model. By comparing and analyzing results of the three models, we concluded that SLLE is the optimal manifold learning dimension reduction algorithm, and the SLLE-LSSVM model is determined to get the best recognition rate for recognizing internal and external defects potatoes. For test set data, the single recognition rate of normal, bud, green rind and hollow heart potato reached 96.83%, 86.96%, 86.96% and 95% respectively, and he hybrid recognition rate was 93.02%. The results indicate that combining the semi-transmission hyperspectral imaging technology with SLLE-LSSVM is a feasible qualitative analytical method which can simultaneously recognize the internal and external defects potatoes and also provide technical reference for rapid on-line non-destructive detecting of the internal and external defects potatoes.

  16. Evaluation of extreme learning machine for classification of individual and combined finger movements using electromyography on amputees and non-amputees.

    PubMed

    Anam, Khairul; Al-Jumaily, Adel

    2017-01-01

    The success of myoelectric pattern recognition (M-PR) mostly relies on the features extracted and classifier employed. This paper proposes and evaluates a fast classifier, extreme learning machine (ELM), to classify individual and combined finger movements on amputees and non-amputees. ELM is a single hidden layer feed-forward network (SLFN) that avoids iterative learning by determining input weights randomly and output weights analytically. Therefore, it can accelerate the training time of SLFNs. In addition to the classifier evaluation, this paper evaluates various feature combinations to improve the performance of M-PR and investigate some feature projections to improve the class separability of the features. Different from other studies on the implementation of ELM in the myoelectric controller, this paper presents a complete and thorough investigation of various types of ELMs including the node-based and kernel-based ELM. Furthermore, this paper provides comparisons of ELMs and other well-known classifiers such as linear discriminant analysis (LDA), k-nearest neighbour (kNN), support vector machine (SVM) and least-square SVM (LS-SVM). The experimental results show the most accurate ELM classifier is radial basis function ELM (RBF-ELM). The comparison of RBF-ELM and other well-known classifiers shows that RBF-ELM is as accurate as SVM and LS-SVM but faster than the SVM family; it is superior to LDA and kNN. The experimental results also indicate that the accuracy gap of the M-PR on the amputees and non-amputees is not too much with the accuracy of 98.55% on amputees and 99.5% on the non-amputees using six electromyography (EMG) channels. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Comparison of Benchtop Fourier-Transform (FT) and Portable Grating Scanning Spectrometers for Determination of Total Soluble Solid Contents in Single Grape Berry (Vitis vinifera L.) and Calibration Transfer.

    PubMed

    Xiao, Hui; Sun, Ke; Sun, Ye; Wei, Kangli; Tu, Kang; Pan, Leiqing

    2017-11-22

    Near-infrared (NIR) spectroscopy was applied for the determination of total soluble solid contents (SSC) of single Ruby Seedless grape berries using both benchtop Fourier transform (VECTOR 22/N) and portable grating scanning (SupNIR-1500) spectrometers in this study. The results showed that the best SSC prediction was obtained by VECTOR 22/N in the range of 12,000 to 4000 cm -1 (833-2500 nm) for Ruby Seedless with determination coefficient of prediction (R p ²) of 0.918, root mean squares error of prediction (RMSEP) of 0.758% based on least squares support vector machine (LS-SVM). Calibration transfer was conducted on the same spectral range of two instruments (1000-1800 nm) based on the LS-SVM model. By conducting Kennard-Stone (KS) to divide sample sets, selecting the optimal number of standardization samples and applying Passing-Bablok regression to choose the optimal instrument as the master instrument, a modified calibration transfer method between two spectrometers was developed. When 45 samples were selected for the standardization set, the linear interpolation-piecewise direct standardization (linear interpolation-PDS) performed well for calibration transfer with R p ² of 0.857 and RMSEP of 1.099% in the spectral region of 1000-1800 nm. And it was proved that re-calculating the standardization samples into master model could improve the performance of calibration transfer in this study. This work indicated that NIR could be used as a rapid and non-destructive method for SSC prediction, and provided a feasibility to solve the transfer difficulty between totally different NIR spectrometers.

  18. Spectral feature extraction of EEG signals and pattern recognition during mental tasks of 2-D cursor movements for BCI using SVM and ANN.

    PubMed

    Bascil, M Serdar; Tesneli, Ahmet Y; Temurtas, Feyzullah

    2016-09-01

    Brain computer interface (BCI) is a new communication way between man and machine. It identifies mental task patterns stored in electroencephalogram (EEG). So, it extracts brain electrical activities recorded by EEG and transforms them machine control commands. The main goal of BCI is to make available assistive environmental devices for paralyzed people such as computers and makes their life easier. This study deals with feature extraction and mental task pattern recognition on 2-D cursor control from EEG as offline analysis approach. The hemispherical power density changes are computed and compared on alpha-beta frequency bands with only mental imagination of cursor movements. First of all, power spectral density (PSD) features of EEG signals are extracted and high dimensional data reduced by principle component analysis (PCA) and independent component analysis (ICA) which are statistical algorithms. In the last stage, all features are classified with two types of support vector machine (SVM) which are linear and least squares (LS-SVM) and three different artificial neural network (ANN) structures which are learning vector quantization (LVQ), multilayer neural network (MLNN) and probabilistic neural network (PNN) and mental task patterns are successfully identified via k-fold cross validation technique.

  19. New model for prediction binary mixture of antihistamine decongestant using artificial neural networks and least squares support vector machine by spectrophotometry method

    NASA Astrophysics Data System (ADS)

    Mofavvaz, Shirin; Sohrabi, Mahmoud Reza; Nezamzadeh-Ejhieh, Alireza

    2017-07-01

    In the present study, artificial neural networks (ANNs) and least squares support vector machines (LS-SVM) as intelligent methods based on absorption spectra in the range of 230-300 nm have been used for determination of antihistamine decongestant contents. In the first step, one type of network (feed-forward back-propagation) from the artificial neural network with two different training algorithms, Levenberg-Marquardt (LM) and gradient descent with momentum and adaptive learning rate back-propagation (GDX) algorithm, were employed and their performance was evaluated. The performance of the LM algorithm was better than the GDX algorithm. In the second one, the radial basis network was utilized and results compared with the previous network. In the last one, the other intelligent method named least squares support vector machine was proposed to construct the antihistamine decongestant prediction model and the results were compared with two of the aforementioned networks. The values of the statistical parameters mean square error (MSE), Regression coefficient (R2), correlation coefficient (r) and also mean recovery (%), relative standard deviation (RSD) used for selecting the best model between these methods. Moreover, the proposed methods were compared to the high- performance liquid chromatography (HPLC) as a reference method. One way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison results of suggested and reference methods that there were no significant differences between them.

  20. Back analysis of geomechanical parameters in underground engineering using artificial bee colony.

    PubMed

    Zhu, Changxing; Zhao, Hongbo; Zhao, Ming

    2014-01-01

    Accurate geomechanical parameters are critical in tunneling excavation, design, and supporting. In this paper, a displacements back analysis based on artificial bee colony (ABC) algorithm is proposed to identify geomechanical parameters from monitored displacements. ABC was used as global optimal algorithm to search the unknown geomechanical parameters for the problem with analytical solution. To the problem without analytical solution, optimal back analysis is time-consuming, and least square support vector machine (LSSVM) was used to build the relationship between unknown geomechanical parameters and displacement and improve the efficiency of back analysis. The proposed method was applied to a tunnel with analytical solution and a tunnel without analytical solution. The results show the proposed method is feasible.

  1. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    PubMed

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  2. [Discrimination of donkey meat by NIR and chemometrics].

    PubMed

    Niu, Xiao-Ying; Shao, Li-Min; Dong, Fang; Zhao, Zhi-Lei; Zhu, Yan

    2014-10-01

    Donkey meat samples (n = 167) from different parts of donkey body (neck, costalia, rump, and tendon), beef (n = 47), pork (n = 51) and mutton (n = 32) samples were used to establish near-infrared reflectance spectroscopy (NIR) classification models in the spectra range of 4,000~12,500 cm(-1). The accuracies of classification models constructed by Mahalanobis distances analysis, soft independent modeling of class analogy (SIMCA) and least squares-support vector machine (LS-SVM), respectively combined with pretreatment of Savitzky-Golay smooth (5, 15 and 25 points) and derivative (first and second), multiplicative scatter correction and standard normal variate, were compared. The optimal models for intact samples were obtained by Mahalanobis distances analysis with the first 11 principal components (PCs) from original spectra as inputs and by LS-SVM with the first 6 PCs as inputs, and correctly classified 100% of calibration set and 98. 96% of prediction set. For minced samples of 7 mm diameter the optimal result was attained by LS-SVM with the first 5 PCs from original spectra as inputs, which gained an accuracy of 100% for calibration and 97.53% for prediction. For minced diameter of 5 mm SIMCA model with the first 8 PCs from original spectra as inputs correctly classified 100% of calibration and prediction. And for minced diameter of 3 mm Mahalanobis distances analysis and SIMCA models both achieved 100% accuracy for calibration and prediction respectively with the first 7 and 9 PCs from original spectra as inputs. And in these models, donkey meat samples were all correctly classified with 100% either in calibration or prediction. The results show that it is feasible that NIR with chemometrics methods is used to discriminate donkey meat from the else meat.

  3. A comparison of performance of several artificial intelligence methods for predicting the dynamic viscosity of TiO2/SAE 50 nano-lubricant

    NASA Astrophysics Data System (ADS)

    Hemmat Esfe, Mohammad; Tatar, Afshin; Ahangar, Mohammad Reza Hassani; Rostamian, Hossein

    2018-02-01

    Since the conventional thermal fluids such as water, oil, and ethylene glycol have poor thermal properties, the tiny solid particles are added to these fluids to increase their heat transfer improvement. As viscosity determines the rheological behavior of a fluid, studying the parameters affecting the viscosity is crucial. Since the experimental measurement of viscosity is expensive and time consuming, predicting this parameter is the apt method. In this work, three artificial intelligence methods containing Genetic Algorithm-Radial Basis Function Neural Networks (GA-RBF), Least Square Support Vector Machine (LS-SVM) and Gene Expression Programming (GEP) were applied to predict the viscosity of TiO2/SAE 50 nano-lubricant with Non-Newtonian power-law behavior using experimental data. The correlation factor (R2), Average Absolute Relative Deviation (AARD), Root Mean Square Error (RMSE), and Margin of Deviation were employed to investigate the accuracy of the proposed models. RMSE values of 0.58, 1.28, and 6.59 and R2 values of 0.99998, 0.99991, and 0.99777 reveal the accuracy of the proposed models for respective GA-RBF, CSA-LSSVM, and GEP methods. Among the developed models, the GA-RBF shows the best accuracy.

  4. Non-destructive evaluation of bacteria-infected watermelon seeds using visible/near-infrared hyperspectral imaging.

    PubMed

    Lee, Hoonsoo; Kim, Moon S; Song, Yu-Rim; Oh, Chang-Sik; Lim, Hyoun-Sub; Lee, Wang-Hee; Kang, Jum-Soon; Cho, Byoung-Kwan

    2017-03-01

    There is a need to minimize economic damage by sorting infected seeds from healthy seeds before seeding. However, current methods of detecting infected seeds, such as seedling grow-out, enzyme-linked immunosorbent assays, the polymerase chain reaction (PCR) and the real-time PCR have a critical drawbacks in that they are time-consuming, labor-intensive and destructive procedures. The present study aimed to evaluate the potential of visible/near-infrared (Vis/NIR) hyperspectral imaging system for detecting bacteria-infected watermelon seeds. A hyperspectral Vis/NIR reflectance imaging system (spectral region of 400-1000 nm) was constructed to obtain hyperspectral reflectance images for 336 bacteria-infected watermelon seeds, which were then subjected to partial least square discriminant analysis (PLS-DA) and a least-squares support vector machine (LS-SVM) to classify bacteria-infected watermelon seeds from healthy watermelon seeds. The developed system detected bacteria-infected watermelon seeds with an accuracy > 90% (PLS-DA: 91.7%, LS-SVM: 90.5%), suggesting that the Vis/NIR hyperspectral imaging system is effective for quarantining bacteria-infected watermelon seeds. The results of the present study show that it is possible to use the Vis/NIR hyperspectral imaging system for detecting bacteria-infected watermelon seeds. © 2016 Society of Chemical Industry. © 2016 Society of Chemical Industry.

  5. New bandwidth selection criterion for Kernel PCA: approach to dimensionality reduction and classification problems.

    PubMed

    Thomas, Minta; De Brabanter, Kris; De Moor, Bart

    2014-05-10

    DNA microarrays are potentially powerful technology for improving diagnostic classification, treatment selection, and prognostic assessment. The use of this technology to predict cancer outcome has a history of almost a decade. Disease class predictors can be designed for known disease cases and provide diagnostic confirmation or clarify abnormal cases. The main input to this class predictors are high dimensional data with many variables and few observations. Dimensionality reduction of these features set significantly speeds up the prediction task. Feature selection and feature transformation methods are well known preprocessing steps in the field of bioinformatics. Several prediction tools are available based on these techniques. Studies show that a well tuned Kernel PCA (KPCA) is an efficient preprocessing step for dimensionality reduction, but the available bandwidth selection method for KPCA was computationally expensive. In this paper, we propose a new data-driven bandwidth selection criterion for KPCA, which is related to least squares cross-validation for kernel density estimation. We propose a new prediction model with a well tuned KPCA and Least Squares Support Vector Machine (LS-SVM). We estimate the accuracy of the newly proposed model based on 9 case studies. Then, we compare its performances (in terms of test set Area Under the ROC Curve (AUC) and computational time) with other well known techniques such as whole data set + LS-SVM, PCA + LS-SVM, t-test + LS-SVM, Prediction Analysis of Microarrays (PAM) and Least Absolute Shrinkage and Selection Operator (Lasso). Finally, we assess the performance of the proposed strategy with an existing KPCA parameter tuning algorithm by means of two additional case studies. We propose, evaluate, and compare several mathematical/statistical techniques, which apply feature transformation/selection for subsequent classification, and consider its application in medical diagnostics. Both feature selection and feature transformation perform well on classification tasks. Due to the dynamic selection property of feature selection, it is hard to define significant features for the classifier, which predicts classes of future samples. Moreover, the proposed strategy enjoys a distinctive advantage with its relatively lesser time complexity.

  6. Fresh Biomass Estimation in Heterogeneous Grassland Using Hyperspectral Measurements and Multivariate Statistical Analysis

    NASA Astrophysics Data System (ADS)

    Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.

    2014-12-01

    Accurate estimation of grassland biomass at their peak productivity can provide crucial information regarding the functioning and productivity of the rangelands. Hyperspectral remote sensing has proved to be valuable for estimation of vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to large amount of correlated hyper-spectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and Least-Squared Support Vector Machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods were assessed using cross validated R2 and RMSE. The best model performance was obtained using LS_SVM and then PLSR both calibrated with first derivative reflectance dataset with R2cv = 0.88 & 0.86 and RMSEcv= 1.15 & 1.07 respectively. The weakest prediction accuracy was appeared when PCR were used (R2cv = 0.31 and RMSEcv= 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.

  7. Discrimination of tomatoes bred by spaceflight mutagenesis using visible/near infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong

    2015-04-01

    Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leafs or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leafs, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way for the selection of SWs, and the Vis/NIR combined with LS-SVM models had the capability to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leafs and fruits.

  8. Incremental Support Vector Machine Framework for Visual Sensor Networks

    NASA Astrophysics Data System (ADS)

    Awad, Mariette; Jiang, Xianhua; Motai, Yuichi

    2006-12-01

    Motivated by the emerging requirements of surveillance networks, we present in this paper an incremental multiclassification support vector machine (SVM) technique as a new framework for action classification based on real-time multivideo collected by homogeneous sites. The technique is based on an adaptation of least square SVM (LS-SVM) formulation but extends beyond the static image-based learning of current SVM methodologies. In applying the technique, an initial supervised offline learning phase is followed by a visual behavior data acquisition and an online learning phase during which the cluster head performs an ensemble of model aggregations based on the sensor nodes inputs. The cluster head then selectively switches on designated sensor nodes for future incremental learning. Combining sensor data offers an improvement over single camera sensing especially when the latter has an occluded view of the target object. The optimization involved alleviates the burdens of power consumption and communication bandwidth requirements. The resulting misclassification error rate, the iterative error reduction rate of the proposed incremental learning, and the decision fusion technique prove its validity when applied to visual sensor networks. Furthermore, the enabled online learning allows an adaptive domain knowledge insertion and offers the advantage of reducing both the model training time and the information storage requirements of the overall system which makes it even more attractive for distributed sensor networks communication.

  9. Modeling and Compensation of Random Drift of MEMS Gyroscopes Based on Least Squares Support Vector Machine Optimized by Chaotic Particle Swarm Optimization.

    PubMed

    Xing, Haifeng; Hou, Bo; Lin, Zhihui; Guo, Meifeng

    2017-10-13

    MEMS (Micro Electro Mechanical System) gyroscopes have been widely applied to various fields, but MEMS gyroscope random drift has nonlinear and non-stationary characteristics. It has attracted much attention to model and compensate the random drift because it can improve the precision of inertial devices. This paper has proposed to use wavelet filtering to reduce noise in the original data of MEMS gyroscopes, then reconstruct the random drift data with PSR (phase space reconstruction), and establish the model for the reconstructed data by LSSVM (least squares support vector machine), of which the parameters were optimized using CPSO (chaotic particle swarm optimization). Comparing the effect of modeling the MEMS gyroscope random drift with BP-ANN (back propagation artificial neural network) and the proposed method, the results showed that the latter had a better prediction accuracy. Using the compensation of three groups of MEMS gyroscope random drift data, the standard deviation of three groups of experimental data dropped from 0.00354°/s, 0.00412°/s, and 0.00328°/s to 0.00065°/s, 0.00072°/s and 0.00061°/s, respectively, which demonstrated that the proposed method can reduce the influence of MEMS gyroscope random drift and verified the effectiveness of this method for modeling MEMS gyroscope random drift.

  10. Support vector machine for day ahead electricity price forecasting

    NASA Astrophysics Data System (ADS)

    Razak, Intan Azmira binti Wan Abdul; Abidin, Izham bin Zainal; Siah, Yap Keem; Rahman, Titik Khawa binti Abdul; Lada, M. Y.; Ramani, Anis Niza binti; Nasir, M. N. M.; Ahmad, Arfah binti

    2015-05-01

    Electricity price forecasting has become an important part of power system operation and planning. In a pool- based electric energy market, producers submit selling bids consisting in energy blocks and their corresponding minimum selling prices to the market operator. Meanwhile, consumers submit buying bids consisting in energy blocks and their corresponding maximum buying prices to the market operator. Hence, both producers and consumers use day ahead price forecasts to derive their respective bidding strategies to the electricity market yet reduce the cost of electricity. However, forecasting electricity prices is a complex task because price series is a non-stationary and highly volatile series. Many factors cause for price spikes such as volatility in load and fuel price as well as power import to and export from outside the market through long term contract. This paper introduces an approach of machine learning algorithm for day ahead electricity price forecasting with Least Square Support Vector Machine (LS-SVM). Previous day data of Hourly Ontario Electricity Price (HOEP), generation's price and demand from Ontario power market are used as the inputs for training data. The simulation is held using LSSVMlab in Matlab with the training and testing data of 2004. SVM that widely used for classification and regression has great generalization ability with structured risk minimization principle rather than empirical risk minimization. Moreover, same parameter settings in trained SVM give same results that absolutely reduce simulation process compared to other techniques such as neural network and time series. The mean absolute percentage error (MAPE) for the proposed model shows that SVM performs well compared to neural network.

  11. A fault diagnosis scheme for planetary gearboxes using modified multi-scale symbolic dynamic entropy and mRMR feature selection

    NASA Astrophysics Data System (ADS)

    Li, Yongbo; Yang, Yuantao; Li, Guoyan; Xu, Minqiang; Huang, Wenhu

    2017-07-01

    Health condition identification of planetary gearboxes is crucial to reduce the downtime and maximize productivity. This paper aims to develop a novel fault diagnosis method based on modified multi-scale symbolic dynamic entropy (MMSDE) and minimum redundancy maximum relevance (mRMR) to identify the different health conditions of planetary gearbox. MMSDE is proposed to quantify the regularity of time series, which can assess the dynamical characteristics over a range of scales. MMSDE has obvious advantages in the detection of dynamical changes and computation efficiency. Then, the mRMR approach is introduced to refine the fault features. Lastly, the obtained new features are fed into the least square support vector machine (LSSVM) to complete the fault pattern identification. The proposed method is numerically and experimentally demonstrated to be able to recognize the different fault types of planetary gearboxes.

  12. An online semi-supervised brain-computer interface.

    PubMed

    Gu, Zhenghui; Yu, Zhuliang; Shen, Zhifang; Li, Yuanqing

    2013-09-01

    Practical brain-computer interface (BCI) systems should require only low training effort for the user, and the algorithms used to classify the intent of the user should be computationally efficient. However, due to inter- and intra-subject variations in EEG signal, intermittent training/calibration is often unavoidable. In this paper, we present an online semi-supervised P300 BCI speller system. After a short initial training (around or less than 1 min in our experiments), the system is switched to a mode where the user can input characters through selective attention. In this mode, a self-training least squares support vector machine (LS-SVM) classifier is gradually enhanced in back end with the unlabeled EEG data collected online after every character input. In this way, the classifier is gradually enhanced. Even though the user may experience some errors in input at the beginning due to the small initial training dataset, the accuracy approaches that of fully supervised method in a few minutes. The algorithm based on LS-SVM and its sequential update has low computational complexity; thus, it is suitable for online applications. The effectiveness of the algorithm has been validated through data analysis on BCI Competition III dataset II (P300 speller BCI data). The performance of the online system was evaluated through experimental results on eight healthy subjects, where all of them achieved the spelling accuracy of 85 % or above within an average online semi-supervised learning time of around 3 min.

  13. Tackling Missing Data in Community Health Studies Using Additive LS-SVM Classifier.

    PubMed

    Wang, Guanjin; Deng, Zhaohong; Choi, Kup-Sze

    2018-03-01

    Missing data is a common issue in community health and epidemiological studies. Direct removal of samples with missing data can lead to reduced sample size and information bias, which deteriorates the significance of the results. While data imputation methods are available to deal with missing data, they are limited in performance and could introduce noises into the dataset. Instead of data imputation, a novel method based on additive least square support vector machine (LS-SVM) is proposed in this paper for predictive modeling when the input features of the model contain missing data. The method also determines simultaneously the influence of the features with missing values on the classification accuracy using the fast leave-one-out cross-validation strategy. The performance of the method is evaluated by applying it to predict the quality of life (QOL) of elderly people using health data collected in the community. The dataset involves demographics, socioeconomic status, health history, and the outcomes of health assessments of 444 community-dwelling elderly people, with 5% to 60% of data missing in some of the input features. The QOL is measured using a standard questionnaire of the World Health Organization. Results show that the proposed method outperforms four conventional methods for handling missing data-case deletion, feature deletion, mean imputation, and K-nearest neighbor imputation, with the average QOL prediction accuracy reaching 0.7418. It is potentially a promising technique for tackling missing data in community health research and other applications.

  14. Development of Noninvasive Classification Methods for Different Roasting Degrees of Coffee Beans Using Hyperspectral Imaging

    PubMed Central

    Chu, Bingquan; Yu, Keqiang; Zhao, Yanru

    2018-01-01

    This study aimed to develop an approach for quickly and noninvasively differentiating the roasting degrees of coffee beans using hyperspectral imaging (HSI). The qualitative properties of seven roasting degrees of coffee beans (unroasted, light, moderately light, light medium, medium, moderately dark, and dark) were assayed, including moisture, crude fat, trigonelline, chlorogenic acid, and caffeine contents. These properties were influenced greatly by the respective roasting degree. Their hyperspectral images (874–1734 nm) were collected using a hyperspectral reflectance imaging system. The spectra of the regions of interest were manually extracted from the HSI images. Then, principal components analysis was employed to compress the spectral data and select the optimal wavelengths based on loading weight analysis. Meanwhile, the random frog (RF) methodology and the successive projections algorithm were also adopted to pick effective wavelengths from the spectral data. Finally, least squares support vector machine (LS-SVM) was utilized to establish discriminative models using spectral reflectance and corresponding labeled classes for each degree of roast sample. The results showed that the LS-SVM model, established by the RF selecting method, with eight wavelengths performed very well, achieving an overall classification accuracy of 90.30%. In conclusion, HSI was illustrated as a potential technique for noninvasively classifying the roasting degrees of coffee beans and might have an important application for the development of nondestructive, real-time, and portable sensors to monitor the roasting process of coffee beans. PMID:29671781

  15. Development of Noninvasive Classification Methods for Different Roasting Degrees of Coffee Beans Using Hyperspectral Imaging.

    PubMed

    Chu, Bingquan; Yu, Keqiang; Zhao, Yanru; He, Yong

    2018-04-19

    This study aimed to develop an approach for quickly and noninvasively differentiating the roasting degrees of coffee beans using hyperspectral imaging (HSI). The qualitative properties of seven roasting degrees of coffee beans (unroasted, light, moderately light, light medium, medium, moderately dark, and dark) were assayed, including moisture, crude fat, trigonelline, chlorogenic acid, and caffeine contents. These properties were influenced greatly by the respective roasting degree. Their hyperspectral images (874⁻1734 nm) were collected using a hyperspectral reflectance imaging system. The spectra of the regions of interest were manually extracted from the HSI images. Then, principal components analysis was employed to compress the spectral data and select the optimal wavelengths based on loading weight analysis. Meanwhile, the random frog (RF) methodology and the successive projections algorithm were also adopted to pick effective wavelengths from the spectral data. Finally, least squares support vector machine (LS-SVM) was utilized to establish discriminative models using spectral reflectance and corresponding labeled classes for each degree of roast sample. The results showed that the LS-SVM model, established by the RF selecting method, with eight wavelengths performed very well, achieving an overall classification accuracy of 90.30%. In conclusion, HSI was illustrated as a potential technique for noninvasively classifying the roasting degrees of coffee beans and might have an important application for the development of nondestructive, real-time, and portable sensors to monitor the roasting process of coffee beans.

  16. Discrimination of tomatoes bred by spaceflight mutagenesis using visible/near infrared spectroscopy and chemometrics.

    PubMed

    Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong

    2015-04-05

    Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leafs or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leafs, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-71 5 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way for the selection of SWs, and the Vis/NIR combined with LS-SVM models had the capability to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leafs and fruits. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Classification of brain tumours using short echo time 1H MR spectra

    NASA Astrophysics Data System (ADS)

    Devos, A.; Lukas, L.; Suykens, J. A. K.; Vanhamme, L.; Tate, A. R.; Howe, F. A.; Majós, C.; Moreno-Torres, A.; van der Graaf, M.; Arús, C.; Van Huffel, S.

    2004-09-01

    The purpose was to objectively compare the application of several techniques and the use of several input features for brain tumour classification using Magnetic Resonance Spectroscopy (MRS). Short echo time 1H MRS signals from patients with glioblastomas ( n = 87), meningiomas ( n = 57), metastases ( n = 39), and astrocytomas grade II ( n = 22) were provided by six centres in the European Union funded INTERPRET project. Linear discriminant analysis, least squares support vector machines (LS-SVM) with a linear kernel and LS-SVM with radial basis function kernel were applied and evaluated over 100 stratified random splittings of the dataset into training and test sets. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of binary classifiers, while the percentage of correct classifications was used to evaluate the multiclass classifiers. The influence of several factors on the classification performance has been tested: L2- vs. water normalization, magnitude vs. real spectra and baseline correction. The effect of input feature reduction was also investigated by using only the selected frequency regions containing the most discriminatory information, and peak integrated values. Using L2-normalized complete spectra the automated binary classifiers reached a mean test AUC of more than 0.95, except for glioblastomas vs. metastases. Similar results were obtained for all classification techniques and input features except for water normalized spectra, where classification performance was lower. This indicates that data acquisition and processing can be simplified for classification purposes, excluding the need for separate water signal acquisition, baseline correction or phasing.

  18. Multisource image fusion method using support value transform.

    PubMed

    Zheng, Sheng; Shi, Wen-Zhong; Liu, Jian; Zhu, Guang-Xi; Tian, Jin-Wen

    2007-07-01

    With the development of numerous imaging sensors, many images can be simultaneously pictured by various sensors. However, there are many scenarios where no one sensor can give the complete picture. Image fusion is an important approach to solve this problem and produces a single image which preserves all relevant information from a set of different sensors. In this paper, we proposed a new image fusion method using the support value transform, which uses the support value to represent the salient features of image. This is based on the fact that, in support vector machines (SVMs), the data with larger support values have a physical meaning in the sense that they reveal relative more importance of the data points for contributing to the SVM model. The mapped least squares SVM (mapped LS-SVM) is used to efficiently compute the support values of image. The support value analysis is developed by using a series of multiscale support value filters, which are obtained by filling zeros in the basic support value filter deduced from the mapped LS-SVM to match the resolution of the desired level. Compared with the widely used image fusion methods, such as the Laplacian pyramid, discrete wavelet transform methods, the proposed method is an undecimated transform-based approach. The fusion experiments are undertaken on multisource images. The results demonstrate that the proposed approach is effective and is superior to the conventional image fusion methods in terms of the pertained quantitative fusion evaluation indexes, such as quality of visual information (Q(AB/F)), the mutual information, etc.

  19. New models to predict depth of infiltration in endometrial carcinoma based on transvaginal sonography.

    PubMed

    De Smet, F; De Brabanter, J; Van den Bosch, T; Pochet, N; Amant, F; Van Holsbeke, C; Moerman, P; De Moor, B; Vergote, I; Timmerman, D

    2006-06-01

    Preoperative knowledge of the depth of myometrial infiltration is important in patients with endometrial carcinoma. This study aimed at assessing the value of histopathological parameters obtained from an endometrial biopsy (Pipelle de Cornier; results available preoperatively) and ultrasound measurements obtained after transvaginal sonography with color Doppler imaging in the preoperative prediction of the depth of myometrial invasion, as determined by the final histopathological examination of the hysterectomy specimen (the gold standard). We first collected ultrasound and histopathological data from 97 consecutive women with endometrial carcinoma and divided them into two groups according to surgical stage (Stages Ia and Ib vs. Stages Ic and higher). The areas (AUC) under the receiver-operating characteristics curves of the subjective assessment of depth of invasion by an experienced gynecologist and of the individual ultrasound parameters were calculated. Subsequently, we used these variables to train a logistic regression model and least squares support vector machines (LS-SVM) with linear and RBF (radial basis function) kernels. Finally, these models were validated prospectively on data from 76 new patients in order to make a preoperative prediction of the depth of invasion. Of all ultrasound parameters, the ratio of the endometrial and uterine volumes had the largest AUC (78%), while that of the subjective assessment was 79%. The AUCs of the blood flow indices were low (range, 51-64%). Stepwise logistic regression selected the degree of differentiation, the number of fibroids, the endometrial thickness and the volume of the tumor. Compared with the AUC of the subjective assessment (72%), prospective evaluation of the mathematical models resulted in a higher AUC for the LS-SVM model with an RBF kernel (77%), but this difference was not significant. Single morphological parameters do not improve the predictive power when compared with the subjective assessment of depth of myometrial invasion of endometrial cancer, and blood flow indices do not contribute to the prediction of stage. In this study an LS-SVM model with an RBF kernel gave the best prediction; while this might be more reliable than subjective assessment, confirmation by larger prospective studies is required. Copyright 2006 ISUOG. Published by John Wiley & Sons, Ltd.

  20. Predicting breast cancer using an expression values weighted clinical classifier.

    PubMed

    Thomas, Minta; De Brabanter, Kris; Suykens, Johan A K; De Moor, Bart

    2014-12-31

    Clinical data, such as patient history, laboratory analysis, ultrasound parameters-which are the basis of day-to-day clinical decision support-are often used to guide the clinical management of cancer in the presence of microarray data. Several data fusion techniques are available to integrate genomics or proteomics data, but only a few studies have created a single prediction model using both gene expression and clinical data. These studies often remain inconclusive regarding an obtained improvement in prediction performance. To improve clinical management, these data should be fully exploited. This requires efficient algorithms to integrate these data sets and design a final classifier. LS-SVM classifiers and generalized eigenvalue/singular value decompositions are successfully used in many bioinformatics applications for prediction tasks. While bringing up the benefits of these two techniques, we propose a machine learning approach, a weighted LS-SVM classifier to integrate two data sources: microarray and clinical parameters. We compared and evaluated the proposed methods on five breast cancer case studies. Compared to LS-SVM classifier on individual data sets, generalized eigenvalue decomposition (GEVD) and kernel GEVD, the proposed weighted LS-SVM classifier offers good prediction performance, in terms of test area under ROC Curve (AUC), on all breast cancer case studies. Thus a clinical classifier weighted with microarray data set results in significantly improved diagnosis, prognosis and prediction responses to therapy. The proposed model has been shown as a promising mathematical framework in both data fusion and non-linear classification problems.

  1. A PDF-based classification of gait cadence patterns in patients with amyotrophic lateral sclerosis.

    PubMed

    Wu, Yunfeng; Ng, Sin Chun

    2010-01-01

    Amyotrophic lateral sclerosis (ALS) is a type of neurological disease due to the degeneration of motor neurons. During the course of such a progressive disease, it would be difficult for ALS patients to regulate normal locomotion, so that the gait stability becomes perturbed. This paper presents a pilot statistical study on the gait cadence (or stride interval) in ALS, based on the statistical analysis method. The probability density functions (PDFs) of stride interval were first estimated with the nonparametric Parzen-window method. We computed the mean of the left-foot stride interval and the modified Kullback-Leibler divergence (MKLD) from the PDFs estimated. The analysis results suggested that both of these two statistical parameters were significantly altered in ALS, and the least-squares support vector machine (LS-SVM) may effectively distinguish the stride patterns between the ALS patients and healthy controls, with an accurate rate of 82.8% and an area of 0.87 under the receiver operating characteristic curve.

  2. Detection of Life Threatening Ventricular Arrhythmia Using Digital Taylor Fourier Transform.

    PubMed

    Tripathy, Rajesh K; Zamora-Mendez, Alejandro; de la O Serna, José A; Paternina, Mario R Arrieta; Arrieta, Juan G; Naik, Ganesh R

    2018-01-01

    Accurate detection and classification of life-threatening ventricular arrhythmia episodes such as ventricular fibrillation (VF) and rapid ventricular tachycardia (VT) from electrocardiogram (ECG) is a challenging problem for patient monitoring and defibrillation therapy. This paper introduces a novel method for detection and classification of life-threatening ventricular arrhythmia episodes. The ECG signal is decomposed into various oscillatory modes using digital Taylor-Fourier transform (DTFT). The magnitude feature and a novel phase feature namely the phase difference (PD) are evaluated from the mode Taylor-Fourier coefficients of ECG signal. The least square support vector machine (LS-SVM) classifier with linear and radial basis function (RBF) kernels is employed for detection and classification of VT vs. VF, non-shock vs. shock and VF vs. non-VF arrhythmia episodes. The accuracy, sensitivity, and specificity values obtained using the proposed method are 89.81, 86.38, and 93.97%, respectively for the classification of Non-VF and VF episodes. Comparison with the performance of the state-of-the-art features demonstrate the advantages of the proposition.

  3. Detection of Life Threatening Ventricular Arrhythmia Using Digital Taylor Fourier Transform

    PubMed Central

    Tripathy, Rajesh K.; Zamora-Mendez, Alejandro; de la O Serna, José A.; Paternina, Mario R. Arrieta; Arrieta, Juan G.; Naik, Ganesh R.

    2018-01-01

    Accurate detection and classification of life-threatening ventricular arrhythmia episodes such as ventricular fibrillation (VF) and rapid ventricular tachycardia (VT) from electrocardiogram (ECG) is a challenging problem for patient monitoring and defibrillation therapy. This paper introduces a novel method for detection and classification of life-threatening ventricular arrhythmia episodes. The ECG signal is decomposed into various oscillatory modes using digital Taylor-Fourier transform (DTFT). The magnitude feature and a novel phase feature namely the phase difference (PD) are evaluated from the mode Taylor-Fourier coefficients of ECG signal. The least square support vector machine (LS-SVM) classifier with linear and radial basis function (RBF) kernels is employed for detection and classification of VT vs. VF, non-shock vs. shock and VF vs. non-VF arrhythmia episodes. The accuracy, sensitivity, and specificity values obtained using the proposed method are 89.81, 86.38, and 93.97%, respectively for the classification of Non-VF and VF episodes. Comparison with the performance of the state-of-the-art features demonstrate the advantages of the proposition.

  4. Development of predictive models for total phenolics and free p-coumaric acid contents in barley grain by near-infrared spectroscopy.

    PubMed

    Han, Zhigang; Cai, Shengguan; Zhang, Xuelei; Qian, Qiufeng; Huang, Yuqing; Dai, Fei; Zhang, Guoping

    2017-07-15

    Barley grains are rich in phenolic compounds, which are associated with reduced risk of chronic diseases. Development of barley cultivars with high phenolic acid content has become one of the main objectives in breeding programs. A rapid and accurate method for measuring phenolic compounds would be helpful for crop breeding. We developed predictive models for both total phenolics (TPC) and p-coumaric acid (PA), based on near-infrared spectroscopy (NIRS) analysis. Regressions of partial least squares (PLS) and least squares support vector machine (LS-SVM) were compared for improving the models, and Monte Carlo-Uninformative Variable Elimination (MC-UVE) was applied to select informative wavelengths. The optimal calibration models generated high coefficients of correlation (r pre ) and ratio performance deviation (RPD) for TPC and PA. These results indicated the models are suitable for rapid determination of phenolic compounds in barley grains. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. [Near infrared spectroscopy study on water content in turbine oil].

    PubMed

    Chen, Bin; Liu, Ge; Zhang, Xian-Ming

    2013-11-01

    Near infrared (NIR) spectroscopy combined with successive projections algorithm (SPA) was investigated for determination of water content in turbine oil. Through the 57 samples of different water content in turbine oil scanned applying near infrared (NIR) spectroscopy, with the water content in the turbine oil of 0-0.156%, different pretreatment methods such as the original spectra, first derivative spectra and differential polynomial least squares fitting algorithm Savitzky-Golay (SG), and successive projections algorithm (SPA) were applied for the extraction of effective wavelengths, the correlation coefficient (R) and root mean square error (RMSE) were used as the model evaluation indices, accordingly water content in turbine oil was investigated. The results indicated that the original spectra with different water content in turbine oil were pretreated by the performance of first derivative + SG pretreatments, then the selected effective wavelengths were used as the inputs of least square support vector machine (LS-SVM). A total of 16 variables selected by SPA were employed to construct the model of SPA and least square support vector machine (SPA-LS-SVM). There is 9 as The correlation coefficient was 0.975 9 and the root of mean square error of validation set was 2.655 8 x 10(-3) using the model, and it is feasible to determine the water content in oil using near infrared spectroscopy and SPA-LS-SVM, and an excellent prediction precision was obtained. This study supplied a new and alternative approach to the further application of near infrared spectroscopy in on-line monitoring of contamination such as water content in oil.

  6. [Identification of varieties of textile fibers by using Vis/NIR infrared spectroscopy technique].

    PubMed

    Wu, Gui-Fang; He, Yong

    2010-02-01

    The aim of the present paper was to provide new insight into Vis/NIR spectroscopic analysis of textile fibers. In order to achieve rapid identification of the varieties of fibers, the authors selected 5 kinds of fibers of cotton, flax, wool, silk and tencel to do a study with Vis/NIR spectroscopy. Firstly, the spectra of each kind of fiber were scanned by spectrometer, and principal component analysis (PCA) method was used to analyze the characteristics of the pattern of Vis/NIR spectra. Principal component scores scatter plot (PC1 x PC2 x PC3) of fiber indicated the classification effect of five varieties of fibers. The former 6 principal components (PCs) were selected according to the quantity and size of PCs. The PCA classification model was optimized by using the least-squares support vector machines (LS-SVM) method. The authors used the 6 PCs extracted by PCA as the inputs of LS-SVM, and PCA-LS-SVM model was built to achieve varieties validation as well as mathematical model building and optimization analysis. Two hundred samples (40 samples for each variety of fibers) of five varieties of fibers were used for calibration of PCA-LS-SVM model, and the other 50 samples (10 samples for each variety of fibers) were used for validation. The result of validation showed that Vis/NIR spectroscopy technique based on PCA-LS-SVM had a powerful classification capability. It provides a new method for identifying varieties of fibers rapidly and real time, so it has important significance for protecting the rights of consumers, ensuring the quality of textiles, and implementing rationalization production and transaction of textile materials and its production.

  7. Pan evaporation modeling using six different heuristic computing methods in different climates of China

    NASA Astrophysics Data System (ADS)

    Wang, Lunche; Kisi, Ozgur; Zounemat-Kermani, Mohammad; Li, Hui

    2017-01-01

    Pan evaporation (Ep) plays important roles in agricultural water resources management. One of the basic challenges is modeling Ep using limited climatic parameters because there are a number of factors affecting the evaporation rate. This study investigated the abilities of six different soft computing methods, multi-layer perceptron (MLP), generalized regression neural network (GRNN), fuzzy genetic (FG), least square support vector machine (LSSVM), multivariate adaptive regression spline (MARS), adaptive neuro-fuzzy inference systems with grid partition (ANFIS-GP), and two regression methods, multiple linear regression (MLR) and Stephens and Stewart model (SS) in predicting monthly Ep. Long-term climatic data at various sites crossing a wide range of climates during 1961-2000 are used for model development and validation. The results showed that the models have different accuracies in different climates and the MLP model performed superior to the other models in predicting monthly Ep at most stations using local input combinations (for example, the MAE (mean absolute errors), RMSE (root mean square errors), and determination coefficient (R2) are 0.314 mm/day, 0.405 mm/day and 0.988, respectively for HEB station), while GRNN model performed better in Tibetan Plateau (MAE, RMSE and R2 are 0.459 mm/day, 0.592 mm/day and 0.932, respectively). The accuracies of above models ranked as: MLP, GRNN, LSSVM, FG, ANFIS-GP, MARS and MLR. The overall results indicated that the soft computing techniques generally performed better than the regression methods, but MLR and SS models can be more preferred at some climatic zones instead of complex nonlinear models, for example, the BJ (Beijing), CQ (Chongqing) and HK (Haikou) stations. Therefore, it can be concluded that Ep could be successfully predicted using above models in hydrological modeling studies.

  8. A PCA aided cross-covariance scheme for discriminative feature extraction from EEG signals.

    PubMed

    Zarei, Roozbeh; He, Jing; Siuly, Siuly; Zhang, Yanchun

    2017-07-01

    Feature extraction of EEG signals plays a significant role in Brain-computer interface (BCI) as it can significantly affect the performance and the computational time of the system. The main aim of the current work is to introduce an innovative algorithm for acquiring reliable discriminating features from EEG signals to improve classification performances and to reduce the time complexity. This study develops a robust feature extraction method combining the principal component analysis (PCA) and the cross-covariance technique (CCOV) for the extraction of discriminatory information from the mental states based on EEG signals in BCI applications. We apply the correlation based variable selection method with the best first search on the extracted features to identify the best feature set for characterizing the distribution of mental state signals. To verify the robustness of the proposed feature extraction method, three machine learning techniques: multilayer perceptron neural networks (MLP), least square support vector machine (LS-SVM), and logistic regression (LR) are employed on the obtained features. The proposed methods are evaluated on two publicly available datasets. Furthermore, we evaluate the performance of the proposed methods by comparing it with some recently reported algorithms. The experimental results show that all three classifiers achieve high performance (above 99% overall classification accuracy) for the proposed feature set. Among these classifiers, the MLP and LS-SVM methods yield the best performance for the obtained feature. The average sensitivity, specificity and classification accuracy for these two classifiers are same, which are 99.32%, 100%, and 99.66%, respectively for the BCI competition dataset IVa and 100%, 100%, and 100%, for the BCI competition dataset IVb. The results also indicate the proposed methods outperform the most recently reported methods by at least 0.25% average accuracy improvement in dataset IVa. The execution time results show that the proposed method has less time complexity after feature selection. The proposed feature extraction method is very effective for getting representatives information from mental states EEG signals in BCI applications and reducing the computational complexity of classifiers by reducing the number of extracted features. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Switching and optimizing control for coal flotation process based on a hybrid model

    PubMed Central

    Dong, Zhiyong; Wang, Ranfeng; Fan, Minqiang; Fu, Xiang

    2017-01-01

    Flotation is an important part of coal preparation, and the flotation column is widely applied as efficient flotation equipment. This process is complex and affected by many factors, with the froth depth and reagent dosage being two of the most important and frequently manipulated variables. This paper proposes a new method of switching and optimizing control for the coal flotation process. A hybrid model is built and evaluated using industrial data. First, wavelet analysis and principal component analysis (PCA) are applied for signal pre-processing. Second, a control model for optimizing the set point of the froth depth is constructed based on fuzzy control, and a control model is designed to optimize the reagent dosages based on expert system. Finally, the least squares-support vector machine (LS-SVM) is used to identify the operating conditions of the flotation process and to select one of the two models (froth depth or reagent dosage) for subsequent operation according to the condition parameters. The hybrid model is developed and evaluated on an industrial coal flotation column and exhibits satisfactory performance. PMID:29040305

  10. Iterative variational mode decomposition based automated detection of glaucoma using fundus images.

    PubMed

    Maheshwari, Shishir; Pachori, Ram Bilas; Kanhangad, Vivek; Bhandary, Sulatha V; Acharya, U Rajendra

    2017-09-01

    Glaucoma is one of the leading causes of permanent vision loss. It is an ocular disorder caused by increased fluid pressure within the eye. The clinical methods available for the diagnosis of glaucoma require skilled supervision. They are manual, time consuming, and out of reach of common people. Hence, there is a need for an automated glaucoma diagnosis system for mass screening. In this paper, we present a novel method for an automated diagnosis of glaucoma using digital fundus images. Variational mode decomposition (VMD) method is used in an iterative manner for image decomposition. Various features namely, Kapoor entropy, Renyi entropy, Yager entropy, and fractal dimensions are extracted from VMD components. ReliefF algorithm is used to select the discriminatory features and these features are then fed to the least squares support vector machine (LS-SVM) for classification. Our proposed method achieved classification accuracies of 95.19% and 94.79% using three-fold and ten-fold cross-validation strategies, respectively. This system can aid the ophthalmologists in confirming their manual reading of classes (glaucoma or normal) using fundus images. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Knee Joint Vibration Signal Analysis with Matching Pursuit Decomposition and Dynamic Weighted Classifier Fusion

    PubMed Central

    Cai, Suxian; Yang, Shanshan; Zheng, Fang; Lu, Meng; Wu, Yunfeng; Krishnan, Sridhar

    2013-01-01

    Analysis of knee joint vibration (VAG) signals can provide quantitative indices for detection of knee joint pathology at an early stage. In addition to the statistical features developed in the related previous studies, we extracted two separable features, that is, the number of atoms derived from the wavelet matching pursuit decomposition and the number of significant signal turns detected with the fixed threshold in the time domain. To perform a better classification over the data set of 89 VAG signals, we applied a novel classifier fusion system based on the dynamic weighted fusion (DWF) method to ameliorate the classification performance. For comparison, a single leastsquares support vector machine (LS-SVM) and the Bagging ensemble were used for the classification task as well. The results in terms of overall accuracy in percentage and area under the receiver operating characteristic curve obtained with the DWF-based classifier fusion method reached 88.76% and 0.9515, respectively, which demonstrated the effectiveness and superiority of the DWF method with two distinct features for the VAG signal analysis. PMID:23573175

  12. Kernel spectral clustering with memory effect

    NASA Astrophysics Data System (ADS)

    Langone, Rocco; Alzate, Carlos; Suykens, Johan A. K.

    2013-05-01

    Evolving graphs describe many natural phenomena changing over time, such as social relationships, trade markets, metabolic networks etc. In this framework, performing community detection and analyzing the cluster evolution represents a critical task. Here we propose a new model for this purpose, where the smoothness of the clustering results over time can be considered as a valid prior knowledge. It is based on a constrained optimization formulation typical of Least Squares Support Vector Machines (LS-SVM), where the objective function is designed to explicitly incorporate temporal smoothness. The latter allows the model to cluster the current data well and to be consistent with the recent history. We also propose new model selection criteria in order to carefully choose the hyper-parameters of our model, which is a crucial issue to achieve good performances. We successfully test the model on four toy problems and on a real world network. We also compare our model with Evolutionary Spectral Clustering, which is a state-of-the-art algorithm for community detection of evolving networks, illustrating that the kernel spectral clustering with memory effect can achieve better or equal performances.

  13. Multimodal data and machine learning for surgery outcome prediction in complicated cases of mesial temporal lobe epilepsy.

    PubMed

    Memarian, Negar; Kim, Sally; Dewar, Sandra; Engel, Jerome; Staba, Richard J

    2015-09-01

    This study sought to predict postsurgical seizure freedom from pre-operative diagnostic test results and clinical information using a rapid automated approach, based on supervised learning methods in patients with drug-resistant focal seizures suspected to begin in temporal lobe. We applied machine learning, specifically a combination of mutual information-based feature selection and supervised learning classifiers on multimodal data, to predict surgery outcome retrospectively in 20 presurgical patients (13 female; mean age±SD, in years 33±9.7 for females, and 35.3±9.4 for males) who were diagnosed with mesial temporal lobe epilepsy (MTLE) and subsequently underwent standard anteromesial temporal lobectomy. The main advantage of the present work over previous studies is the inclusion of the extent of ipsilateral neocortical gray matter atrophy and spatiotemporal properties of depth electrode-recorded seizures as training features for individual patient surgery planning. A maximum relevance minimum redundancy (mRMR) feature selector identified the following features as the most informative predictors of postsurgical seizure freedom in this study's sample of patients: family history of epilepsy, ictal EEG onset pattern (positive correlation with seizure freedom), MRI-based gray matter thickness reduction in the hemisphere ipsilateral to seizure onset, proportion of seizures that first appeared in ipsilateral amygdala to total seizures, age, epilepsy duration, delay in the spread of ipsilateral ictal discharges from site of onset, gender, and number of electrode contacts at seizure onset (negative correlation with seizure freedom). Using these features in combination with a least square support vector machine (LS-SVM) classifier compared to other commonly used classifiers resulted in very high surgical outcome prediction accuracy (95%). Supervised machine learning using multimodal compared to unimodal data accurately predicted postsurgical outcome in patients with atypical MTLE. Published by Elsevier Ltd.

  14. Scoliosis curve type classification using kernel machine from 3D trunk image

    NASA Astrophysics Data System (ADS)

    Adankon, Mathias M.; Dansereau, Jean; Parent, Stefan; Labelle, Hubert; Cheriet, Farida

    2012-03-01

    Adolescent idiopathic scoliosis (AIS) is a deformity of the spine manifested by asymmetry and deformities of the external surface of the trunk. Classification of scoliosis deformities according to curve type is used to plan management of scoliosis patients. Currently, scoliosis curve type is determined based on X-ray exam. However, cumulative exposure to X-rays radiation significantly increases the risk for certain cancer. In this paper, we propose a robust system that can classify the scoliosis curve type from non invasive acquisition of 3D trunk surface of the patients. The 3D image of the trunk is divided into patches and local geometric descriptors characterizing the surface of the back are computed from each patch and forming the features. We perform the reduction of the dimensionality by using Principal Component Analysis and 53 components were retained. In this work a multi-class classifier is built with Least-squares support vector machine (LS-SVM) which is a kernel classifier. For this study, a new kernel was designed in order to achieve a robust classifier in comparison with polynomial and Gaussian kernel. The proposed system was validated using data of 103 patients with different scoliosis curve types diagnosed and classified by an orthopedic surgeon from the X-ray images. The average rate of successful classification was 93.3% with a better rate of prediction for the major thoracic and lumbar/thoracolumbar types.

  15. Evaluation of Chilling Injury in Mangoes Using Multispectral Imaging.

    PubMed

    Hashim, Norhashila; Onwude, Daniel I; Osman, Muhamad Syafiq

    2018-05-01

    Commodities originating from tropical and subtropical climes are prone to chilling injury (CI). This injury could affect the quality and marketing potential of mango after harvest. This will later affect the quality of the produce and subsequent consumer acceptance. In this study, the appearance of CI symptoms in mango was evaluated non-destructively using multispectral imaging. The fruit were stored at 4 °C to induce CI and 12 °C to preserve the quality of the control samples for 4 days before they were taken out and stored at ambient temperature for 24 hr. Measurements using multispectral imaging and standard reference methods were conducted before and after storage. The performance of multispectral imaging was compared using standard reference properties including moisture content (MC), total soluble solids (TSS) content, firmness, pH, and color. Least square support vector machine (LS-SVM) combined with principal component analysis (PCA) were used to discriminate CI samples with those of control and before storage, respectively. The statistical results demonstrated significant changes in the reference quality properties of samples before and after storage. The results also revealed that multispectral parameters have a strong correlation with the reference parameters of L * , a * , TSS, and MC. The MC and L * were found to be the best reference parameters in identifying the severity of CI in mangoes. PCA and LS-SVM analysis indicated that the fruit were successfully classified into their categories, that is, before storage, control, and CI. This indicated that the multispectral imaging technique is feasible for detecting CI in mangoes during postharvest storage and processing. This paper demonstrates a fast, easy, and accurate method of identifying the effect of cold storage on mango, nondestructively. The method presented in this paper can be used industrially to efficiently differentiate different fruits from each other after low temperature storage. © 2018 Institute of Food Technologists®.

  16. Hyperspectral imaging for predicting the allicin and soluble solid content of garlic with variable selection algorithms and chemometric models.

    PubMed

    Rahman, Anisur; Faqeerzada, Mohammad A; Cho, Byoung-Kwan

    2018-03-14

    Allicin and soluble solid content (SSC) in garlic is the responsible for its pungent flavor and odor. However, current conventional methods such as the use of high-pressure liquid chromatography and a refractometer have critical drawbacks in that they are time-consuming, labor-intensive and destructive procedures. The present study aimed to predict allicin and SSC in garlic using hyperspectral imaging in combination with variable selection algorithms and calibration models. Hyperspectral images of 100 garlic cloves were acquired that covered two spectral ranges, from which the mean spectra of each clove were extracted. The calibration models included partial least squares (PLS) and least squares-support vector machine (LS-SVM) regression, as well as different spectral pre-processing techniques, from which the highest performing spectral preprocessing technique and spectral range were selected. Then, variable selection methods, such as regression coefficients, variable importance in projection (VIP) and the successive projections algorithm (SPA), were evaluated for the selection of effective wavelengths (EWs). Furthermore, PLS and LS-SVM regression methods were applied to quantitatively predict the quality attributes of garlic using the selected EWs. Of the established models, the SPA-LS-SVM model obtained an Rpred2 of 0.90 and standard error of prediction (SEP) of 1.01% for SSC prediction, whereas the VIP-LS-SVM model produced the best result with an Rpred2 of 0.83 and SEP of 0.19 mg g -1 for allicin prediction in the range 1000-1700 nm. Furthermore, chemical images of garlic were developed using the best predictive model to facilitate visualization of the spatial distributions of allicin and SSC. The present study clearly demonstrates that hyperspectral imaging combined with an appropriate chemometrics method can potentially be employed as a fast, non-invasive method to predict the allicin and SSC in garlic. © 2018 Society of Chemical Industry. © 2018 Society of Chemical Industry.

  17. [Fast catalogue of alien invasive weeds by Vis/NIR spectroscopy].

    PubMed

    Yu, Jia-Jia; Zou, Wei; He, Yong; Xu, Zheng-Hao

    2009-11-01

    The feasibility of visible and short-wave near-infrared spectroscopy (VIS/WNIR) techniques as means for the nondestructive and fast detection of alien invasive weeds was evaluated. Selected sensitive bands were found validated. In the present study, 3 kinds of alien invasive weeds, Veronica persica, Veronica polita, and Veronica arvensis Linn, and one kind of local weed, Lamiaceae amplexicaule Linn, were employed. The results showed that visible and NIR (Vis/NIR) technology could be introduced in classification of the alien invasive weeds or local weed with the similar outline. Thirty x 4 weeds samples were randomly selected for the calibration set, while the remaining 20 x 4 samples for the prediction set. Smoothing methods of moving average and standard normal variate (SNV) were used to pretreat spectra data. Based on principal components analysis, soft independent models of class analogy (SIMCA) were applied to make the model. Four frontal principal components of each catalogues were applied as the input of SIMCA, and with a significance level of 0.05, recognition ratio of 78.75% was obtained. The average prediction result is 90% except for Veronica polita. According to the modeling power of each spectra data in SIMCA, some possible sensitive bands, 496-521, 589-626 and 789-926 nm, were founded. By using these possible sensitive bands as the inputs of least squares support vector machine (LS-SVM), and setting the result of LS-SVM as the object function value of genetic algorithm (GA), mutational rate, crossover rate and population size were set up as 0.9, 0.5 and 50 respectively. Finally recognition ratio of 95.63% was obtained. The prediction results of 95.63% indicated that the selected wavelengths reflected the main characteristics of the four weeds, which proposed a new way to accelerate the research on cataloguing alien invasive weeds.

  18. Early detection of germinated wheat grains using terahertz image and chemometrics

    NASA Astrophysics Data System (ADS)

    Jiang, Yuying; Ge, Hongyi; Lian, Feiyu; Zhang, Yuan; Xia, Shanhong

    2016-02-01

    In this paper, we propose a feasible tool that uses a terahertz (THz) imaging system for identifying wheat grains at different stages of germination. The THz spectra of the main changed components of wheat grains, maltose and starch, which were obtained by THz time spectroscopy, were distinctly different. Used for original data compression and feature extraction, principal component analysis (PCA) revealed the changes that occurred in the inner chemical structure during germination. Two thresholds, one indicating the start of the release of α-amylase and the second when it reaches the steady state, were obtained through the first five score images. Thus, the first five PCs were input for the partial least-squares regression (PLSR), least-squares support vector machine (LS-SVM), and back-propagation neural network (BPNN) models, which were used to classify seven different germination times between 0 and 48 h, with a prediction accuracy of 92.85%, 93.57%, and 90.71%, respectively. The experimental results indicated that the combination of THz imaging technology and chemometrics could be a new effective way to discriminate wheat grains at the early germination stage of approximately 6 h.

  19. A Non-destructive Terahertz Spectroscopy-Based Method for Transgenic Rice Seed Discrimination via Sparse Representation

    NASA Astrophysics Data System (ADS)

    Hu, Xiaohua; Lang, Wenhui; Liu, Wei; Xu, Xue; Yang, Jianbo; Zheng, Lei

    2017-08-01

    Terahertz (THz) spectroscopy technique has been researched and developed for rapid and non-destructive detection of food safety and quality due to its low-energy and non-ionizing characteristics. The objective of this study was to develop a flexible identification model to discriminate transgenic and non-transgenic rice seeds based on terahertz (THz) spectroscopy. To extract THz spectral features and reduce the feature dimension, sparse representation (SR) is employed in this work. A sufficient sparsity level is selected to train the sparse coding of the THz data, and the random forest (RF) method is then applied to obtain a discrimination model. The results show that there exist differences between transgenic and non-transgenic rice seeds in THz spectral band and, comparing with Least squares support vector machines (LS-SVM) method, SR-RF is a better model for discrimination (accuracy is 95% in prediction set, 100% in calibration set, respectively). The conclusion is that SR may be more useful in the application of THz spectroscopy to reduce dimension and the SR-RF provides a new, effective, and flexible method for detection and identification of transgenic and non-transgenic rice seeds with THz spectral system.

  20. [Identification of Pummelo Cultivars Based on Hyperspectral Imaging Technology].

    PubMed

    Li, Xun-lan; Yi, Shi-lai; He, Shao-lan; Lü, Qiang; Xie, Rang-jin; Zheng, Yong-qiang; Deng, Lie

    2015-09-01

    Existing methods for the identification of pummelo cultivars are usually time-consuming and costly, and are therefore inconvenient to be used in cases that a rapid identification is needed. This research was aimed at identifying different pummelo cultivars by hyperspectral imaging technology which can achieve a rapid and highly sensitive measurement. A total of 240 leaf samples, 60 for each of the four cultivars were investigated. Samples were divided into two groups such as calibration set (48 samples of each cultivar) and validation set (12 samples of each cultivar) by a Kennard-Stone-based algorithm. Hyperspectral images of both adaxial and abaxial surfaces of each leaf were obtained, and were segmented into a region of interest (ROI) using a simple threshold. Spectra of leaf samples were extracted from ROI. To remove the absolute noises of the spectra, only the date of spectral range 400~1000 nm was used for analysis. Multiplicative scatter correction (MSC) and standard normal variable (SNV) were utilized for data preprocessing. Principal component analysis (PCA) was used to extract the best principal components, and successive projections algorithm (SPA) was used to extract the effective wavelengths. Least squares support vector machine (LS-SVM) was used to obtain the discrimination model of the four different pummelo cultivars. To find out the optimal values of σ2 and γ which were important parameters in LS-SVM modeling, Grid-search technique and Cross-Validation were applied. The first 10 and 11 principal components were extracted by PCA for the hyperspectral data of adaxial surface and abaxial surface, respectively. There were 31 and 21 effective wavelengths selected by SPA based on the hyperspectral data of adaxial surface and abaxial surface, respectively. The best principal components and the effective wavelengths were used as inputs of LS-SVM models, and then the PCA-LS-SVM model and the SPA-LS-SVM model were built. The results showed that 99.46% and 98.44% of identification accuracy was achieved in the calibration set for the PCA-LS-SVM model and the SPA-LS-SVM model, respectively, and a 95.83% of identification accuracy was achieved in the validation set for both the PCA-LS-SVM and the SPA- LS-SVM models, which were built based on the hyperspectral data of adaxial surface. Comparatively, the results of the PCA-LS-SVM and the SPA-LS-SVM models built based on the hyperspectral data of abaxial surface both achieved identification accuracies of 100% for both calibration set and validation set. The overall results demonstrated that use of hyperspectral data of adaxial and abaxial leaf surfaces coupled with the use of PCA-LS-SVM and the SPA-LS-SVM could achieve an accurate identification of pummelo cultivars. It was feasible to use hyperspectral imaging technology to identify different pummelo cultivars, and hyperspectral imaging technology provided an alternate way of rapid identification of pummelo cultivars. Moreover, the results in this paper demonstrated that the data from the abaxial surface of leaf was more sensitive in identifying pummelo cultivars. This study provided a new method for to the fast discrimination of pummelo cultivars.

  1. CCD-Based Skinning Injury Recognition on Potato Tubers (Solanum tuberosum L.): A Comparison between Visible and Biospeckle Imaging

    PubMed Central

    Gao, Yingwang; Geng, Jinfeng; Rao, Xiuqin; Ying, Yibin

    2016-01-01

    Skinning injury on potato tubers is a kind of superficial wound that is generally inflicted by mechanical forces during harvest and postharvest handling operations. Though skinning injury is pervasive and obstructive, its detection is very limited. This study attempted to identify injured skin using two CCD (Charge Coupled Device) sensor-based machine vision technologies, i.e., visible imaging and biospeckle imaging. The identification of skinning injury was realized via exploiting features extracted from varied ROIs (Region of Interests). The features extracted from visible images were pixel-wise color and texture features, while region-wise BA (Biospeckle Activity) was calculated from biospeckle imaging. In addition, the calculation of BA using varied numbers of speckle patterns were compared. Finally, extracted features were implemented into classifiers of LS-SVM (Least Square Support Vector Machine) and BLR (Binary Logistic Regression), respectively. Results showed that color features performed better than texture features in classifying sound skin and injured skin, especially for injured skin stored no less than 1 day, with the average classification accuracy of 90%. Image capturing and processing efficiency can be speeded up in biospeckle imaging, with captured 512 frames reduced to 125 frames. Classification results obtained based on the feature of BA were acceptable for early skinning injury stored within 1 day, with the accuracy of 88.10%. It is concluded that skinning injury can be recognized by visible and biospeckle imaging during different stages. Visible imaging has the aptitude in recognizing stale skinning injury, while fresh injury can be discriminated by biospeckle imaging. PMID:27763555

  2. CCD-Based Skinning Injury Recognition on Potato Tubers (Solanum tuberosum L.): A Comparison between Visible and Biospeckle Imaging.

    PubMed

    Gao, Yingwang; Geng, Jinfeng; Rao, Xiuqin; Ying, Yibin

    2016-10-18

    Skinning injury on potato tubers is a kind of superficial wound that is generally inflicted by mechanical forces during harvest and postharvest handling operations. Though skinning injury is pervasive and obstructive, its detection is very limited. This study attempted to identify injured skin using two CCD (Charge Coupled Device) sensor-based machine vision technologies, i.e., visible imaging and biospeckle imaging. The identification of skinning injury was realized via exploiting features extracted from varied ROIs (Region of Interests). The features extracted from visible images were pixel-wise color and texture features, while region-wise BA (Biospeckle Activity) was calculated from biospeckle imaging. In addition, the calculation of BA using varied numbers of speckle patterns were compared. Finally, extracted features were implemented into classifiers of LS-SVM (Least Square Support Vector Machine) and BLR (Binary Logistic Regression), respectively. Results showed that color features performed better than texture features in classifying sound skin and injured skin, especially for injured skin stored no less than 1 day, with the average classification accuracy of 90%. Image capturing and processing efficiency can be speeded up in biospeckle imaging, with captured 512 frames reduced to 125 frames. Classification results obtained based on the feature of BA were acceptable for early skinning injury stored within 1 day, with the accuracy of 88.10%. It is concluded that skinning injury can be recognized by visible and biospeckle imaging during different stages. Visible imaging has the aptitude in recognizing stale skinning injury, while fresh injury can be discriminated by biospeckle imaging.

  3. Application of the Teager-Kaiser energy operator in bearing fault diagnosis.

    PubMed

    Henríquez Rodríguez, Patricia; Alonso, Jesús B; Ferrer, Miguel A; Travieso, Carlos M

    2013-03-01

    Condition monitoring of rotating machines is important in the prevention of failures. As most machine malfunctions are related to bearing failures, several bearing diagnosis techniques have been developed. Some of them feature the bearing vibration signal with statistical measures and others extract the bearing fault characteristic frequency from the AM component of the vibration signal. In this paper, we propose to transform the vibration signal to the Teager-Kaiser domain and feature it with statistical and energy-based measures. A bearing database with normal and faulty bearings is used. The diagnosis is performed with two classifiers: a neural network classifier and a LS-SVM classifier. Experiments show that the Teager domain features outperform those based on the temporal or AM signal. Copyright © 2012 ISA. Published by Elsevier Ltd. All rights reserved.

  4. Detection of Soil Nitrogen Using Near Infrared Sensors Based on Soil Pretreatment and Algorithms

    PubMed Central

    Nie, Pengcheng; Dong, Tao; He, Yong; Qu, Fangfang

    2017-01-01

    Soil nitrogen content is one of the important growth nutrient parameters of crops. It is a prerequisite for scientific fertilization to accurately grasp soil nutrient information in precision agriculture. The information about nutrients such as nitrogen in the soil can be obtained quickly by using a near-infrared sensor. The data can be analyzed in the detection process, which is nondestructive and non-polluting. In order to investigate the effect of soil pretreatment on nitrogen content by near infrared sensor, 16 nitrogen concentrations were mixed with soil and the soil samples were divided into three groups with different pretreatment. The first group of soil samples with strict pretreatment were dried, ground, sieved and pressed. The second group of soil samples were dried and ground. The third group of soil samples were simply dried. Three linear different modeling methods are used to analyze the spectrum, including partial least squares (PLS), uninformative variable elimination (UVE), competitive adaptive reweighted algorithm (CARS). The model of nonlinear partial least squares which supports vector machine (LS-SVM) is also used to analyze the soil reflectance spectrum. The results show that the soil samples with strict pretreatment have the best accuracy in predicting nitrogen content by near-infrared sensor, and the pretreatment method is suitable for practical application. PMID:28492480

  5. Successive Projections Algorithm-Multivariable Linear Regression Classifier for the Detection of Contaminants on Chicken Carcasses in Hyperspectral Images

    NASA Astrophysics Data System (ADS)

    Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.

    2017-07-01

    During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor consuming and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA , and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares supported vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.

  6. Application of a novel hybrid method for spatiotemporal data imputation: A case study of the Minqin County groundwater level

    NASA Astrophysics Data System (ADS)

    Zhang, Zhongrong; Yang, Xuan; Li, Hao; Li, Weide; Yan, Haowen; Shi, Fei

    2017-10-01

    The techniques for data analyses have been widely developed in past years, however, missing data still represent a ubiquitous problem in many scientific fields. In particular, dealing with missing spatiotemporal data presents an enormous challenge. Nonetheless, in recent years, a considerable amount of research has focused on spatiotemporal problems, making spatiotemporal missing data imputation methods increasingly indispensable. In this paper, a novel spatiotemporal hybrid method is proposed to verify and imputed spatiotemporal missing values. This new method, termed SOM-FLSSVM, flexibly combines three advanced techniques: self-organizing feature map (SOM) clustering, the fruit fly optimization algorithm (FOA) and the least squares support vector machine (LSSVM). We employ a cross-validation (CV) procedure and FOA swarm intelligence optimization strategy that can search available parameters and determine the optimal imputation model. The spatiotemporal underground water data for Minqin County, China, were selected to test the reliability and imputation ability of SOM-FLSSVM. We carried out a validation experiment and compared three well-studied models with SOM-FLSSVM using a different missing data ratio from 0.1 to 0.8 in the same data set. The results demonstrate that the new hybrid method performs well in terms of both robustness and accuracy for spatiotemporal missing data.

  7. Application of Hyperspectral Imaging and Chemometric Calibrations for Variety Discrimination of Maize Seeds

    PubMed Central

    Zhang, Xiaolei; Liu, Fei; He, Yong; Li, Xiaoli

    2012-01-01

    Hyperspectral imaging in the visible and near infrared (VIS-NIR) region was used to develop a novel method for discriminating different varieties of commodity maize seeds. Firstly, hyperspectral images of 330 samples of six varieties of maize seeds were acquired using a hyperspectral imaging system in the 380–1,030 nm wavelength range. Secondly, principal component analysis (PCA) and kernel principal component analysis (KPCA) were used to explore the internal structure of the spectral data. Thirdly, three optimal wavelengths (523, 579 and 863 nm) were selected by implementing PCA directly on each image. Then four textural variables including contrast, homogeneity, energy and correlation were extracted from gray level co-occurrence matrix (GLCM) of each monochromatic image based on the optimal wavelengths. Finally, several models for maize seeds identification were established by least squares-support vector machine (LS-SVM) and back propagation neural network (BPNN) using four different combinations of principal components (PCs), kernel principal components (KPCs) and textural features as input variables, respectively. The recognition accuracy achieved in the PCA-GLCM-LS-SVM model (98.89%) was the most satisfactory one. We conclude that hyperspectral imaging combined with texture analysis can be implemented for fast classification of different varieties of maize seeds. PMID:23235456

  8. The assisted prediction modelling frame with hybridisation and ensemble for business risk forecasting and an implementation

    NASA Astrophysics Data System (ADS)

    Li, Hui; Hong, Lu-Yao; Zhou, Qing; Yu, Hai-Jie

    2015-08-01

    The business failure of numerous companies results in financial crises. The high social costs associated with such crises have made people to search for effective tools for business risk prediction, among which, support vector machine is very effective. Several modelling means, including single-technique modelling, hybrid modelling, and ensemble modelling, have been suggested in forecasting business risk with support vector machine. However, existing literature seldom focuses on the general modelling frame for business risk prediction, and seldom investigates performance differences among different modelling means. We reviewed researches on forecasting business risk with support vector machine, proposed the general assisted prediction modelling frame with hybridisation and ensemble (APMF-WHAE), and finally, investigated the use of principal components analysis, support vector machine, random sampling, and group decision, under the general frame in forecasting business risk. Under the APMF-WHAE frame with support vector machine as the base predictive model, four specific predictive models were produced, namely, pure support vector machine, a hybrid support vector machine involved with principal components analysis, a support vector machine ensemble involved with random sampling and group decision, and an ensemble of hybrid support vector machine using group decision to integrate various hybrid support vector machines on variables produced from principle components analysis and samples from random sampling. The experimental results indicate that hybrid support vector machine and ensemble of hybrid support vector machines were able to produce dominating performance than pure support vector machine and support vector machine ensemble.

  9. Kernel learning at the first level of inference.

    PubMed

    Cawley, Gavin C; Talbot, Nicola L C

    2014-05-01

    Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. Hyperspectral Imaging for Predicting the Internal Quality of Kiwifruits Based on Variable Selection Algorithms and Chemometric Models.

    PubMed

    Zhu, Hongyan; Chu, Bingquan; Fan, Yangyang; Tao, Xiaoya; Yin, Wenxin; He, Yong

    2017-08-10

    We investigated the feasibility and potentiality of determining firmness, soluble solids content (SSC), and pH in kiwifruits using hyperspectral imaging, combined with variable selection methods and calibration models. The images were acquired by a push-broom hyperspectral reflectance imaging system covering two spectral ranges. Weighted regression coefficients (BW), successive projections algorithm (SPA) and genetic algorithm-partial least square (GAPLS) were compared and evaluated for the selection of effective wavelengths. Moreover, multiple linear regression (MLR), partial least squares regression and least squares support vector machine (LS-SVM) were developed to predict quality attributes quantitatively using effective wavelengths. The established models, particularly SPA-MLR, SPA-LS-SVM and GAPLS-LS-SVM, performed well. The SPA-MLR models for firmness (R pre  = 0.9812, RPD = 5.17) and SSC (R pre  = 0.9523, RPD = 3.26) at 380-1023 nm showed excellent performance, whereas GAPLS-LS-SVM was the optimal model at 874-1734 nm for predicting pH (R pre  = 0.9070, RPD = 2.60). Image processing algorithms were developed to transfer the predictive model in every pixel to generate prediction maps that visualize the spatial distribution of firmness and SSC. Hence, the results clearly demonstrated that hyperspectral imaging has the potential as a fast and non-invasive method to predict the quality attributes of kiwifruits.

  11. Potential of Visible and Near Infrared Spectroscopy and Pattern Recognition for Rapid Quantification of Notoginseng Powder with Adulterants

    PubMed Central

    Nie, Pengcheng; Wu, Di; Sun, Da-Wen; Cao, Fang; Bao, Yidan; He, Yong

    2013-01-01

    Notoginseng is a classical traditional Chinese medical herb, which is of high economic and medical value. Notoginseng powder (NP) could be easily adulterated with Sophora flavescens powder (SFP) or corn flour (CF), because of their similar tastes and appearances and much lower cost for these adulterants. The objective of this study is to quantify the NP content in adulterated NP by using a rapid and non-destructive visible and near infrared (Vis-NIR) spectroscopy method. Three wavelength ranges of visible spectra, short-wave near infrared spectra (SNIR) and long-wave near infrared spectra (LNIR) were separately used to establish the model based on two calibration methods of partial least square regression (PLSR) and least-squares support vector machines (LS-SVM), respectively. Competitive adaptive reweighted sampling (CARS) was conducted to identify the most important wavelengths/variables that had the greatest influence on the adulterant quantification throughout the whole wavelength range. The CARS-PLSR models based on LNIR were determined as the best models for the quantification of NP adulterated with SFP, CF, and their mixtures, in which the rP values were 0.940, 0.939, and 0.867 for the three models respectively. The research demonstrated the potential of the Vis-NIR spectroscopy technique for the rapid and non-destructive quantification of NP containing adulterants. PMID:24129019

  12. Research on bearing fault diagnosis of large machinery based on mathematical morphology

    NASA Astrophysics Data System (ADS)

    Wang, Yu

    2018-04-01

    To study the automatic diagnosis of large machinery fault based on support vector machine, combining the four common faults of the large machinery, the support vector machine is used to classify and identify the fault. The extracted feature vectors are entered. The feature vector is trained and identified by multi - classification method. The optimal parameters of the support vector machine are searched by trial and error method and cross validation method. Then, the support vector machine is compared with BP neural network. The results show that the support vector machines are short in time and high in classification accuracy. It is more suitable for the research of fault diagnosis in large machinery. Therefore, it can be concluded that the training speed of support vector machines (SVM) is fast and the performance is good.

  13. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

    PubMed

    Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

    2013-01-01

    The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.

  14. Community detection using Kernel Spectral Clustering with memory

    NASA Astrophysics Data System (ADS)

    Langone, Rocco; Suykens, Johan A. K.

    2013-02-01

    This work is related to the problem of community detection in dynamic scenarios, which for instance arises in the segmentation of moving objects, clustering of telephone traffic data, time-series micro-array data etc. A desirable feature of a clustering model which has to capture the evolution of communities over time is the temporal smoothness between clusters in successive time-steps. In this way the model is able to track the long-term trend and in the same time it smooths out short-term variation due to noise. We use the Kernel Spectral Clustering with Memory effect (MKSC) which allows to predict cluster memberships of new nodes via out-of-sample extension and has a proper model selection scheme. It is based on a constrained optimization formulation typical of Least Squares Support Vector Machines (LS-SVM), where the objective function is designed to explicitly incorporate temporal smoothness as a valid prior knowledge. The latter, in fact, allows the model to cluster the current data well and to be consistent with the recent history. Here we propose a generalization of the MKSC model with an arbitrary memory, not only one time-step in the past. The experiments conducted on toy problems confirm our expectations: the more memory we add to the model, the smoother over time are the clustering results. We also compare with the Evolutionary Spectral Clustering (ESC) algorithm which is a state-of-the art method, and we obtain comparable or better results.

  15. Currency crisis indication by using ensembles of support vector machine classifiers

    NASA Astrophysics Data System (ADS)

    Ramli, Nor Azuana; Ismail, Mohd Tahir; Wooi, Hooy Chee

    2014-07-01

    There are many methods that had been experimented in the analysis of currency crisis. However, not all methods could provide accurate indications. This paper introduces an ensemble of classifiers by using Support Vector Machine that's never been applied in analyses involving currency crisis before with the aim of increasing the indication accuracy. The proposed ensemble classifiers' performances are measured using percentage of accuracy, root mean squared error (RMSE), area under the Receiver Operating Characteristics (ROC) curve and Type II error. The performances of an ensemble of Support Vector Machine classifiers are compared with the single Support Vector Machine classifier and both of classifiers are tested on the data set from 27 countries with 12 macroeconomic indicators for each country. From our analyses, the results show that the ensemble of Support Vector Machine classifiers outperforms single Support Vector Machine classifier on the problem involving indicating a currency crisis in terms of a range of standard measures for comparing the performance of classifiers.

  16. Methods, systems and apparatus for controlling third harmonic voltage when operating a multi-space machine in an overmodulation region

    DOEpatents

    Perisic, Milun; Kinoshita, Michael H; Ranson, Ray M; Gallegos-Lopez, Gabriel

    2014-06-03

    Methods, system and apparatus are provided for controlling third harmonic voltages when operating a multi-phase machine in an overmodulation region. The multi-phase machine can be, for example, a five-phase machine in a vector controlled motor drive system that includes a five-phase PWM controlled inverter module that drives the five-phase machine. Techniques for overmodulating a reference voltage vector are provided. For example, when the reference voltage vector is determined to be within the overmodulation region, an angle of the reference voltage vector can be modified to generate a reference voltage overmodulation control angle, and a magnitude of the reference voltage vector can be modified, based on the reference voltage overmodulation control angle, to generate a modified magnitude of the reference voltage vector. By modifying the reference voltage vector, voltage command signals that control a five-phase inverter module can be optimized to increase output voltages generated by the five-phase inverter module.

  17. Signal detection using support vector machines in the presence of ultrasonic speckle

    NASA Astrophysics Data System (ADS)

    Kotropoulos, Constantine L.; Pitas, Ioannis

    2002-04-01

    Support Vector Machines are a general algorithm based on guaranteed risk bounds of statistical learning theory. They have found numerous applications, such as in classification of brain PET images, optical character recognition, object detection, face verification, text categorization and so on. In this paper we propose the use of support vector machines to segment lesions in ultrasound images and we assess thoroughly their lesion detection ability. We demonstrate that trained support vector machines with a Radial Basis Function kernel segment satisfactorily (unseen) ultrasound B-mode images as well as clinical ultrasonic images.

  18. Support Vector Machines Model of Computed Tomography for Assessing Lymph Node Metastasis in Esophageal Cancer with Neoadjuvant Chemotherapy.

    PubMed

    Wang, Zhi-Long; Zhou, Zhi-Guo; Chen, Ying; Li, Xiao-Ting; Sun, Ying-Shi

    The aim of this study was to diagnose lymph node metastasis of esophageal cancer by support vector machines model based on computed tomography. A total of 131 esophageal cancer patients with preoperative chemotherapy and radical surgery were included. Various indicators (tumor thickness, tumor length, tumor CT value, total number of lymph nodes, and long axis and short axis sizes of largest lymph node) on CT images before and after neoadjuvant chemotherapy were recorded. A support vector machines model based on these CT indicators was built to predict lymph node metastasis. Support vector machines model diagnosed lymph node metastasis better than preoperative short axis size of largest lymph node on CT. The area under the receiver operating characteristic curves were 0.887 and 0.705, respectively. The support vector machine model of CT images can help diagnose lymph node metastasis in esophageal cancer with preoperative chemotherapy.

  19. A new method for the prediction of chatter stability lobes based on dynamic cutting force simulation model and support vector machine

    NASA Astrophysics Data System (ADS)

    Peng, Chong; Wang, Lun; Liao, T. Warren

    2015-10-01

    Currently, chatter has become the critical factor in hindering machining quality and productivity in machining processes. To avoid cutting chatter, a new method based on dynamic cutting force simulation model and support vector machine (SVM) is presented for the prediction of chatter stability lobes. The cutting force is selected as the monitoring signal, and the wavelet energy entropy theory is used to extract the feature vectors. A support vector machine is constructed using the MATLAB LIBSVM toolbox for pattern classification based on the feature vectors derived from the experimental cutting data. Then combining with the dynamic cutting force simulation model, the stability lobes diagram (SLD) can be estimated. Finally, the predicted results are compared with existing methods such as zero-order analytical (ZOA) and semi-discretization (SD) method as well as actual cutting experimental results to confirm the validity of this new method.

  20. [Different wavelengths selection methods for identification of early blight on tomato leaves by using hyperspectral imaging technique].

    PubMed

    Cheng, Shu-Xi; Xie, Chuan-Qi; Wang, Qiao-Nan; He, Yong; Shao, Yong-Ni

    2014-05-01

    Identification of early blight on tomato leaves by using hyperspectral imaging technique based on different effective wavelengths selection methods (successive projections algorithm, SPA; x-loading weights, x-LW; gram-schmidt orthogonaliza-tion, GSO) was studied in the present paper. Hyperspectral images of seventy healthy and seventy infected tomato leaves were obtained by hyperspectral imaging system across the wavelength range of 380-1023 nm. Reflectance of all pixels in region of interest (ROI) was extracted by ENVI 4. 7 software. Least squares-support vector machine (LS-SVM) model was established based on the full spectral wavelengths. It obtained an excellent result with the highest identification accuracy (100%) in both calibration and prediction sets. Then, EW-LS-SVM and EW-LDA models were established based on the selected wavelengths suggested by SPA, x-LW and GSO, respectively. The results showed that all of the EW-LS-SVM and EW-LDA models performed well with the identification accuracy of 100% in EW-LS-SVM model and 100%, 100% and 97. 83% in EW-LDA model, respectively. Moreover, the number of input wavelengths of SPA-LS-SVM, x-LW-LS-SVM and GSO-LS-SVM models were four (492, 550, 633 and 680 nm), three (631, 719 and 747 nm) and two (533 and 657 nm), respectively. Fewer input variables were beneficial for the development of identification instrument. It demonstrated that it is feasible to identify early blight on tomato leaves by using hyperspectral imaging, and SPA, x-LW and GSO were effective wavelengths selection methods.

  1. Rapid determination of biogenic amines in cooked beef using hyperspectral imaging with sparse representation algorithm

    NASA Astrophysics Data System (ADS)

    Yang, Dong; Lu, Anxiang; Ren, Dong; Wang, Jihua

    2017-11-01

    This study explored the feasibility of rapid detection of biogenic amines (BAs) in cooked beef during the storage process using hyperspectral imaging technique combined with sparse representation (SR) algorithm. The hyperspectral images of samples were collected in the two spectral ranges of 400-1000 nm and 1000-1800 nm, separately. The spectral data were reduced dimensionality by SR and principal component analysis (PCA) algorithms, and then integrated the least square support vector machine (LS-SVM) to build the SR-LS-SVM and PC-LS-SVM models for the prediction of BAs values in cooked beef. The results showed that the SR-LS-SVM model exhibited the best predictive ability with determination coefficients (RP2) of 0.943 and root mean square errors (RMSEP) of 1.206 in the range of 400-1000 nm of prediction set. The SR and PCA algorithms were further combined to establish the best SR-PC-LS-SVM model for BAs prediction, which had high RP2of 0.969 and low RMSEP of 1.039 in the region of 400-1000 nm. The visual map of the BAs was generated using the best SR-PC-LS-SVM model with imaging process algorithms, which could be used to observe the changes of BAs in cooked beef more intuitively. The study demonstrated that hyperspectral imaging technique combined with sparse representation were able to detect effectively the BAs values in cooked beef during storage and the built SR-PC-LS-SVM model had a potential for rapid and accurate determination of freshness indexes in other meat and meat products.

  2. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  3. A comparative study of surface EMG classification by fuzzy relevance vector machine and fuzzy support vector machine.

    PubMed

    Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei

    2015-02-01

    We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and room mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst training time of FSVM much faster than FRVM. The results indicate that FRVM classifier trained using sufficient samples can achieve comparable generalization capability as FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.

  4. Testing of the Support Vector Machine for Binary-Class Classification

    NASA Technical Reports Server (NTRS)

    Scholten, Matthew

    2011-01-01

    The Support Vector Machine is a powerful algorithm, useful in classifying data in to species. The Support Vector Machines implemented in this research were used as classifiers for the final stage in a Multistage Autonomous Target Recognition system. A single kernel SVM known as SVMlight, and a modified version known as a Support Vector Machine with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SMV as a method for classification. From trial to trial, SVM produces consistent results

  5. Bayesian Kernel Methods for Non-Gaussian Distributions: Binary and Multi-class Classification Problems

    DTIC Science & Technology

    2013-05-28

    those of the support vector machine and relevance vector machine, and the model runs more quickly than the other algorithms . When one class occurs...incremental support vector machine algorithm for online learning when fewer than 50 data points are available. (a) Papers published in peer-reviewed journals...learning environments, where data processing occurs one observation at a time and the classification algorithm improves over time with new

  6. A combined LS-SVM & MLR QSAR workflow for predicting the inhibition of CXCR3 receptor by quinazolinone analogs.

    PubMed

    Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Igglessi-Markopoulou, Olga; Kollias, George

    2010-05-01

    A novel QSAR workflow is constructed that combines MLR with LS-SVM classification techniques for the identification of quinazolinone analogs as "active" or "non-active" CXCR3 antagonists. The accuracy of the LS-SVM classification technique for the training set and test was 100% and 90%, respectively. For the "active" analogs a validated MLR QSAR model estimates accurately their I-IP10 IC(50) inhibition values. The accuracy of the QSAR model (R (2) = 0.80) is illustrated using various evaluation techniques, such as leave-one-out procedure (R(LOO2)) = 0.67) and validation through an external test set (R(pred2) = 0.78). The key conclusion of this study is that the selected molecular descriptors, Highest Occupied Molecular Orbital energy (HOMO), Principal Moment of Inertia along X and Y axes PMIX and PMIZ, Polar Surface Area (PSA), Presence of triple bond (PTrplBnd), and Kier shape descriptor ((1) kappa), demonstrate discriminatory and pharmacophore abilities.

  7. [Support vector machine?assisted diagnosis of human malignant gastric tissues based on dielectric properties].

    PubMed

    Zhang, Sa; Li, Zhou; Xin, Xue-Gang

    2017-12-20

    To achieve differential diagnosis of normal and malignant gastric tissues based on discrepancies in their dielectric properties using support vector machine. The dielectric properties of normal and malignant gastric tissues at the frequency ranging from 42.58 to 500 MHz were measured by coaxial probe method, and the Cole?Cole model was used to fit the measured data. Receiver?operating characteristic (ROC) curve analysis was used to evaluate the discrimination capability with respect to permittivity, conductivity, and Cole?Cole fitting parameters. Support vector machine was used for discriminating normal and malignant gastric tissues, and the discrimination accuracy was calculated using k?fold cross? The area under the ROC curve was above 0.8 for permittivity at the 5 frequencies at the lower end of the measured frequency range. The combination of the support vector machine with the permittivity at all these 5 frequencies combined achieved the highest discrimination accuracy of 84.38% with a MATLAB runtime of 3.40 s. The support vector machine?assisted diagnosis is feasible for human malignant gastric tissues based on the dielectric properties.

  8. Research on intrusion detection based on Kohonen network and support vector machine

    NASA Astrophysics Data System (ADS)

    Shuai, Chunyan; Yang, Hengcheng; Gong, Zeweiyi

    2018-05-01

    In view of the problem of low detection accuracy and the long detection time of support vector machine, which directly applied to the network intrusion detection system. Optimization of SVM parameters can greatly improve the detection accuracy, but it can not be applied to high-speed network because of the long detection time. a method based on Kohonen neural network feature selection is proposed to reduce the optimization time of support vector machine parameters. Firstly, this paper is to calculate the weights of the KDD99 network intrusion data by Kohonen network and select feature by weight. Then, after the feature selection is completed, genetic algorithm (GA) and grid search method are used for parameter optimization to find the appropriate parameters and classify them by support vector machines. By comparing experiments, it is concluded that feature selection can reduce the time of parameter optimization, which has little influence on the accuracy of classification. The experiments suggest that the support vector machine can be used in the network intrusion detection system and reduce the missing rate.

  9. A Power Transformers Fault Diagnosis Model Based on Three DGA Ratios and PSO Optimization SVM

    NASA Astrophysics Data System (ADS)

    Ma, Hongzhe; Zhang, Wei; Wu, Rongrong; Yang, Chunyan

    2018-03-01

    In order to make up for the shortcomings of existing transformer fault diagnosis methods in dissolved gas-in-oil analysis (DGA) feature selection and parameter optimization, a transformer fault diagnosis model based on the three DGA ratios and particle swarm optimization (PSO) optimize support vector machine (SVM) is proposed. Using transforming support vector machine to the nonlinear and multi-classification SVM, establishing the particle swarm optimization to optimize the SVM multi classification model, and conducting transformer fault diagnosis combined with the cross validation principle. The fault diagnosis results show that the average accuracy of test method is better than the standard support vector machine and genetic algorithm support vector machine, and the proposed method can effectively improve the accuracy of transformer fault diagnosis is proved.

  10. No-reference quality assessment based on visual perception

    NASA Astrophysics Data System (ADS)

    Li, Junshan; Yang, Yawei; Hu, Shuangyan; Zhang, Jiao

    2014-11-01

    The visual quality assessment of images/videos is an ongoing hot research topic, which has become more and more important for numerous image and video processing applications with the rapid development of digital imaging and communication technologies. The goal of image quality assessment (IQA) algorithms is to automatically assess the quality of images/videos in agreement with human quality judgments. Up to now, two kinds of models have been used for IQA, namely full-reference (FR) and no-reference (NR) models. For FR models, IQA algorithms interpret image quality as fidelity or similarity with a perfect image in some perceptual space. However, the reference image is not available in many practical applications, and a NR IQA approach is desired. Considering natural vision as optimized by the millions of years of evolutionary pressure, many methods attempt to achieve consistency in quality prediction by modeling salient physiological and psychological features of the human visual system (HVS). To reach this goal, researchers try to simulate HVS with image sparsity coding and supervised machine learning, which are two main features of HVS. A typical HVS captures the scenes by sparsity coding, and uses experienced knowledge to apperceive objects. In this paper, we propose a novel IQA approach based on visual perception. Firstly, a standard model of HVS is studied and analyzed, and the sparse representation of image is accomplished with the model; and then, the mapping correlation between sparse codes and subjective quality scores is trained with the regression technique of least squaresupport vector machine (LS-SVM), which gains the regressor that can predict the image quality; the visual metric of image is predicted with the trained regressor at last. We validate the performance of proposed approach on Laboratory for Image and Video Engineering (LIVE) database, the specific contents of the type of distortions present in the database are: 227 images of JPEG2000, 233 images of JPEG, 174 images of White Noise, 174 images of Gaussian Blur, 174 images of Fast Fading. The database includes subjective differential mean opinion score (DMOS) for each image. The experimental results show that the proposed approach not only can assess many kinds of distorted images quality, but also exhibits a superior accuracy and monotonicity.

  11. Modeling diurnal land temperature cycles over Los Angeles using downscaled GOES imagery

    NASA Astrophysics Data System (ADS)

    Weng, Qihao; Fu, Peng

    2014-11-01

    Land surface temperature is a key parameter for monitoring urban heat islands, assessing heat related risks, and estimating building energy consumption. These environmental issues are characterized by high temporal variability. A possible solution from the remote sensing perspective is to utilize geostationary satellites images, for instance, images from Geostationary Operational Environmental System (GOES) and Meteosat Second Generation (MSG). These satellite systems, however, with coarse spatial but high temporal resolution (sub-hourly imagery at 3-10 km resolution), often limit their usage to meteorological forecasting and global climate modeling. Therefore, how to develop efficient and effective methods to disaggregate these coarse resolution images to a proper scale suitable for regional and local studies need be explored. In this study, we propose a least square support vector machine (LSSVM) method to achieve the goal of downscaling of GOES image data to half-hourly 1-km LSTs by fusing it with MODIS data products and Shuttle Radar Topography Mission (SRTM) digital elevation data. The result of downscaling suggests that the proposed method successfully disaggregated GOES images to half-hourly 1-km LSTs with accuracy of approximately 2.5 K when validated against with MODIS LSTs at the same over-passing time. The synthetic LST datasets were further explored for monitoring of surface urban heat island (UHI) in the Los Angeles region by extracting key diurnal temperature cycle (DTC) parameters. It is found that the datasets and DTC derived parameters were more suitable for monitoring of daytime- other than nighttime-UHI. With the downscaled GOES 1-km LSTs, the diurnal temperature variations can well be characterized. An accuracy of about 2.5 K was achieved in terms of the fitted results at both 1 km and 5 km resolutions.

  12. [Determination of soluble solids content in Nanfeng Mandarin by Vis/NIR spectroscopy and UVE-ICA-LS-SVM].

    PubMed

    Sun, Tong; Xu, Wen-Li; Hu, Tian; Liu, Mu-Hua

    2013-12-01

    The objective of the present research was to assess soluble solids content (SSC) of Nanfeng mandarin by visible/near infrared (Vis/NIR) spectroscopy combined with new variable selection method, simplify prediction model and improve the performance of prediction model for SSC of Nanfeng mandarin. A total of 300 Nanfeng mandarin samples were used, the numbers of Nanfeng mandarin samples in calibration, validation and prediction sets were 150, 75 and 75, respectively. Vis/NIR spectra of Nanfeng mandarin samples were acquired by a QualitySpec spectrometer in the wavelength range of 350-1000 nm. Uninformative variables elimination (UVE) was used to eliminate wavelength variables that had few information of SSC, then independent component analysis (ICA) was used to extract independent components (ICs) from spectra that eliminated uninformative wavelength variables. At last, least squares support vector machine (LS-SVM) was used to develop calibration models for SSC of Nanfeng mandarin using extracted ICs, and 75 prediction samples that had not been used for model development were used to evaluate the performance of SSC model of Nanfeng mandarin. The results indicate t hat Vis/NIR spectroscopy combinedwith UVE-ICA-LS-SVM is suitable for assessing SSC o f Nanfeng mandarin, and t he precision o f prediction ishigh. UVE--ICA is an effective method to eliminate uninformative wavelength variables, extract important spectral information, simplify prediction model and improve the performance of prediction model. The SSC model developed by UVE-ICA-LS-SVM is superior to that developed by PLS, PCA-LS-SVM or ICA-LS-SVM, and the coefficient of determination and root mean square error in calibration, validation and prediction sets were 0.978, 0.230%, 0.965, 0.301% and 0.967, 0.292%, respectively.

  13. Methods, systems and apparatus for optimization of third harmonic current injection in a multi-phase machine

    DOEpatents

    Gallegos-Lopez, Gabriel

    2012-10-02

    Methods, system and apparatus are provided for increasing voltage utilization in a five-phase vector controlled machine drive system that employs third harmonic current injection to increase torque and power output by a five-phase machine. To do so, a fundamental current angle of a fundamental current vector is optimized for each particular torque-speed of operating point of the five-phase machine.

  14. Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder.

    PubMed

    Mwangi, Benson; Ebmeier, Klaus P; Matthews, Keith; Steele, J Douglas

    2012-05-01

    Quantitative abnormalities of brain structure in patients with major depressive disorder have been reported at a group level for decades. However, these structural differences appear subtle in comparison with conventional radiologically defined abnormalities, with considerable inter-subject variability. Consequently, it has not been possible to readily identify scans from patients with major depressive disorder at an individual level. Recently, machine learning techniques such as relevance vector machines and support vector machines have been applied to predictive classification of individual scans with variable success. Here we describe a novel hybrid method, which combines machine learning with feature selection and characterization, with the latter aimed at maximizing the accuracy of machine learning prediction. The method was tested using a multi-centre dataset of T(1)-weighted 'structural' scans. A total of 62 patients with major depressive disorder and matched controls were recruited from referred secondary care clinical populations in Aberdeen and Edinburgh, UK. The generalization ability and predictive accuracy of the classifiers was tested using data left out of the training process. High prediction accuracy was achieved (~90%). While feature selection was important for maximizing high predictive accuracy with machine learning, feature characterization contributed only a modest improvement to relevance vector machine-based prediction (~5%). Notably, while the only information provided for training the classifiers was T(1)-weighted scans plus a categorical label (major depressive disorder versus controls), both relevance vector machine and support vector machine 'weighting factors' (used for making predictions) correlated strongly with subjective ratings of illness severity. These results indicate that machine learning techniques have the potential to inform clinical practice and research, as they can make accurate predictions about brain scan data from individual subjects. Furthermore, machine learning weighting factors may reflect an objective biomarker of major depressive disorder illness severity, based on abnormalities of brain structure.

  15. Three-dimensional tool radius compensation for multi-axis peripheral milling

    NASA Astrophysics Data System (ADS)

    Chen, Youdong; Wang, Tianmiao

    2013-05-01

    Few function about 3D tool radius compensation is applied to generating executable motion control commands in the existing computer numerical control (CNC) systems. Once the tool radius is changed, especially in the case of tool size changing with tool wear in machining, a new NC program has to be recreated. A generic 3D tool radius compensation method for multi-axis peripheral milling in CNC systems is presented. The offset path is calculated by offsetting the tool path along the direction of the offset vector with a given distance. The offset vector is perpendicular to both the tangent vector of the tool path and the orientation vector of the tool axis relative to the workpiece. The orientation vector equations of the tool axis relative to the workpiece are obtained through homogeneous coordinate transformation matrix and forward kinematics of generalized kinematics model of multi-axis machine tools. To avoid cutting into the corner formed by the two adjacent tool paths, the coordinates of offset path at the intersection point have been calculated according to the transition type that is determined by the angle between the two tool path tangent vectors at the corner. Through the verification by the solid cutting simulation software VERICUT® with different tool radiuses on a table-tilting type five-axis machine tool, and by the real machining experiment of machining a soup spoon on a five-axis machine tool with the developed CNC system, the effectiveness of the proposed 3D tool radius compensation method is confirmed. The proposed compensation method can be suitable for all kinds of three- to five-axis machine tools as a general form.

  16. A Wavelet Support Vector Machine Combination Model for Singapore Tourist Arrival to Malaysia

    NASA Astrophysics Data System (ADS)

    Rafidah, A.; Shabri, Ani; Nurulhuda, A.; Suhaila, Y.

    2017-08-01

    In this study, wavelet support vector machine model (WSVM) is proposed and applied for monthly data Singapore tourist time series prediction. The WSVM model is combination between wavelet analysis and support vector machine (SVM). In this study, we have two parts, first part we compare between the kernel function and second part we compare between the developed models with single model, SVM. The result showed that kernel function linear better than RBF while WSVM outperform with single model SVM to forecast monthly Singapore tourist arrival to Malaysia.

  17. Quantum Support Vector Machine for Big Data Classification

    NASA Astrophysics Data System (ADS)

    Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth

    2014-09-01

    Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.

  18. Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer

    NASA Technical Reports Server (NTRS)

    Hornfeck, William A.

    1988-01-01

    A considerable volume of large computational computer codes were developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A sofware package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.

  19. Predicting primary progressive aphasias with support vector machine approaches in structural MRI data.

    PubMed

    Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L

    2017-01-01

    Primary progressive aphasia (PPA) encompasses the three subtypes nonfluent/agrammatic variant PPA, semantic variant PPA, and the logopenic variant PPA, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole brain support vector machine classification enabled a very high accuracy between 91 and 97% for identifying specific PPA subtypes vs. healthy controls, and 78/95% for the discrimination between semantic variant vs. nonfluent/agrammatic or logopenic PPA variants. Only for the discrimination between nonfluent/agrammatic and logopenic PPA variants accuracy was low with 55%. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole brain approach took also into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease-specificity of the selected networks. Conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with a very high accuracy paving the road for its application in clinical settings.

  20. Support vector machines

    NASA Technical Reports Server (NTRS)

    Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri

    2004-01-01

    Support Vector Machines (SVMs) are a type of supervised learning algorith,, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labled by a 'supervisor' - typically a human 'expert.'.

  1. Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.

    PubMed

    Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang

    2010-05-07

    Lysine acetylation is an essentially reversible and high regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined acetylation lysine sites are extracted from Swiss-Prot database and scientific literatures. Experiment results show that an ensemble of support vector machine classifiers outperforms single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting acetylation lysine sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation sites prediction available at http://www.aporc.org/EnsemblePail/. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  2. Online Sequential Projection Vector Machine with Adaptive Data Mean Update

    PubMed Central

    Chen, Lin; Jia, Ji-Ting; Zhang, Qiong; Deng, Wan-Yu; Wei, Wei

    2016-01-01

    We propose a simple online learning algorithm especial for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM) which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1) the projection vectors for dimension reduction, (2) the input weights, biases, and output weights, and (3) the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy for use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD) approach, adaptive multihyperplane machine (AMM), primal estimated subgradient solver (Pegasos), online sequential extreme learning machine (OSELM), and SVD + OSELM (feature selection based on SVD is performed before OSELM). The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM. PMID:27143958

  3. Online Sequential Projection Vector Machine with Adaptive Data Mean Update.

    PubMed

    Chen, Lin; Jia, Ji-Ting; Zhang, Qiong; Deng, Wan-Yu; Wei, Wei

    2016-01-01

    We propose a simple online learning algorithm especial for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM) which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1) the projection vectors for dimension reduction, (2) the input weights, biases, and output weights, and (3) the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy for use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD) approach, adaptive multihyperplane machine (AMM), primal estimated subgradient solver (Pegasos), online sequential extreme learning machine (OSELM), and SVD + OSELM (feature selection based on SVD is performed before OSELM). The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM.

  4. [Application of infrared spectroscopy technique to protein content fast measurement in milk powder based on support vector machines].

    PubMed

    Wu, Di; Cao, Fang; Feng, Shui-Juan; He, Yong

    2008-05-01

    In the present study, the JASCO Model FTIR-4 000 fourier transform infrared spectrometer (Japan) was used, with a valid range of 7 800-350 cm(-1). Seven brands of milk powder were bought in a local supermarket. Milk powder was compressed into a uniform tablet with a diameter of 5 mm and a thickness of 2 mm, and then scanned by the spectrometer. Each sample was scanned 40 times and the data were averaged. About 60 samples were measured for each brand, and data for 409 samples were obtained. NIRS analysis was based on the range of 4 000 to 6 666 cm(-1), while MIRS analysis was between 400 and 4 000 cm(-1). The protein content was determined by kjeldahl method and the factor 6.38 was used to convert the nitrogen values to protein. The protein content value is the weight of protein per 100 g of milk powder. The NIR data of the milk powder exhibited slight differences. Univariate analysis was not really appropriate for analyzing the data sets. From NIRS region, it could be observed that the trend of different curves is similar. The one around 4 312 cm(-1) embodies the vibration of protein. From MIRS region, it could be determined that there are many differences between transmission value curves. Two troughs around 1 545 and 1 656 cm(-1) stand for the vibration of amide I and II bands of protein. The smoothing way of Savitzky-Golay with 3 segments and zero polynomials and multiplicative scatter correction (MSC) were applied for denoising. First 8 important principle components (PCs), which were obtained from principle component analysis (PCA), were the optimal input feature subset. Least-squares support vector machines was applied to build the protein prediction model based on infrared spectral transmission value. The prediction result was better than that of traditional PLS regression model as the determination coefficient for prediction (R(p)2) is 0.951 7 and root mean square error for prediction (RMSEP) is 0.520 201. These indicate that LS-SVM is a powerful tool for spectral analysis. Moreover, the study compared the prediction results based on near infrared spectral data and mid-infrared spectral data. The results showed that the performance of the model with mid-infrared spectral data was better than the one with near infrared spectra data. It was concluded that infrared spectroscopy technique can do the quantification of protein content in milk powder fast and non-destructively and the process was simple and easy to operate. The results of this study can be used for the design of a simple and non-destructive spectra sensor for the quantitative of protein content in milk powder.

  5. A hybrid approach to select features and classify diseases based on medical data

    NASA Astrophysics Data System (ADS)

    AbdelLatif, Hisham; Luo, Jiawei

    2018-03-01

    Feature selection is popular problem in the classification of diseases in clinical medicine. Here, we developing a hybrid methodology to classify diseases, based on three medical datasets, Arrhythmia, Breast cancer, and Hepatitis datasets. This methodology called k-means ANOVA Support Vector Machine (K-ANOVA-SVM) uses K-means cluster with ANOVA statistical to preprocessing data and selection the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we choice three classification algorithms, decision tree Naïve Bayes, Support Vector Machines and applied the medical datasets direct to these algorithms. Our methodology was a much better classification accuracy is given of 98% in Arrhythmia datasets, 92% in Breast cancer datasets and 88% in Hepatitis datasets, Compare to use the medical data directly with decision tree Naïve Bayes, and Support Vector Machines. Also, the ROC curve and precision with (K-ANOVA-SVM) Achieved best results than other algorithms

  6. Interpreting linear support vector machine models with heat map molecule coloring

    PubMed Central

    2011-01-01

    Background Model-based virtual screening plays an important role in the early drug discovery stage. The outcomes of high-throughput screenings are a valuable source for machine learning algorithms to infer such models. Besides a strong performance, the interpretability of a machine learning model is a desired property to guide the optimization of a compound in later drug discovery stages. Linear support vector machines showed to have a convincing performance on large-scale data sets. The goal of this study is to present a heat map molecule coloring technique to interpret linear support vector machine models. Based on the weights of a linear model, the visualization approach colors each atom and bond of a compound according to its importance for activity. Results We evaluated our approach on a toxicity data set, a chromosome aberration data set, and the maximum unbiased validation data sets. The experiments show that our method sensibly visualizes structure-property and structure-activity relationships of a linear support vector machine model. The coloring of ligands in the binding pocket of several crystal structures of a maximum unbiased validation data set target indicates that our approach assists to determine the correct ligand orientation in the binding pocket. Additionally, the heat map coloring enables the identification of substructures important for the binding of an inhibitor. Conclusions In combination with heat map coloring, linear support vector machine models can help to guide the modification of a compound in later stages of drug discovery. Particularly substructures identified as important by our method might be a starting point for optimization of a lead compound. The heat map coloring should be considered as complementary to structure based modeling approaches. As such, it helps to get a better understanding of the binding mode of an inhibitor. PMID:21439031

  7. Identifying saltcedar with hyperspectral data and support vector machines

    USDA-ARS?s Scientific Manuscript database

    Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...

  8. Automatic event detection in low SNR microseismic signals based on multi-scale permutation entropy and a support vector machine

    NASA Astrophysics Data System (ADS)

    Jia, Rui-Sheng; Sun, Hong-Mei; Peng, Yan-Jun; Liang, Yong-Quan; Lu, Xin-Ming

    2017-07-01

    Microseismic monitoring is an effective means for providing early warning of rock or coal dynamical disasters, and its first step is microseismic event detection, although low SNR microseismic signals often cannot effectively be detected by routine methods. To solve this problem, this paper presents permutation entropy and a support vector machine to detect low SNR microseismic events. First, an extraction method of signal features based on multi-scale permutation entropy is proposed by studying the influence of the scale factor on the signal permutation entropy. Second, the detection model of low SNR microseismic events based on the least squares support vector machine is built by performing a multi-scale permutation entropy calculation for the collected vibration signals, constructing a feature vector set of signals. Finally, a comparative analysis of the microseismic events and noise signals in the experiment proves that the different characteristics of the two can be fully expressed by using multi-scale permutation entropy. The detection model of microseismic events combined with the support vector machine, which has the features of high classification accuracy and fast real-time algorithms, can meet the requirements of online, real-time extractions of microseismic events.

  9. Spray characterization of ULV sprayers typically used in vector control

    USDA-ARS?s Scientific Manuscript database

    Numerous spray machines are used to apply products for the control of human disease vectors, such as mosquitoes and flies. However, the selection and setup of these machines significantly affect the level of control achieved during an application. The droplet spectra produced by nine different ULV...

  10. Applying spectral unmixing and support vector machine to airborne hyperspectral imagery for detecting giant reed

    USDA-ARS?s Scientific Manuscript database

    This study evaluated linear spectral unmixing (LSU), mixture tuned matched filtering (MTMF) and support vector machine (SVM) techniques for detecting and mapping giant reed (Arundo donax L.), an invasive weed that presents a severe threat to agroecosystems and riparian areas throughout the southern ...

  11. Support vector machines classifiers of physical activities in preschoolers

    USDA-ARS?s Scientific Manuscript database

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  12. Fabric wrinkle characterization and classification using modified wavelet coefficients and optimized support-vector-machine classifier

    USDA-ARS?s Scientific Manuscript database

    This paper presents a novel wrinkle evaluation method that uses modified wavelet coefficients and an optimized support-vector-machine (SVM) classification scheme to characterize and classify wrinkle appearance of fabric. Fabric images were decomposed with the wavelet transform (WT), and five parame...

  13. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    EPA Science Inventory

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  14. SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bobra, M. G.; Couvidat, S., E-mail: couvidat@stanford.edu

    2015-01-10

    We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a databasemore » of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.« less

  15. Differentiation of Enhancing Glioma and Primary Central Nervous System Lymphoma by Texture-Based Machine Learning.

    PubMed

    Alcaide-Leon, P; Dufort, P; Geraldo, A F; Alshafai, L; Maralani, P J; Spears, J; Bharatha, A

    2017-06-01

    Accurate preoperative differentiation of primary central nervous system lymphoma and enhancing glioma is essential to avoid unnecessary neurosurgical resection in patients with primary central nervous system lymphoma. The purpose of the study was to evaluate the diagnostic performance of a machine-learning algorithm by using texture analysis of contrast-enhanced T1-weighted images for differentiation of primary central nervous system lymphoma and enhancing glioma. Seventy-one adult patients with enhancing gliomas and 35 adult patients with primary central nervous system lymphomas were included. The tumors were manually contoured on contrast-enhanced T1WI, and the resulting volumes of interest were mined for textural features and subjected to a support vector machine-based machine-learning protocol. Three readers classified the tumors independently on contrast-enhanced T1WI. Areas under the receiver operating characteristic curves were estimated for each reader and for the support vector machine classifier. A noninferiority test for diagnostic accuracy based on paired areas under the receiver operating characteristic curve was performed with a noninferiority margin of 0.15. The mean areas under the receiver operating characteristic curve were 0.877 (95% CI, 0.798-0.955) for the support vector machine classifier; 0.878 (95% CI, 0.807-0.949) for reader 1; 0.899 (95% CI, 0.833-0.966) for reader 2; and 0.845 (95% CI, 0.757-0.933) for reader 3. The mean area under the receiver operating characteristic curve of the support vector machine classifier was significantly noninferior to the mean area under the curve of reader 1 ( P = .021), reader 2 ( P = .035), and reader 3 ( P = .007). Support vector machine classification based on textural features of contrast-enhanced T1WI is noninferior to expert human evaluation in the differentiation of primary central nervous system lymphoma and enhancing glioma. © 2017 by American Journal of Neuroradiology.

  16. An implementation of support vector machine on sentiment classification of movie reviews

    NASA Astrophysics Data System (ADS)

    Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.

    2018-03-01

    With technological advances, all information about movie is available on the internet. If the information is processed properly, it will get the quality of the information. This research proposes to the classify sentiments on movie review documents. This research uses Support Vector Machine (SVM) method because it can classify high dimensional data in accordance with the data used in this research in the form of text. Support Vector Machine is a popular machine learning technique for text classification because it can classify by learning from a collection of documents that have been classified previously and can provide good result. Based on number of datasets, the 90-10 composition has the best result that is 85.6%. Based on SVM kernel, kernel linear with constant 1 has the best result that is 84.9%

  17. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    PubMed Central

    HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. PMID:29275361

  18. Detection of distorted frames in retinal video-sequences via machine learning

    NASA Astrophysics Data System (ADS)

    Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.

    2017-07-01

    This paper describes detection of distorted frames in retinal sequences based on set of global features extracted from each frame. The feature vector is consequently used in classification step, in which three types of classifiers are tested. The best classification accuracy 96% has been achieved with support vector machine approach.

  19. Support vector machine applied to predict the zoonotic potential of E. coli O157 cattle isolates

    USDA-ARS?s Scientific Manuscript database

    Methods based on sequence data analysis facilitate the tracking of disease outbreaks, allow relationships between strains to be reconstructed and virulence factors to be identified. However, these methods are used postfactum after an outbreak has happened. Here, we show that support vector machine a...

  20. IMPROVEMENT OF SMVGEAR II ON VECTOR AND SCALAR MACHINES THROUGH ABSOLUTE ERROR TOLERANCE CONTROL (R823186)

    EPA Science Inventory

    The computer speed of SMVGEAR II was improved markedly on scalar and vector machines with relatively little loss in accuracy. The improvement was due to a method of frequently recalculating the absolute error tolerance instead of keeping it constant for a given set of chemistry. ...

  1. Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences.

    PubMed

    An, Ji-Yong; Meng, Fan-Rong; You, Zhu-Hong; Fang, Yu-Hong; Zhao, Yu-Jun; Zhang, Ming

    2016-01-01

    We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM) model and Local Phase Quantization (LPQ) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.

  2. Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition

    DTIC Science & Technology

    2014-03-27

    and machine learning for a range of research including such topics as medical imaging [10] and handwriting recognition [11]. The type of feature...1989. [11] C. Bahlmann, B. Haasdonk, and H. Burkhardt, “Online handwriting recognition with support vector machines-a kernel approach,” in Eighth...International Workshop on Frontiers in Handwriting Recognition, pp. 49–54, IEEE, 2002. [12] C. Cortes and V. Vapnik, “Support-vector networks,” Machine

  3. A low cost implementation of multi-parameter patient monitor using intersection kernel support vector machine classifier

    NASA Astrophysics Data System (ADS)

    Mohan, Dhanya; Kumar, C. Santhosh

    2016-03-01

    Predicting the physiological condition (normal/abnormal) of a patient is highly desirable to enhance the quality of health care. Multi-parameter patient monitors (MPMs) using heart rate, arterial blood pressure, respiration rate and oxygen saturation (S pO2) as input parameters were developed to monitor the condition of patients, with minimum human resource utilization. The Support vector machine (SVM), an advanced machine learning approach popularly used for classification and regression is used for the realization of MPMs. For making MPMs cost effective, we experiment on the hardware implementation of the MPM using support vector machine classifier. The training of the system is done using the matlab environment and the detection of the alarm/noalarm condition is implemented in hardware. We used different kernels for SVM classification and note that the best performance was obtained using intersection kernel SVM (IKSVM). The intersection kernel support vector machine classifier MPM has outperformed the best known MPM using radial basis function kernel by an absoute improvement of 2.74% in accuracy, 1.86% in sensitivity and 3.01% in specificity. The hardware model was developed based on the improved performance system using Verilog Hardware Description Language and was implemented on Altera cyclone-II development board.

  4. Support Vector Machines: Relevance Feedback and Information Retrieval.

    ERIC Educational Resources Information Center

    Drucker, Harris; Shahrary, Behzad; Gibbon, David C.

    2002-01-01

    Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…

  5. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    PubMed

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  6. Dual linear structured support vector machine tracking method via scale correlation filter

    NASA Astrophysics Data System (ADS)

    Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen

    2018-01-01

    Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.

  7. Object recognition of ladar with support vector machine

    NASA Astrophysics Data System (ADS)

    Sun, Jian-Feng; Li, Qi; Wang, Qi

    2005-01-01

    Intensity, range and Doppler images can be obtained by using laser radar. Laser radar can detect much more object information than other detecting sensor, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as the sensor of object recognition. Traditional method of laser radar object recognition is extracting target features, which can be influenced by noise. In this paper, a laser radar recognition method-Support Vector Machine is introduced. Support Vector Machine (SVM) is a new hotspot of recognition research after neural network. It has well performance on digital written and face recognition. Two series experiments about SVM designed for preprocessing and non-preprocessing samples are performed by real laser radar images, and the experiments results are compared.

  8. nu-Anomica: A Fast Support Vector Based Novelty Detection Technique

    NASA Technical Reports Server (NTRS)

    Das, Santanu; Bhaduri, Kanishka; Oza, Nikunj C.; Srivastava, Ashok N.

    2009-01-01

    In this paper we propose nu-Anomica, a novel anomaly detection technique that can be trained on huge data sets with much reduced running time compared to the benchmark one-class Support Vector Machines algorithm. In -Anomica, the idea is to train the machine such that it can provide a close approximation to the exact decision plane using fewer training points and without losing much of the generalization performance of the classical approach. We have tested the proposed algorithm on a variety of continuous data sets under different conditions. We show that under all test conditions the developed procedure closely preserves the accuracy of standard one-class Support Vector Machines while reducing both the training time and the test time by 5 - 20 times.

  9. Power line identification of millimeter wave radar based on PCA-GS-SVM

    NASA Astrophysics Data System (ADS)

    Fang, Fang; Zhang, Guifeng; Cheng, Yansheng

    2017-12-01

    Aiming at the problem that the existing detection method can not effectively solve the security of UAV's ultra low altitude flight caused by power line, a power line recognition method based on grid search (GS) and the principal component analysis and support vector machine (PCA-SVM) is proposed. Firstly, the candidate line of Hough transform is reduced by PCA, and the main feature of candidate line is extracted. Then, upport vector machine (SVM is) optimized by grid search method (GS). Finally, using support vector machine classifier optimized parameters to classify the candidate line. MATLAB simulation results show that this method can effectively identify the power line and noise, and has high recognition accuracy and algorithm efficiency.

  10. Support vector machine for automatic pain recognition

    NASA Astrophysics Data System (ADS)

    Monwar, Md Maruf; Rezaei, Siamak

    2009-02-01

    Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.

  11. The maximum vector-angular margin classifier and its fast training on large datasets using a core vector machine.

    PubMed

    Hu, Wenjun; Chung, Fu-Lai; Wang, Shitong

    2012-03-01

    Although pattern classification has been extensively studied in the past decades, how to effectively solve the corresponding training on large datasets is a problem that still requires particular attention. Many kernelized classification methods, such as SVM and SVDD, can be formulated as the corresponding quadratic programming (QP) problems, but computing the associated kernel matrices requires O(n2)(or even up to O(n3)) computational complexity, where n is the size of the training patterns, which heavily limits the applicability of these methods for large datasets. In this paper, a new classification method called the maximum vector-angular margin classifier (MAMC) is first proposed based on the vector-angular margin to find an optimal vector c in the pattern feature space, and all the testing patterns can be classified in terms of the maximum vector-angular margin ρ, between the vector c and all the training data points. Accordingly, it is proved that the kernelized MAMC can be equivalently formulated as the kernelized Minimum Enclosing Ball (MEB), which leads to a distinctive merit of MAMC, i.e., it has the flexibility of controlling the sum of support vectors like v-SVC and may be extended to a maximum vector-angular margin core vector machine (MAMCVM) by connecting the core vector machine (CVM) method with MAMC such that the corresponding fast training on large datasets can be effectively achieved. Experimental results on artificial and real datasets are provided to validate the power of the proposed methods. Copyright © 2011 Elsevier Ltd. All rights reserved.

  12. Vowel Imagery Decoding toward Silent Speech BCI Using Extreme Learning Machine with Electroencephalogram

    PubMed Central

    Kim, Jongin; Park, Hyeong-jun

    2016-01-01

    The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discrimination analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128

  13. Entanglement-Based Machine Learning on a Quantum Computer

    NASA Astrophysics Data System (ADS)

    Cai, X.-D.; Wu, D.; Su, Z.-E.; Chen, M.-C.; Wang, X.-L.; Li, Li; Liu, N.-L.; Lu, C.-Y.; Pan, J.-W.

    2015-03-01

    Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv.1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.

  14. Support vector machine in machine condition monitoring and fault diagnosis

    NASA Astrophysics Data System (ADS)

    Widodo, Achmad; Yang, Bo-Suk

    2007-08-01

    Recently, the issue of machine condition monitoring and fault diagnosis as a part of maintenance system became global due to the potential advantages to be gained from reduced maintenance costs, improved productivity and increased machine availability. This paper presents a survey of machine condition monitoring and fault diagnosis using support vector machine (SVM). It attempts to summarize and review the recent research and developments of SVM in machine condition monitoring and diagnosis. Numerous methods have been developed based on intelligent systems such as artificial neural network, fuzzy expert system, condition-based reasoning, random forest, etc. However, the use of SVM for machine condition monitoring and fault diagnosis is still rare. SVM has excellent performance in generalization so it can produce high accuracy in classification for machine condition monitoring and diagnosis. Until 2006, the use of SVM in machine condition monitoring and fault diagnosis is tending to develop towards expertise orientation and problem-oriented domain. Finally, the ability to continually change and obtain a novel idea for machine condition monitoring and fault diagnosis using SVM will be future works.

  15. Exact analytical modeling of magnetic vector potential in surface inset permanent magnet DC machines considering magnet segmentation

    NASA Astrophysics Data System (ADS)

    Jabbari, Ali

    2018-01-01

    Surface inset permanent magnet DC machine can be used as an alternative in automation systems due to their high efficiency and robustness. Magnet segmentation is a common technique in order to mitigate pulsating torque components in permanent magnet machines. An accurate computation of air-gap magnetic field distribution is necessary in order to calculate machine performance. An exact analytical method for magnetic vector potential calculation in surface inset permanent magnet machines considering magnet segmentation has been proposed in this paper. The analytical method is based on the resolution of Laplace and Poisson equations as well as Maxwell equation in polar coordinate by using sub-domain method. One of the main contributions of the paper is to derive an expression for the magnetic vector potential in the segmented PM region by using hyperbolic functions. The developed method is applied on the performance computation of two prototype surface inset magnet segmented motors with open circuit and on load conditions. The results of these models are validated through FEM method.

  16. Estimation of Teacher Practices Based on Text Transcripts of Teacher Speech Using a Support Vector Machine Algorithm

    ERIC Educational Resources Information Center

    Araya, Roberto; Plana, Francisco; Dartnell, Pablo; Soto-Andrade, Jorge; Luci, Gina; Salinas, Elena; Araya, Marylen

    2012-01-01

    Teacher practice is normally assessed by observers who watch classes or videos of classes. Here, we analyse an alternative strategy that uses text transcripts and a support vector machine classifier. For each one of the 710 videos of mathematics classes from the 2005 Chilean National Teacher Assessment Programme, a single 4-minute slice was…

  17. Identification of quality markers of Yuanhu Zhitong tablets based on integrative pharmacology and data mining.

    PubMed

    Li, Ke; Li, Junfang; Su, Jin; Xiao, Xuefeng; Peng, Xiujuan; Liu, Feng; Li, Defeng; Zhang, Yi; Chong, Tao; Xu, Haiyu; Liu, Changxiao; Yang, Hongjun

    2018-03-07

    The quality evaluation of traditional Chinese medicine (TCM) formulations is needed to guarantee the safety and efficacy. In our laboratory, we established interaction rules between chemical quality control and biological activity evaluations to study Yuanhu Zhitong tablets (YZTs). Moreover, a quality marker (Q-marker) has recently been proposed as a new concept in the quality control of TCM. However, no appropriate methods are available for the identification of Q-markers from the complex TCM systems. We aimed to use an integrative pharmacological (IP) approach to further identify Q-markers from YZTs through the integration of multidisciplinary knowledge. In addition, data mining was used to determine the correlation between multiple constituents of this TCM and its bioactivity to improve quality control. The IP approach was used to identify the active constituents of YZTs and elucidate the molecular mechanisms by integrating chemical and biosynthetic analyses, drug metabolism, and network pharmacology. Data mining methods including grey relational analysis (GRA) and least squares support vector machine (LS-SVM) regression techniques, were used to establish the correlations among the constituents and efficacy, and dose efficacy in multiple dimensions. Seven constituents (tetrahydropalmatine, α-allocryptopine, protopine, corydaline, imperatorin, isoimperatorin, and byakangelicin) were identified as Q-markers of YZT using IP based on their high abundance, specific presence in the individual herbal constituents and the product, appropriate drug-like properties, and critical contribution to the bioactivity of the mixture of YZT constituents. Moreover, three Q-markers (protopine, α-allocryptopine, and corydaline) were highly correlated with the multiple bioactivities of the YZTs, as found using data mining. Finally, three constituents (tetrahydropalmatine, corydaline, and imperatorin) were chosen as minimum combinations that both distinguished the authentic components from false products and indicated the intensity of bioactivity to improve the quality control of YZTs. Tetrahydropalmatine, imperatorin, and corydaline could be used as minimum combinations to effectively control the quality of YZTs. Copyright © 2018. Published by Elsevier GmbH.

  18. A Comparative Investigation of the Combined Effects of Pre-Processing, Wavelength Selection, and Regression Methods on Near-Infrared Calibration Model Performance.

    PubMed

    Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N

    2017-07-01

    Near-infrared (NIR) spectroscopy is being widely used in various fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include available physical interpretation of spectral data, nondestructive nature and high speed of measurements, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others, selection of those wavelengths that contribute useful information, and identification of suitable calibration models using linear/nonlinear regression . Several methods have been developed for each of these three aspects and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies for the interactions among these three aspects, which can shed light on what role each aspect plays in the calibration and how to combine various methods of each aspect together to obtain the best calibration model. This paper aims to provide such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely, orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely, stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); four popular regression methods, namely, partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR). The comparative study indicates that, in general, pre-processing of spectral data can play a significant role in the calibration while wavelength selection plays a marginal role and the combination of certain pre-processing, wavelength selection, and nonlinear regression methods can achieve superior performance over traditional linear regression-based calibration.

  19. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    PubMed Central

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aim at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933

  20. Machine Learning Prediction of the Energy Gap of Graphene Nanoflakes Using Topological Autocorrelation Vectors.

    PubMed

    Fernandez, Michael; Abreu, Jose I; Shi, Hongqing; Barnard, Amanda S

    2016-11-14

    The possibility of band gap engineering in graphene opens countless new opportunities for application in nanoelectronics. In this work, the energy gaps of 622 computationally optimized graphene nanoflakes were mapped to topological autocorrelation vectors using machine learning techniques. Machine learning modeling revealed that the most relevant correlations appear at topological distances in the range of 1 to 42 with prediction accuracy higher than 80%. The data-driven model can statistically discriminate between graphene nanoflakes with different energy gaps on the basis of their molecular topology.

  1. Interpreting support vector machine models for multivariate group wise analysis in neuroimaging

    PubMed Central

    Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos

    2015-01-01

    Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913

  2. Harmonic reduction of Direct Torque Control of six-phase induction motor.

    PubMed

    Taheri, A

    2016-07-01

    In this paper, a new switching method in Direct Torque Control (DTC) of a six-phase induction machine for reduction of current harmonics is introduced. Selecting a suitable vector in each sampling period is an ordinal method in the ST-DTC drive of a six-phase induction machine. The six-phase induction machine has 64 voltage vectors and divided further into four groups. In the proposed DTC method, the suitable voltage vectors are selected from two vector groups. By a suitable selection of two vectors in each sampling period, the harmonic amplitude is decreased more, in and various comparison to that of the ST-DTC drive. The harmonics loss is greater reduced, while the electromechanical energy is decreased with switching loss showing a little increase. Spectrum analysis of the phase current in the standard and new switching table DTC of the six-phase induction machine and determination for the amplitude of each harmonics is proposed in this paper. The proposed method has a less sampling time in comparison to the ordinary method. The Harmonic analyses of the current in the low and high speed shows the performance of the presented method. The simplicity of the proposed method and its implementation without any extra hardware is other advantages of the proposed method. The simulation and experimental results show the preference of the proposed method. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  3. Support Vector Machine-Based Endmember Extraction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Filippi, Anthony M; Archibald, Richard K

    Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmembermore » Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.« less

  4. Solving the Cauchy-Riemann equations on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1987-01-01

    Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations; a set of coupled first order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, and SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.

  5. TWSVR: Regression via Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

    2016-02-01

    Taking motivation from Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR) where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression(SVR) on various regression datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. An assessment of support vector machines for land cover classification

    USGS Publications Warehouse

    Huang, C.; Davis, L.S.; Townshend, J.R.G.

    2002-01-01

    The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.

  7. Precise on-machine extraction of the surface normal vector using an eddy current sensor array

    NASA Astrophysics Data System (ADS)

    Wang, Yongqing; Lian, Meng; Liu, Haibo; Ying, Yangwei; Sheng, Xianjun

    2016-11-01

    To satisfy the requirements of on-machine measurement of the surface normal during complex surface manufacturing, a highly robust normal vector extraction method using an Eddy current (EC) displacement sensor array is developed, the output of which is almost unaffected by surface brightness, machining coolant and environmental noise. A precise normal vector extraction model based on a triangular-distributed EC sensor array is first established. Calibration of the effects of object surface inclination and coupling interference on measurement results, and the relative position of EC sensors, is involved. A novel apparatus employing three EC sensors and a force transducer was designed, which can be easily integrated into the computer numerical control (CNC) machine tool spindle and/or robot terminal execution. Finally, to test the validity and practicability of the proposed method, typical experiments were conducted with specified testing pieces using the developed approach and system, such as an inclined plane and cylindrical and spherical surfaces.

  8. A collaborative framework for Distributed Privacy-Preserving Support Vector Machine learning.

    PubMed

    Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates "privacy-insensitive" intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner.

  9. Stochastic subset selection for learning with kernel machines.

    PubMed

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.

  10. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  11. Daily sea level prediction at Chiayi coast, Taiwan using extreme learning machine and relevance vector machine

    NASA Astrophysics Data System (ADS)

    Imani, Moslem; Kao, Huan-Chin; Lan, Wen-Hau; Kuo, Chung-Yen

    2018-02-01

    The analysis and the prediction of sea level fluctuations are core requirements of marine meteorology and operational oceanography. Estimates of sea level with hours-to-days warning times are especially important for low-lying regions and coastal zone management. The primary purpose of this study is to examine the applicability and capability of extreme learning machine (ELM) and relevance vector machine (RVM) models for predicting sea level variations and compare their performances with powerful machine learning methods, namely, support vector machine (SVM) and radial basis function (RBF) models. The input dataset from the period of January 2004 to May 2011 used in the study was obtained from the Dongshi tide gauge station in Chiayi, Taiwan. Results showed that the ELM and RVM models outperformed the other methods. The performance of the RVM approach was superior in predicting the daily sea level time series given the minimum root mean square error of 34.73 mm and the maximum determination coefficient of 0.93 (R2) during the testing periods. Furthermore, the obtained results were in close agreement with the original tide-gauge data, which indicates that RVM approach is a promising alternative method for time series prediction and could be successfully used for daily sea level forecasts.

  12. Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach

    PubMed Central

    Kudisthalert, Wasu

    2018-01-01

    Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is Extreme Learning Machine (ELM) which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM) which is based on a single layer feed-forward neural network in a conjunction of 16 different similarity coefficients as activation function in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets–Maximum Unbiased Validation Dataset–which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with Sokal/Sneath(1) coefficient. Furthermore, ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6. PMID:29652912

  13. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data

    PubMed Central

    Hepworth, Philip J.; Nefedov, Alexey V.; Muchnik, Ilya B.; Morgan, Kenton L.

    2012-01-01

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide. PMID:22319115

  14. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    PubMed

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  15. Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables.

    PubMed

    Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco

    2018-03-01

    This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms ( k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene ( XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.

  16. Multiclass Reduced-Set Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Tang, Benyang; Mazzoni, Dominic

    2006-01-01

    There are well-established methods for reducing the number of support vectors in a trained binary support vector machine, often with minimal impact on accuracy. We show how reduced-set methods can be applied to multiclass SVMs made up of several binary SVMs, with significantly better results than reducing each binary SVM independently. Our approach is based on Burges' approach that constructs each reduced-set vector as the pre-image of a vector in kernel space, but we extend this by recomputing the SVM weights and bias optimally using the original SVM objective function. This leads to greater accuracy for a binary reduced-set SVM, and also allows vectors to be 'shared' between multiple binary SVMs for greater multiclass accuracy with fewer reduced-set vectors. We also propose computing pre-images using differential evolution, which we have found to be more robust than gradient descent alone. We show experimental results on a variety of problems and find that this new approach is consistently better than previous multiclass reduced-set methods, sometimes with a dramatic difference.

  17. Support vector machine for the diagnosis of malignant mesothelioma

    NASA Astrophysics Data System (ADS)

    Ushasukhanya, S.; Nithyakalyani, A.; Sivakumar, V.

    2018-04-01

    Harmful mesothelioma is an illness in which threatening (malignancy) cells shape in the covering of the trunk or stomach area. Being presented to asbestos can influence the danger of threatening mesothelioma. Signs and side effects of threatening mesothelioma incorporate shortness of breath and agony under the rib confine. Tests that inspect within the trunk and belly are utilized to recognize (find) and analyse harmful mesothelioma. Certain elements influence forecast (shot of recuperation) and treatment choices. In this review, Support vector machine (SVM) classifiers were utilized for Mesothelioma sickness conclusion. SVM output is contrasted by concentrating on Mesothelioma’s sickness and findings by utilizing similar information set. The support vector machine algorithm gives 92.5% precision acquired by means of 3-overlap cross-approval. The Mesothelioma illness dataset were taken from an organization reports from Turkey.

  18. Combined empirical mode decomposition and texture features for skin lesion classification using quadratic support vector machine.

    PubMed

    Wahba, Maram A; Ashour, Amira S; Napoleon, Sameh A; Abd Elnaby, Mustafa M; Guo, Yanhui

    2017-12-01

    Basal cell carcinoma is one of the most common malignant skin lesions. Automated lesion identification and classification using image processing techniques is highly required to reduce the diagnosis errors. In this study, a novel technique is applied to classify skin lesion images into two classes, namely the malignant Basal cell carcinoma and the benign nevus. A hybrid combination of bi-dimensional empirical mode decomposition and gray-level difference method features is proposed after hair removal. The combined features are further classified using quadratic support vector machine (Q-SVM). The proposed system has achieved outstanding performance of 100% accuracy, sensitivity and specificity compared to other support vector machine procedures as well as with different extracted features. Basal Cell Carcinoma is effectively classified using Q-SVM with the proposed combined features.

  19. The optional selection of micro-motion feature based on Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing

    2017-11-01

    Micro-motion form of target is multiple, different micro-motion forms are apt to be modulated, which makes it difficult for feature extraction and recognition. Aiming at feature extraction of cone-shaped objects with different micro-motion forms, this paper proposes the best selection method of micro-motion feature based on support vector machine. After the time-frequency distribution of radar echoes, comparing the time-frequency spectrum of objects with different micro-motion forms, features are extracted based on the differences between the instantaneous frequency variations of different micro-motions. According to the methods based on SVM (Support Vector Machine) features are extracted, then the best features are acquired. Finally, the result shows the method proposed in this paper is feasible under the test condition of certain signal-to-noise ratio(SNR).

  20. Analysis of spectrally resolved autofluorescence images by support vector machines

    NASA Astrophysics Data System (ADS)

    Mateasik, A.; Chorvat, D.; Chorvatova, A.

    2013-02-01

    Spectral analysis of the autofluorescence images of isolated cardiac cells was performed to evaluate and to classify the metabolic state of the cells in respect to the responses to metabolic modulators. The classification was done using machine learning approach based on support vector machine with the set of the automatically calculated features from recorded spectral profile of spectral autofluorescence images. This classification method was compared with the classical approach where the individual spectral components contributing to cell autofluorescence were estimated by spectral analysis, namely by blind source separation using non-negative matrix factorization. Comparison of both methods showed that machine learning can effectively classify the spectrally resolved autofluorescence images without the need of detailed knowledge about the sources of autofluorescence and their spectral properties.

  1. Three dimensional magnetic fields in extra high speed modified Lundell alternators computed by a combined vector-scalar magnetic potential finite element method

    NASA Technical Reports Server (NTRS)

    Demerdash, N. A.; Wang, R.; Secunde, R.

    1992-01-01

    A 3D finite element (FE) approach was developed and implemented for computation of global magnetic fields in a 14.3 kVA modified Lundell alternator. The essence of the new method is the combined use of magnetic vector and scalar potential formulations in 3D FEs. This approach makes it practical, using state of the art supercomputer resources, to globally analyze magnetic fields and operating performances of rotating machines which have truly 3D magnetic flux patterns. The 3D FE-computed fields and machine inductances as well as various machine performance simulations of the 14.3 kVA machine are presented in this paper and its two companion papers.

  2. Dropout Prediction in E-Learning Courses through the Combination of Machine Learning Techniques

    ERIC Educational Resources Information Center

    Lykourentzou, Ioanna; Giannoukos, Ioannis; Nikolopoulos, Vassilis; Mpardis, George; Loumos, Vassili

    2009-01-01

    In this paper, a dropout prediction method for e-learning courses, based on three popular machine learning techniques and detailed student data, is proposed. The machine learning techniques used are feed-forward neural networks, support vector machines and probabilistic ensemble simplified fuzzy ARTMAP. Since a single technique may fail to…

  3. Landslide susceptibility mapping & prediction using Support Vector Machine for Mandakini River Basin, Garhwal Himalaya, India

    NASA Astrophysics Data System (ADS)

    Kumar, Deepak; Thakur, Manoj; Dubey, Chandra S.; Shukla, Dericks P.

    2017-10-01

    In recent years, various machine learning techniques have been applied for landslide susceptibility mapping. In this study, three different variants of support vector machine viz., SVM, Proximal Support Vector Machine (PSVM) and L2-Support Vector Machine - Modified Finite Newton (L2-SVM-MFN) have been applied on the Mandakini River Basin in Uttarakhand, India to carry out the landslide susceptibility mapping. Eight thematic layers such as elevation, slope, aspect, drainages, geology/lithology, buffer of thrusts/faults, buffer of streams and soil along with the past landslide data were mapped in GIS environment and used for landslide susceptibility mapping in MATLAB. The study area covering 1625 km2 has merely 0.11% of area under landslides. There are 2009 pixels for past landslides out of which 50% (1000) landslides were considered as training set while remaining 50% as testing set. The performance of these techniques has been evaluated and the computational results show that L2-SVM-MFN obtains higher prediction values (0.829) of receiver operating characteristic curve (AUC-area under the curve) as compared to 0.807 for PSVM model and 0.79 for SVM. The results obtained from L2-SVM-MFN model are found to be superior than other SVM prediction models and suggest the usefulness of this technique to problem of landslide susceptibility mapping where training data is very less. However, these techniques can be used for satisfactory determination of susceptible zones with these inputs.

  4. Weighted K-means support vector machine for cancer prediction.

    PubMed

    Kim, SungHwan

    2016-01-01

    To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org).

  5. Predicting complications of percutaneous coronary intervention using a novel support vector method.

    PubMed

    Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

    2013-01-01

    To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer-Lemeshow χ(2) value (seven cases) and the mean cross-entropy error (eight cases). The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains.

  6. Predicting complications of percutaneous coronary intervention using a novel support vector method

    PubMed Central

    Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

    2013-01-01

    Objective To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Materials and methods Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. Results The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer–Lemeshow χ2 value (seven cases) and the mean cross-entropy error (eight cases). Conclusions The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains. PMID:23599229

  7. Support Vector Hazards Machine: A Counting Process Framework for Learning Risk Scores for Censored Outcomes.

    PubMed

    Wang, Yuanjia; Chen, Tianle; Zeng, Donglin

    2016-01-01

    Learning risk scores to predict dichotomous or continuous outcomes using machine learning approaches has been studied extensively. However, how to learn risk scores for time-to-event outcomes subject to right censoring has received little attention until recently. Existing approaches rely on inverse probability weighting or rank-based regression, which may be inefficient. In this paper, we develop a new support vector hazards machine (SVHM) approach to predict censored outcomes. Our method is based on predicting the counting process associated with the time-to-event outcomes among subjects at risk via a series of support vector machines. Introducing counting processes to represent time-to-event data leads to a connection between support vector machines in supervised learning and hazards regression in standard survival analysis. To account for different at risk populations at observed event times, a time-varying offset is used in estimating risk scores. The resulting optimization is a convex quadratic programming problem that can easily incorporate non-linearity using kernel trick. We demonstrate an interesting link from the profiled empirical risk function of SVHM to the Cox partial likelihood. We then formally show that SVHM is optimal in discriminating covariate-specific hazard function from population average hazard function, and establish the consistency and learning rate of the predicted risk using the estimated risk scores. Simulation studies show improved prediction accuracy of the event times using SVHM compared to existing machine learning methods and standard conventional approaches. Finally, we analyze two real world biomedical study data where we use clinical markers and neuroimaging biomarkers to predict age-at-onset of a disease, and demonstrate superiority of SVHM in distinguishing high risk versus low risk subjects.

  8. Product Quality Modelling Based on Incremental Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Wang, J.; Zhang, W.; Qin, B.; Shi, W.

    2012-05-01

    Incremental Support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, the traditional ISVM learning does not consider the quality of the incremental data which may contain noise and redundant data; it will affect the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of margin vectors, where the margin vectors are removed while their distance exceed the specified value; finally, the original SVs and remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate the unimportant samples such as noise samples, but also can preserve the important samples. The MISVM has been experimented on two public data and one field data of zinc coating weight in strip hot-dip galvanizing, and the results shows that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision supports and analysis tools for auto control of product quality, and also can extend to other process industries, such as chemical process and manufacturing process.

  9. Deep Restricted Kernel Machines Using Conjugate Feature Duality.

    PubMed

    Suykens, Johan A K

    2017-08-01

    The aim of this letter is to propose a theory of deep restricted kernel machines offering new foundations for deep learning with kernel machines. From the viewpoint of deep learning, it is partially related to restricted Boltzmann machines, which are characterized by visible and hidden units in a bipartite graph without hidden-to-hidden connections and deep learning extensions as deep belief networks and deep Boltzmann machines. From the viewpoint of kernel machines, it includes least squares support vector machines for classification and regression, kernel principal component analysis (PCA), matrix singular value decomposition, and Parzen-type models. A key element is to first characterize these kernel machines in terms of so-called conjugate feature duality, yielding a representation with visible and hidden units. It is shown how this is related to the energy form in restricted Boltzmann machines, with continuous variables in a nonprobabilistic setting. In this new framework of so-called restricted kernel machine (RKM) representations, the dual variables correspond to hidden features. Deep RKM are obtained by coupling the RKMs. The method is illustrated for deep RKM, consisting of three levels with a least squares support vector machine regression level and two kernel PCA levels. In its primal form also deep feedforward neural networks can be trained within this framework.

  10. Optimal quantum cloning based on the maximin principle by using a priori information

    NASA Astrophysics Data System (ADS)

    Kang, Peng; Dai, Hong-Yi; Wei, Jia-Hua; Zhang, Ming

    2016-10-01

    We propose an optimal 1 →2 quantum cloning method based on the maximin principle by making full use of a priori information of amplitude and phase about the general cloned qubit input set, which is a simply connected region enclosed by a "longitude-latitude grid" on the Bloch sphere. Theoretically, the fidelity of the optimal quantum cloning machine derived from this method is the largest in terms of the maximin principle compared with that of any other machine. The problem solving is an optimization process that involves six unknown complex variables, six vectors in an uncertain-dimensional complex vector space, and four equality constraints. Moreover, by restricting the structure of the quantum cloning machine, the optimization problem is simplified as a three-real-parameter suboptimization problem with only one equality constraint. We obtain the explicit formula for a suboptimal quantum cloning machine. Additionally, the fidelity of our suboptimal quantum cloning machine is higher than or at least equal to that of universal quantum cloning machines and phase-covariant quantum cloning machines. It is also underlined that the suboptimal cloning machine outperforms the "belt quantum cloning machine" for some cases.

  11. Novel method of finding extreme edges in a convex set of N-dimension vectors

    NASA Astrophysics Data System (ADS)

    Hu, Chia-Lun J.

    2001-11-01

    As we published in the last few years, for a binary neural network pattern recognition system to learn a given mapping {Um mapped to Vm, m=1 to M} where um is an N- dimension analog (pattern) vector, Vm is a P-bit binary (classification) vector, the if-and-only-if (IFF) condition that this network can learn this mapping is that each i-set in {Ymi, m=1 to M} (where Ymithere existsVmiUm and Vmi=+1 or -1, is the i-th bit of VR-m).)(i=1 to P and there are P sets included here.) Is POSITIVELY, LINEARLY, INDEPENDENT or PLI. We have shown that this PLI condition is MORE GENERAL than the convexity condition applied to a set of N-vectors. In the design of old learning machines, we know that if a set of N-dimension analog vectors form a convex set, and if the machine can learn the boundary vectors (or extreme edges) of this set, then it can definitely learn the inside vectors contained in this POLYHEDRON CONE. This paper reports a new method and new algorithm to find the boundary vectors of a convex set of ND analog vectors.

  12. A Collaborative Framework for Distributed Privacy-Preserving Support Vector Machine Learning

    PubMed Central

    Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates “privacy-insensitive” intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner. PMID:23304414

  13. Prototype Vector Machine for Large Scale Semi-Supervised Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Kai; Kwok, James T.; Parvin, Bahram

    2009-04-29

    Practicaldataminingrarelyfalls exactlyinto the supervisedlearning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computationalintensivenessofgraph-based SSLarises largely from the manifold or graph regularization, which in turn lead to large models that are dificult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highlyscalable,graph-based algorithm for large-scale SSL. Our key innovation is the use of"prototypes vectors" for effcient approximation on both the graph-based regularizer and model representation. The choice of prototypes are grounded upon two important criteria: they not only perform effective low-rank approximation of themore » kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.« less

  14. Fuzzy support vector machines for adaptive Morse code recognition.

    PubMed

    Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh

    2006-11-01

    Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.

  15. A novel representation for apoptosis protein subcellular localization prediction using support vector machine.

    PubMed

    Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen

    2009-07-21

    Apoptosis, or programmed cell death, plays an important role in development of an organism. Obtaining information on subcellular location of apoptosis proteins is very helpful to understand the apoptosis mechanism. In this paper, based on the concept that the position distribution information of amino acids is closely related with the structure and function of proteins, we introduce the concept of distance frequency [Matsuda, S., Vert, J.P., Ueda, N., Toh, H., Akutsu, T., 2005. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci. 14, 2804-2813] and propose a novel way to calculate distance frequencies. In order to calculate the local features, each protein sequence is separated into p parts with the same length in our paper. Then we use the novel representation of protein sequences and adopt support vector machine to predict subcellular location. The overall prediction accuracy is significantly improved by jackknife test.

  16. Evaluation and recognition of skin images with aging by support vector machine

    NASA Astrophysics Data System (ADS)

    Hu, Liangjun; Wu, Shulian; Li, Hui

    2016-10-01

    Aging is a very important issue not only in dermatology, but also cosmetic science. Cutaneous aging involves both chronological and photoaging aging process. The evaluation and classification of aging is an important issue with the medical cosmetology workers nowadays. The purpose of this study is to assess chronological-age-related and photo-age-related of human skin. The texture features of skin surface skin, such as coarseness, contrast were analyzed by Fourier transform and Tamura. And the aim of it is to detect the object hidden in the skin texture in difference aging skin. Then, Support vector machine was applied to train the texture feature. The different age's states were distinguished by the support vector machine (SVM) classifier. The results help us to further understand the mechanism of different aging skin from texture feature and help us to distinguish the different aging states.

  17. Scattering transform and LSPTSVM based fault diagnosis of rotating machinery

    NASA Astrophysics Data System (ADS)

    Ma, Shangjun; Cheng, Bo; Shang, Zhaowei; Liu, Geng

    2018-05-01

    This paper proposes an algorithm for fault diagnosis of rotating machinery to overcome the shortcomings of classical techniques which are noise sensitive in feature extraction and time consuming for training. Based on the scattering transform and the least squares recursive projection twin support vector machine (LSPTSVM), the method has the advantages of high efficiency and insensitivity for noise signal. Using the energy of the scattering coefficients in each sub-band, the features of the vibration signals are obtained. Then, an LSPTSVM classifier is used for fault diagnosis. The new method is compared with other common methods including the proximal support vector machine, the standard support vector machine and multi-scale theory by using fault data for two systems, a motor bearing and a gear box. The results show that the new method proposed in this study is more effective for fault diagnosis of rotating machinery.

  18. Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Zhong-bao, Liu; Wen-ai, Song; Jing, Zhang; Wen-juan, Zhao

    2017-06-01

    Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracies can not be greatly improved because of two limitations. One is it does not take the distribution of the classes into consideration. The other is it is sensitive to noise. In order to solve the above problems, inspired by the maximization of the Fisher's Discriminant Analysis (FDA) and the SVM separability constraints, fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper. In FMWSVM, the distribution of the classes is reflected by the within-class scatter in FDA and the fuzzy membership function is introduced to decrease the influence of the noise. The comparative experiments with SVM on the SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.

  19. Geographical traceability of Marsdenia tenacissima by Fourier transform infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Li, Chao; Yang, Sheng-Chao; Guo, Qiao-Sheng; Zheng, Kai-Yan; Wang, Ping-Li; Meng, Zhen-Gui

    2016-01-01

    A combination of Fourier transform infrared spectroscopy with chemometrics tools provided an approach for studying Marsdenia tenacissima according to its geographical origin. A total of 128 M. tenacissima samples from four provinces in China were analyzed with FTIR spectroscopy. Six pattern recognition methods were used to construct the discrimination models: support vector machine-genetic algorithms, support vector machine-particle swarm optimization, K-nearest neighbors, radial basis function neural network, random forest and support vector machine-grid search. Experimental results showed that K-nearest neighbors was superior to other mathematical algorithms after data were preprocessed with wavelet de-noising, with a discrimination rate of 100% in both the training and prediction sets. This study demonstrated that FTIR spectroscopy coupled with K-nearest neighbors could be successfully applied to determine the geographical origins of M. tenacissima samples, thereby providing reliable authentication in a rapid, cheap and noninvasive way.

  20. A support vector machine approach for classification of welding defects from ultrasonic signals

    NASA Astrophysics Data System (ADS)

    Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming

    2014-07-01

    Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energy of the wavelet coefficients at different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.

  1. Support vector machine based decision for mechanical fault condition monitoring in induction motor using an advanced Hilbert-Park transform.

    PubMed

    Ben Salem, Samira; Bacha, Khmais; Chaari, Abdelkader

    2012-09-01

    In this work we suggest an original fault signature based on an improved combination of Hilbert and Park transforms. Starting from this combination we can create two fault signatures: Hilbert modulus current space vector (HMCSV) and Hilbert phase current space vector (HPCSV). These two fault signatures are subsequently analysed using the classical fast Fourier transform (FFT). The effects of mechanical faults on the HMCSV and HPCSV spectrums are described, and the related frequencies are determined. The magnitudes of spectral components, relative to the studied faults (air-gap eccentricity and outer raceway ball bearing defect), are extracted in order to develop the input vector necessary for learning and testing the support vector machine with an aim of classifying automatically the various states of the induction motor. Copyright © 2012 ISA. Published by Elsevier Ltd. All rights reserved.

  2. Vision based nutrient deficiency classification in maize plants using multi class support vector machines

    NASA Astrophysics Data System (ADS)

    Leena, N.; Saju, K. K.

    2018-04-01

    Nutritional deficiencies in plants are a major concern for farmers as it affects productivity and thus profit. The work aims to classify nutritional deficiencies in maize plant in a non-destructive mannerusing image processing and machine learning techniques. The colored images of the leaves are analyzed and classified with multi-class support vector machine (SVM) method. Several images of maize leaves with known deficiencies like nitrogen, phosphorous and potassium (NPK) are used to train the SVM classifier prior to the classification of test images. The results show that the method was able to classify and identify nutritional deficiencies.

  3. Applications of Support Vector Machines In Chemo And Bioinformatics

    NASA Astrophysics Data System (ADS)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.

  4. A comparative study of machine learning models for ethnicity classification

    NASA Astrophysics Data System (ADS)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper endeavours to adopt a machine learning approach to solve the problem of ethnicity recognition. Ethnicity identification is an important vision problem with its use cases being extended to various domains. Despite the multitude of complexity involved, ethnicity identification comes naturally to humans. This meta information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems a sub module to efficiently capture ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines, logistic regression has been documented. Experimental results indicate that the logical classifier provides a much accurate classification than the support vector machine.

  5. Matrix Multiplication Algorithm Selection with Support Vector Machines

    DTIC Science & Technology

    2015-05-01

    libraries that could intelligently choose the optimal algorithm for a particular set of inputs. Users would be oblivious to the underlying algorithmic...SAT.” J. Artif . Intell. Res.(JAIR), vol. 32, pp. 565–606, 2008. [9] M. G. Lagoudakis and M. L. Littman, “Algorithm selection using reinforcement...Artificial Intelligence , vol. 21, no. 05, pp. 961–976, 2007. [15] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM

  6. A Parallel Vector Machine for the PM Programming Language

    NASA Astrophysics Data System (ADS)

    Bellerby, Tim

    2016-04-01

    PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using standard OpenMP and MPI. Performance analyses of the PM vector machine, demonstrating its scaling properties with respect to domain size and the number of processor nodes will be presented for a range of hardware configurations. The PM software and language definition are being made available under unrestrictive MIT and Creative Commons Attribution licenses respectively: www.pm-lang.org.

  7. Seminal quality prediction using data mining methods.

    PubMed

    Sahoo, Anoop J; Kumar, Yugal

    2014-01-01

    Now-a-days, some new classes of diseases have come into existences which are known as lifestyle diseases. The main reasons behind these diseases are changes in the lifestyle of people such as alcohol drinking, smoking, food habits etc. After going through the various lifestyle diseases, it has been found that the fertility rates (sperm quantity) in men has considerably been decreasing in last two decades. Lifestyle factors as well as environmental factors are mainly responsible for the change in the semen quality. The objective of this paper is to identify the lifestyle and environmental features that affects the seminal quality and also fertility rate in man using data mining methods. The five artificial intelligence techniques such as Multilayer perceptron (MLP), Decision Tree (DT), Navie Bayes (Kernel), Support vector machine+Particle swarm optimization (SVM+PSO) and Support vector machine (SVM) have been applied on fertility dataset to evaluate the seminal quality and also to predict the person is either normal or having altered fertility rate. While the eight feature selection techniques such as support vector machine (SVM), neural network (NN), evolutionary logistic regression (LR), support vector machine plus particle swarm optimization (SVM+PSO), principle component analysis (PCA), chi-square test, correlation and T-test methods have been used to identify more relevant features which affect the seminal quality. These techniques are applied on fertility dataset which contains 100 instances with nine attribute with two classes. The experimental result shows that SVM+PSO provides higher accuracy and area under curve (AUC) rate (94% & 0.932) among multi-layer perceptron (MLP) (92% & 0.728), Support Vector Machines (91% & 0.758), Navie Bayes (Kernel) (89% & 0.850) and Decision Tree (89% & 0.735) for some of the seminal parameters. This paper also focuses on the feature selection process i.e. how to select the features which are more important for prediction of fertility rate. In this paper, eight feature selection methods are applied on fertility dataset to find out a set of good features. The investigational results shows that childish diseases (0.079) and high fever features (0.057) has less impact on fertility rate while age (0.8685), season (0.843), surgical intervention (0.7683), alcohol consumption (0.5992), smoking habit (0.575), number of hours spent on setting (0.4366) and accident (0.5973) features have more impact. It is also observed that feature selection methods increase the accuracy of above mentioned techniques (multilayer perceptron 92%, support vector machine 91%, SVM+PSO 94%, Navie Bayes (Kernel) 89% and decision tree 89%) as compared to without feature selection methods (multilayer perceptron 86%, support vector machine 86%, SVM+PSO 85%, Navie Bayes (Kernel) 83% and decision tree 84%) which shows the applicability of feature selection methods in prediction. This paper lightens the application of artificial techniques in medical domain. From this paper, it can be concluded that data mining methods can be used to predict a person with or without disease based on environmental and lifestyle parameters/features rather than undergoing various medical test. In this paper, five data mining techniques are used to predict the fertility rate and among which SVM+PSO provide more accurate results than support vector machine and decision tree.

  8. Research on computer systems benchmarking

    NASA Technical Reports Server (NTRS)

    Smith, Alan Jay (Principal Investigator)

    1996-01-01

    This grant addresses the topic of research on computer systems benchmarking and is more generally concerned with performance issues in computer systems. This report reviews work in those areas during the period of NASA support under this grant. The bulk of the work performed concerned benchmarking and analysis of CPUs, compilers, caches, and benchmark programs. The first part of this work concerned the issue of benchmark performance prediction. A new approach to benchmarking and machine characterization was reported, using a machine characterizer that measures the performance of a given system in terms of a Fortran abstract machine. Another report focused on analyzing compiler performance. The performance impact of optimization in the context of our methodology for CPU performance characterization was based on the abstract machine model. Benchmark programs are analyzed in another paper. A machine-independent model of program execution was developed to characterize both machine performance and program execution. By merging these machine and program characterizations, execution time can be estimated for arbitrary machine/program combinations. The work was continued into the domain of parallel and vector machines, including the issue of caches in vector processors and multiprocessors. All of the afore-mentioned accomplishments are more specifically summarized in this report, as well as those smaller in magnitude supported by this grant.

  9. Application of high-performance computing to numerical simulation of human movement

    NASA Technical Reports Server (NTRS)

    Anderson, F. C.; Ziegler, J. M.; Pandy, M. G.; Whalen, R. T.

    1995-01-01

    We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.

  10. Automated image segmentation using support vector machines

    NASA Astrophysics Data System (ADS)

    Powell, Stephanie; Magnotta, Vincent A.; Andreasen, Nancy C.

    2007-03-01

    Neurodegenerative and neurodevelopmental diseases demonstrate problems associated with brain maturation and aging. Automated methods to delineate brain structures of interest are required to analyze large amounts of imaging data like that being collected in several on going multi-center studies. We have previously reported on using artificial neural networks (ANN) to define subcortical brain structures including the thalamus (0.88), caudate (0.85) and the putamen (0.81). In this work, apriori probability information was generated using Thirion's demons registration algorithm. The input vector consisted of apriori probability, spherical coordinates, and an iris of surrounding signal intensity values. We have applied the support vector machine (SVM) machine learning algorithm to automatically segment subcortical and cerebellar regions using the same input vector information. SVM architecture was derived from the ANN framework. Training was completed using a radial-basis function kernel with gamma equal to 5.5. Training was performed using 15,000 vectors collected from 15 training images in approximately 10 minutes. The resulting support vectors were applied to delineate 10 images not part of the training set. Relative overlap calculated for the subcortical structures was 0.87 for the thalamus, 0.84 for the caudate, 0.84 for the putamen, and 0.72 for the hippocampus. Relative overlap for the cerebellar lobes ranged from 0.76 to 0.86. The reliability of the SVM based algorithm was similar to the inter-rater reliability between manual raters and can be achieved without rater intervention.

  11. A sparse matrix algorithm on the Boolean vector machine

    NASA Technical Reports Server (NTRS)

    Wagner, Robert A.; Patrick, Merrell L.

    1988-01-01

    VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for space matrices of the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.

  12. Experiences in using the CYBER 203 for three-dimensional transonic flow calculations

    NASA Technical Reports Server (NTRS)

    Melson, N. D.; Keller, J. D.

    1982-01-01

    In this paper, the authors report on some of their experiences modifying two three-dimensional transonic flow programs (FLO22 and FLO27) for use on the NASA Langley Research Center CYBER 203. Both of the programs discussed were originally written for use on serial machines. Several methods were attempted to optimize the execution of the two programs on the vector machine, including: (1) leaving the program in a scalar form (i.e., serial computation) with compiler software used to optimize and vectorize the program, (2) vectorizing parts of the existing algorithm in the program, and (3) incorporating a new vectorizable algorithm (ZEBRA I or ZEBRA II) in the program.

  13. Algorithm for detection the QRS complexes based on support vector machine

    NASA Astrophysics Data System (ADS)

    Van, G. V.; Podmasteryev, K. V.

    2017-11-01

    The efficiency of computer ECG analysis depends on the accurate detection of QRS-complexes. This paper presents an algorithm for QRS complex detection based of support vector machine (SVM). The proposed algorithm is evaluated on annotated standard databases such as MIT-BIH Arrhythmia database. The QRS detector obtained a sensitivity Se = 98.32% and specificity Sp = 95.46% for MIT-BIH Arrhythmia database. This algorithm can be used as the basis for the software to diagnose electrical activity of the heart.

  14. Support vector machine based classification of fast Fourier transform spectroscopy of proteins

    NASA Astrophysics Data System (ADS)

    Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine

    2009-02-01

    Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.

  15. Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

    PubMed

    Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin

    2017-04-01

    As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depend on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the Chronic Kidney Disease, two essential types of feature selection methods namely, wrapper and filter approaches were chosen to reduce the dimension of Chronic Kidney Disease dataset. In wrapper approach, classifier subset evaluator with greedy stepwise search engine and wrapper subset evaluator with the Best First search engine were used. In filter approach, correlation feature selection subset evaluator with greedy stepwise search engine and filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier by using filtered subset evaluator with the Best First search engine feature selection method has higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to other selected methods.

  16. An asymptotical machine

    NASA Astrophysics Data System (ADS)

    Cristallini, Achille

    2016-07-01

    A new and intriguing machine may be obtained replacing the moving pulley of a gun tackle with a fixed point in the rope. Its most important feature is the asymptotic efficiency. Here we obtain a satisfactory description of this machine by means of vector calculus and elementary trigonometry. The mathematical model has been compared with experimental data and briefly discussed.

  17. Amino acid "little Big Bang": representing amino acid substitution matrices as dot products of Euclidian vectors.

    PubMed

    Zimmermann, Karel; Gibrat, Jean-François

    2010-01-04

    Sequence comparisons make use of a one-letter representation for amino acids, the necessary quantitative information being supplied by the substitution matrices. This paper deals with the problem of finding a representation that provides a comprehensive description of amino acid intrinsic properties consistent with the substitution matrices. We present a Euclidian vector representation of the amino acids, obtained by the singular value decomposition of the substitution matrices. The substitution matrix entries correspond to the dot product of amino acid vectors. We apply this vector encoding to the study of the relative importance of various amino acid physicochemical properties upon the substitution matrices. We also characterize and compare the PAM and BLOSUM series substitution matrices. This vector encoding introduces a Euclidian metric in the amino acid space, consistent with substitution matrices. Such a numerical description of the amino acid is useful when intrinsic properties of amino acids are necessary, for instance, building sequence profiles or finding consensus sequences, using machine learning algorithms such as Support Vector Machine and Neural Networks algorithms.

  18. Elliptic-symmetry vector optical fields.

    PubMed

    Pan, Yue; Li, Yongnan; Li, Si-Min; Ren, Zhi-Cheng; Kong, Ling-Jun; Tu, Chenghou; Wang, Hui-Tian

    2014-08-11

    We present in principle and demonstrate experimentally a new kind of vector fields: elliptic-symmetry vector optical fields. This is a significant development in vector fields, as this breaks the cylindrical symmetry and enriches the family of vector fields. Due to the presence of an additional degrees of freedom, which is the interval between the foci in the elliptic coordinate system, the elliptic-symmetry vector fields are more flexible than the cylindrical vector fields for controlling the spatial structure of polarization and for engineering the focusing fields. The elliptic-symmetry vector fields can find many specific applications from optical trapping to optical machining and so on.

  19. Boosted Regression Trees Outperforms Support Vector Machines in Predicting (Regional) Yields of Winter Wheat from Single and Cumulated Dekadal Spot-VGT Derived Normalized Difference Vegetation Indices

    NASA Astrophysics Data System (ADS)

    Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos

    2016-08-01

    This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed of Normalized Difference Vegetation Indices (NDVI) derived from low resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence stretching from March to June was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.

  20. Snack food as a modulator of human resting-state functional connectivity.

    PubMed

    Mendez-Torrijos, Andrea; Kreitz, Silke; Ivan, Claudiu; Konerth, Laura; Rösch, Julie; Pischetsrieder, Monika; Moll, Gunther; Kratz, Oliver; Dörfler, Arnd; Horndasch, Stefanie; Hess, Andreas

    2018-04-04

    To elucidate the mechanisms of how snack foods may induce non-homeostatic food intake, we used resting state functional magnetic resonance imaging (fMRI), as resting state networks can individually adapt to experience after short time exposures. In addition, we used graph theoretical analysis together with machine learning techniques (support vector machine) to identifying biomarkers that can categorize between high-caloric (potato chips) vs. low-caloric (zucchini) food stimulation. Seventeen healthy human subjects with body mass index (BMI) 19 to 27 underwent 2 different fMRI sessions where an initial resting state scan was acquired, followed by visual presentation of different images of potato chips and zucchini. There was then a 5-minute pause to ingest food (day 1=potato chips, day 3=zucchini), followed by a second resting state scan. fMRI data were further analyzed using graph theory analysis and support vector machine techniques. Potato chips vs. zucchini stimulation led to significant connectivity changes. The support vector machine was able to accurately categorize the 2 types of food stimuli with 100% accuracy. Visual, auditory, and somatosensory structures, as well as thalamus, insula, and basal ganglia were found to be important for food classification. After potato chips consumption, the BMI was associated with the path length and degree in nucleus accumbens, middle temporal gyrus, and thalamus. The results suggest that high vs. low caloric food stimulation in healthy individuals can induce significant changes in resting state networks. These changes can be detected using graph theory measures in conjunction with support vector machine. Additionally, we found that the BMI affects the response of the nucleus accumbens when high caloric food is consumed.

  1. Feature selection using a one dimensional naïve Bayes' classifier increases the accuracy of support vector machine classification of CDR3 repertoires.

    PubMed

    Cinelli, Mattia; Sun, Yuxin; Best, Katharine; Heather, James M; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny

    2017-04-01

    Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund's adjuvant (CFA) or CFA alone. The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases. The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund's Adjuvant. The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term¼SRP075893 . The Decombinator package is available at github.com/innate2adaptive/Decombinator . The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html . b.chain@ucl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  2. Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

    PubMed

    Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

    2016-01-01

    With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.

  3. Extraction of inland Nypa fruticans (Nipa Palm) using Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Alberto, R. T.; Serrano, S. C.; Damian, G. B.; Camaso, E. E.; Biagtan, A. R.; Panuyas, N. Z.; Quibuyen, J. S.

    2017-09-01

    Mangroves are considered as one of the major habitats in coastal ecosystem, providing a lot of economic and ecological services in human society. Nypa fruticans (Nipa palm) is one of the important species of mangroves because of its versatility and uniqueness as halophytic palm. However, nipas are not only adaptable in saline areas, they can also managed to thrive away from the coastline depending on the favorable soil types available in the area. Because of this, mapping of this species are not limited alone in the near shore areas, but in areas where this species are present as well. The extraction process of Nypa fruticans were carried out using the available LiDAR data. Support Vector Machine (SVM) classification process was used to extract nipas in inland areas. The SVM classification process in mapping Nypa fruticans produced high accuracy of 95+%. The Support Vector Machine classification process to extract inland nipas was proven to be effective by utilizing different terrain derivatives from LiDAR data.

  4. Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression

    NASA Astrophysics Data System (ADS)

    Fei, Cheng-Wei; Bai, Guang-Chen

    2014-12-01

    To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assembly like the blade-tip radial running clearance (BTRRC) of gas turbine, a distribution collaborative probabilistic design method-based support vector machine of regression (SR)(called as DCSRM) is proposed by integrating distribution collaborative response surface method and support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of aeroengine high-pressure turbine (HPT) BTRRC is accomplished to verify the proposed DCSRM. The analysis results reveal that the optimal static blade-tip clearance of HPT is gained for designing BTRRC, and improving the performance and reliability of aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assembly and enriches mechanical reliability theory and method.

  5. The potential of latent semantic analysis for machine grading of clinical case summaries.

    PubMed

    Kintsch, Walter

    2002-02-01

    This paper introduces latent semantic analysis (LSA), a machine learning method for representing the meaning of words, sentences, and texts. LSA induces a high-dimensional semantic space from reading a very large amount of texts. The meaning of words and texts can be represented as vectors in this space and hence can be compared automatically and objectively. A generative theory of the mental lexicon based on LSA is described. The word vectors LSA constructs are context free, and each word, irrespective of how many meanings or senses it has, is represented by a single vector. However, when a word is used in different contexts, context appropriate word senses emerge. Several applications of LSA to educational software are described, involving the ability of LSA to quickly compare the content of texts, such as an essay written by a student and a target essay. An LSA-based software tool is sketched for machine grading of clinical case summaries written by medical students.

  6. Using Support Vector Machines to Automatically Extract Open Water Signatures from POLDER Multi-Angle Data Over Boreal Regions

    NASA Technical Reports Server (NTRS)

    Pierce, J.; Diaz-Barrios, M.; Pinzon, J.; Ustin, S. L.; Shih, P.; Tournois, S.; Zarco-Tejada, P. J.; Vanderbilt, V. C.; Perry, G. L.; Brass, James A. (Technical Monitor)

    2002-01-01

    This study used Support Vector Machines to classify multiangle POLDER data. Boreal wetland ecosystems cover an estimated 90 x 10(exp 6) ha, about 36% of global wetlands, and are a major source of trace gases emissions to the atmosphere. Four to 20 percent of the global emission of methane to the atmosphere comes from wetlands north of 4 degrees N latitude. Large uncertainties in emissions exist because of large spatial and temporal variation in the production and consumption of methane. Accurate knowledge of the areal extent of open water and inundated vegetation is critical to estimating magnitudes of trace gas emissions. Improvements in land cover mapping have been sought using physical-modeling approaches, neural networks, and active microwave, examples that demonstrate the difficulties of separating open water, inundated vegetation and dry upland vegetation. Here we examine the feasibility of using a support vector machine to classify POLDER data representing open water, inundated vegetation and dry upland vegetation.

  7. Framework for Infectious Disease Analysis: A comprehensive and integrative multi-modeling approach to disease prediction and management.

    PubMed

    Erraguntla, Madhav; Zapletal, Josef; Lawley, Mark

    2017-12-01

    The impact of infectious disease on human populations is a function of many factors including environmental conditions, vector dynamics, transmission mechanics, social and cultural behaviors, and public policy. A comprehensive framework for disease management must fully connect the complete disease lifecycle, including emergence from reservoir populations, zoonotic vector transmission, and impact on human societies. The Framework for Infectious Disease Analysis is a software environment and conceptual architecture for data integration, situational awareness, visualization, prediction, and intervention assessment. Framework for Infectious Disease Analysis automatically collects biosurveillance data using natural language processing, integrates structured and unstructured data from multiple sources, applies advanced machine learning, and uses multi-modeling for analyzing disease dynamics and testing interventions in complex, heterogeneous populations. In the illustrative case studies, natural language processing from social media, news feeds, and websites was used for information extraction, biosurveillance, and situation awareness. Classification machine learning algorithms (support vector machines, random forests, and boosting) were used for disease predictions.

  8. Prediction of B-cell linear epitopes with a combination of support vector machine classification and amino acid propensity identification.

    PubMed

    Wang, Hsin-Wei; Lin, Ya-Chi; Pai, Tun-Wen; Chang, Hao-Teng

    2011-01-01

    Epitopes are antigenic determinants that are useful because they induce B-cell antibody production and stimulate T-cell activation. Bioinformatics can enable rapid, efficient prediction of potential epitopes. Here, we designed a novel B-cell linear epitope prediction system called LEPS, Linear Epitope Prediction by Propensities and Support Vector Machine, that combined physico-chemical propensity identification and support vector machine (SVM) classification. We tested the LEPS on four datasets: AntiJen, HIV, a newly generated PC, and AHP, a combination of these three datasets. Peptides with globally or locally high physicochemical propensities were first identified as primitive linear epitope (LE) candidates. Then, candidates were classified with the SVM based on the unique features of amino acid segments. This reduced the number of predicted epitopes and enhanced the positive prediction value (PPV). Compared to four other well-known LE prediction systems, the LEPS achieved the highest accuracy (72.52%), specificity (84.22%), PPV (32.07%), and Matthews' correlation coefficient (10.36%).

  9. Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines

    PubMed Central

    2010-01-01

    Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480

  10. Evolutionary-driven support vector machines for determining the degree of liver fibrosis in chronic hepatitis C.

    PubMed

    Stoean, Ruxandra; Stoean, Catalin; Lupsor, Monica; Stefanescu, Horia; Badea, Radu

    2011-01-01

    Hepatic fibrosis, the principal pointer to the development of a liver disease within chronic hepatitis C, can be measured through several stages. The correct evaluation of its degree, based on recent different non-invasive procedures, is of current major concern. The latest methodology for assessing it is the Fibroscan and the effect of its employment is impressive. However, the complex interaction between its stiffness indicator and the other biochemical and clinical examinations towards a respective degree of liver fibrosis is hard to be manually discovered. In this respect, the novel, well-performing evolutionary-powered support vector machines are proposed towards an automated learning of the relationship between medical attributes and fibrosis levels. The traditional support vector machines have been an often choice for addressing hepatic fibrosis, while the evolutionary option has been validated on many real-world tasks and proven flexibility and good performance. The evolutionary approach is simple and direct, resulting from the hybridization of the learning component within support vector machines and the optimization engine of evolutionary algorithms. It discovers the optimal coefficients of surfaces that separate instances of distinct classes. Apart from a detached manner of establishing the fibrosis degree for new cases, a resulting formula also offers insight upon the correspondence between the medical factors and the respective outcome. What is more, a feature selection genetic algorithm can be further embedded into the method structure, in order to dynamically concentrate search only on the most relevant attributes. The data set refers 722 patients with chronic hepatitis C infection and 24 indicators. The five possible degrees of fibrosis range from F0 (no fibrosis) to F4 (cirrhosis). Since the standard support vector machines are among the most frequently used methods in recent artificial intelligence studies for hepatic fibrosis staging, the evolutionary method is viewed in comparison to the traditional one. The multifaceted discrimination into all five degrees of fibrosis and the slightly less difficult common separation into solely three related stages are both investigated. The resulting performance proves the superiority over the standard support vector classification and the attained formula is helpful in providing an immediate calculation of the liver stage for new cases, while establishing the presence/absence and comprehending the weight of each medical factor with respect to a certain fibrosis level. The use of the evolutionary technique for fibrosis degree prediction triggers simplicity and offers a direct expression of the influence of dynamically selected indicators on the corresponding stage. Perhaps most importantly, it significantly surpasses the classical support vector machines, which are both widely used and technically sound. All these therefore confirm the promise of the new methodology towards a dependable support within the medical decision-making. Copyright © 2010 Elsevier B.V. All rights reserved.

  11. Novel solutions for an old disease: diagnosis of acute appendicitis with random forest, support vector machines, and artificial neural networks.

    PubMed

    Hsieh, Chung-Ho; Lu, Ruey-Hwa; Lee, Nai-Hsin; Chiu, Wen-Ta; Hsu, Min-Huei; Li, Yu-Chuan Jack

    2011-01-01

    Diagnosing acute appendicitis clinically is still difficult. We developed random forests, support vector machines, and artificial neural network models to diagnose acute appendicitis. Between January 2006 and December 2008, patients who had a consultation session with surgeons for suspected acute appendicitis were enrolled. Seventy-five percent of the data set was used to construct models including random forest, support vector machines, artificial neural networks, and logistic regression. Twenty-five percent of the data set was withheld to evaluate model performance. The area under the receiver operating characteristic curve (AUC) was used to evaluate performance, which was compared with that of the Alvarado score. Data from a total of 180 patients were collected, 135 used for training and 45 for testing. The mean age of patients was 39.4 years (range, 16-85). Final diagnosis revealed 115 patients with and 65 without appendicitis. The AUC of random forest, support vector machines, artificial neural networks, logistic regression, and Alvarado was 0.98, 0.96, 0.91, 0.87, and 0.77, respectively. The sensitivity, specificity, positive, and negative predictive values of random forest were 94%, 100%, 100%, and 87%, respectively. Random forest performed better than artificial neural networks, logistic regression, and Alvarado. We demonstrated that random forest can predict acute appendicitis with good accuracy and, deployed appropriately, can be an effective tool in clinical decision making. Copyright © 2011 Mosby, Inc. All rights reserved.

  12. A feasibility study of automatic lung nodule detection in chest digital tomosynthesis with machine learning based on support vector machine

    NASA Astrophysics Data System (ADS)

    Lee, Donghoon; Kim, Ye-seul; Choi, Sunghoon; Lee, Haenghwa; Jo, Byungdu; Choi, Seungyeon; Shin, Jungwook; Kim, Hee-Joung

    2017-03-01

    The chest digital tomosynthesis(CDT) is recently developed medical device that has several advantage for diagnosing lung disease. For example, CDT provides depth information with relatively low radiation dose compared to computed tomography (CT). However, a major problem with CDT is the image artifacts associated with data incompleteness resulting from limited angle data acquisition in CDT geometry. For this reason, the sensitivity of lung disease was not clear compared to CT. In this study, to improve sensitivity of lung disease detection in CDT, we developed computer aided diagnosis (CAD) systems based on machine learning. For design CAD systems, we used 100 cases of lung nodules cropped images and 100 cases of normal lesion cropped images acquired by lung man phantoms and proto type CDT. We used machine learning techniques based on support vector machine and Gabor filter. The Gabor filter was used for extracting characteristics of lung nodules and we compared performance of feature extraction of Gabor filter with various scale and orientation parameters. We used 3, 4, 5 scales and 4, 6, 8 orientations. After extracting features, support vector machine (SVM) was used for classifying feature of lesions. The linear, polynomial and Gaussian kernels of SVM were compared to decide the best SVM conditions for CDT reconstruction images. The results of CAD system with machine learning showed the capability of automatically lung lesion detection. Furthermore detection performance was the best when Gabor filter with 5 scale and 8 orientation and SVM with Gaussian kernel were used. In conclusion, our suggested CAD system showed improving sensitivity of lung lesion detection in CDT and decide Gabor filter and SVM conditions to achieve higher detection performance of our developed CAD system for CDT.

  13. Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation.

    PubMed

    Mikhchi, Abbas; Honarvar, Mahmood; Kashan, Nasser Emam Jomeh; Aminafshar, Mehdi

    2016-06-21

    Genotype imputation is an important tool for prediction of unknown genotypes for both unrelated individuals and parent-offspring trios. Several imputation methods are available and can either employ universal machine learning methods, or deploy algorithms dedicated to infer missing genotypes. In this research the performance of eight machine learning methods: Support Vector Machine, K-Nearest Neighbors, Extreme Learning Machine, Radial Basis Function, Random Forest, AdaBoost, LogitBoost, and TotalBoost compared in terms of the imputation accuracy, computation time and the factors affecting imputation accuracy. The methods employed using real and simulated datasets to impute the un-typed SNPs in parent-offspring trios. The tested methods show that imputation of parent-offspring trios can be accurate. The Random Forest and Support Vector Machine were more accurate than the other machine learning methods. The TotalBoost performed slightly worse than the other methods.The running times were different between methods. The ELM was always most fast algorithm. In case of increasing the sample size, the RBF requires long imputation time.The tested methods in this research can be an alternative for imputation of un-typed SNPs in low missing rate of data. However, it is recommended that other machine learning methods to be used for imputation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Implementation of an ADI method on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1987-01-01

    The implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the FLEX/32 and CRAY/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.

  15. Implementation of an ADI method on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1987-01-01

    In this paper the implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are the MPP, an SIMD machine with 16-Kbit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the Flex/32 and Cray/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally conclusions are presented.

  16. T-ray relevant frequencies for osteosarcoma classification

    NASA Astrophysics Data System (ADS)

    Withayachumnankul, W.; Ferguson, B.; Rainsford, T.; Findlay, D.; Mickan, S. P.; Abbott, D.

    2006-01-01

    We investigate the classification of the T-ray response of normal human bone cells and human osteosarcoma cells, grown in culture. Given the magnitude and phase responses within a reliable spectral range as features for input vectors, a trained support vector machine can correctly classify the two cell types to some extent. Performance of the support vector machine is deteriorated by the curse of dimensionality, resulting from the comparatively large number of features in the input vectors. Feature subset selection methods are used to select only an optimal number of relevant features for inputs. As a result, an improvement in generalization performance is attainable, and the selected frequencies can be used for further describing different mechanisms of the cells, responding to T-rays. We demonstrate a consistent classification accuracy of 89.6%, while the only one fifth of the original features are retained in the data set.

  17. A VLSI chip set for real time vector quantization of image sequences

    NASA Technical Reports Server (NTRS)

    Baker, Richard L.

    1989-01-01

    The architecture and implementation of a VLSI chip set that vector quantizes (VQ) image sequences in real time is described. The chip set forms a programmable Single-Instruction, Multiple-Data (SIMD) machine which can implement various vector quantization encoding structures. Its VQ codebook may contain unlimited number of codevectors, N, having dimension up to K = 64. Under a weighted least squared error criterion, the engine locates at video rates the best code vector in full-searched or large tree searched VQ codebooks. The ability to manipulate tree structured codebooks, coupled with parallelism and pipelining, permits searches in as short as O (log N) cycles. A full codebook search results in O(N) performance, compared to O(KN) for a Single-Instruction, Single-Data (SISD) machine. With this VLSI chip set, an entire video code can be built on a single board that permits realtime experimentation with very large codebooks.

  18. Using support vector machines to identify literacy skills: Evidence from eye movements.

    PubMed

    Lou, Ya; Liu, Yanping; Kaakinen, Johanna K; Li, Xingshan

    2017-06-01

    Is inferring readers' literacy skills possible by analyzing their eye movements during text reading? This study used Support Vector Machines (SVM) to analyze eye movement data from 61 undergraduate students who read a multiple-paragraph, multiple-topic expository text. Forward fixation time, first-pass rereading time, second-pass fixation time, and regression path reading time on different regions of the text were provided as features. The SVM classification algorithm assisted in distinguishing high-literacy-skilled readers from low-literacy-skilled readers with 80.3 % accuracy. Results demonstrate the effectiveness of combining eye tracking and machine learning techniques to detect readers with low literacy skills, and suggest that such approaches can be potentially used in predicting other cognitive abilities.

  19. An M-step preconditioned conjugate gradient method for parallel computation

    NASA Technical Reports Server (NTRS)

    Adams, L.

    1983-01-01

    This paper describes a preconditioned conjugate gradient method that can be effectively implemented on both vector machines and parallel arrays to solve sparse symmetric and positive definite systems of linear equations. The implementation on the CYBER 203/205 and on the Finite Element Machine is discussed and results obtained using the method on these machines are given.

  20. Emotion detection from text

    NASA Astrophysics Data System (ADS)

    Ramalingam, V. V.; Pandian, A.; Jaiswal, Abhijeet; Bhatia, Nikhar

    2018-04-01

    This paper presents a novel method based on concept of Machine Learning for Emotion Detection using various algorithms of Support Vector Machine and major emotions described are linked to the Word-Net for enhanced accuracy. The approach proposed plays a promising role to augment the Artificial Intelligence in the near future and could be vital in optimization of Human-Machine Interface.

  1. Use of CYBER 203 and CYBER 205 computers for three-dimensional transonic flow calculations

    NASA Technical Reports Server (NTRS)

    Melson, N. D.; Keller, J. D.

    1983-01-01

    Experiences are discussed for modifying two three-dimensional transonic flow computer programs (FLO 22 and FLO 27) for use on the CDC CYBER 203 computer system. Both programs were originally written for use on serial machines. Several methods were attempted to optimize the execution of the two programs on the vector machine: leaving the program in a scalar form (i.e., serial computation) with compiler software used to optimize and vectorize the program, vectorizing parts of the existing algorithm in the program, and incorporating a vectorizable algorithm (ZEBRA I or ZEBRA II) in the program. Comparison runs of the programs were made on CDC CYBER 175. CYBER 203, and two pipe CDC CYBER 205 computer systems.

  2. Predicting healthcare associated infections using patients' experiences

    NASA Astrophysics Data System (ADS)

    Pratt, Michael A.; Chu, Henry

    2016-05-01

    Healthcare associated infections (HAI) are a major threat to patient safety and are costly to health systems. Our goal is to predict the HAI performance of a hospital using the patients' experience responses as input. We use four classifiers, viz. random forest, naive Bayes, artificial feedforward neural networks, and the support vector machine, to perform the prediction of six types of HAI. The six types include blood stream, urinary tract, surgical site, and intestinal infections. Experiments show that the random forest and support vector machine perform well across the six types of HAI.

  3. Progressive Vector Quantization on a massively parallel SIMD machine with application to multispectral image data

    NASA Technical Reports Server (NTRS)

    Manohar, Mareboyana; Tilton, James C.

    1994-01-01

    A progressive vector quantization (VQ) compression approach is discussed which decomposes image data into a number of levels using full search VQ. The final level is losslessly compressed, enabling lossless reconstruction. The computational difficulties are addressed by implementation on a massively parallel SIMD machine. We demonstrate progressive VQ on multispectral imagery obtained from the Advanced Very High Resolution Radiometer instrument and other Earth observation image data, and investigate the trade-offs in selecting the number of decomposition levels and codebook training method.

  4. Using an object-based grid system to evaluate a newly developed EP approach to formulate SVMs as applied to the classification of organophosphate nerve agents

    NASA Astrophysics Data System (ADS)

    Land, Walker H., Jr.; Lewis, Michael; Sadik, Omowunmi; Wong, Lut; Wanekaya, Adam; Gonzalez, Richard J.; Balan, Arun

    2004-04-01

    This paper extends the classification approaches described in reference [1] in the following way: (1.) developing and evaluating a new method for evolving organophosphate nerve agent Support Vector Machine (SVM) classifiers using Evolutionary Programming, (2.) conducting research experiments using a larger database of organophosphate nerve agents, and (3.) upgrading the architecture to an object-based grid system for evaluating the classification of EP derived SVMs. Due to the increased threats of chemical and biological weapons of mass destruction (WMD) by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphates nerve agents using a grid computing system called Legion. Grid computing is the use of large collections of heterogeneous, distributed resources (including machines, databases, devices, and users) to support large-scale computations and wide-area data access. Finally, preliminary results using EP derived support vector machines designed to operate on distributed systems have provided accurate classification results. In addition, distributed training time architectures are 50 times faster when compared to standard iterative training time methods.

  5. Prediction of Human Intestinal Absorption of Compounds Using Artificial Intelligence Techniques.

    PubMed

    Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar

    2017-01-01

    Information about Pharmacokinetics of compounds is an essential component of drug design and development. Modeling the pharmacokinetic properties require identification of the factors effecting absorption, distribution, metabolism and excretion of compounds. There have been continuous attempts in the prediction of intestinal absorption of compounds using various Artificial intelligence methods in the effort to reduce the attrition rate of drug candidates entering to preclinical and clinical trials. Currently, there are large numbers of individual predictive models available for absorption using machine learning approaches. Six Artificial intelligence methods namely, Support vector machine, k- nearest neighbor, Probabilistic neural network, Artificial neural network, Partial least square and Linear discriminant analysis were used for prediction of absorption of compounds. Prediction accuracy of Support vector machine, k- nearest neighbor, Probabilistic neural network, Artificial neural network, Partial least square and Linear discriminant analysis for prediction of intestinal absorption of compounds was found to be 91.54%, 88.33%, 84.30%, 86.51%, 79.07% and 80.08% respectively. Comparative analysis of all the six prediction models suggested that Support vector machine with Radial basis function based kernel is comparatively better for binary classification of compounds using human intestinal absorption and may be useful at preliminary stages of drug design and development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  6. Estimating chlorophyll with thermal and broadband multispectral high resolution imagery from an unmanned aerial system using relevance vector machines for precision agriculture

    NASA Astrophysics Data System (ADS)

    Elarab, Manal; Ticlavilca, Andres M.; Torres-Rua, Alfonso F.; Maslova, Inga; McKee, Mac

    2015-12-01

    Precision agriculture requires high-resolution information to enable greater precision in the management of inputs to production. Actionable information about crop and field status must be acquired at high spatial resolution and at a temporal frequency appropriate for timely responses. In this study, high spatial resolution imagery was obtained through the use of a small, unmanned aerial system called AggieAirTM. Simultaneously with the AggieAir flights, intensive ground sampling for plant chlorophyll was conducted at precisely determined locations. This study reports the application of a relevance vector machine coupled with cross validation and backward elimination to a dataset composed of reflectance from high-resolution multi-spectral imagery (VIS-NIR), thermal infrared imagery, and vegetative indices, in conjunction with in situ SPAD measurements from which chlorophyll concentrations were derived, to estimate chlorophyll concentration from remotely sensed data at 15-cm resolution. The results indicate that a relevance vector machine with a thin plate spline kernel type and kernel width of 5.4, having LAI, NDVI, thermal and red bands as the selected set of inputs, can be used to spatially estimate chlorophyll concentration with a root-mean-squared-error of 5.31 μg cm-2, efficiency of 0.76, and 9 relevance vectors.

  7. Characterization of spatiotemporal changes for the classification of dynamic contrast-enhanced magnetic-resonance breast lesions.

    PubMed

    Milenković, Jana; Hertl, Kristijana; Košir, Andrej; Zibert, Janez; Tasič, Jurij Franc

    2013-06-01

    The early detection of breast cancer is one of the most important predictors in determining the prognosis for women with malignant tumours. Dynamic contrast-enhanced magnetic-resonance imaging (DCE-MRI) is an important imaging modality for detecting and interpreting the different breast lesions from a time sequence of images and has proved to be a very sensitive modality for breast-cancer diagnosis. However, DCE-MRI exhibits only a moderate specificity, thus leading to a high rate of false positives, resulting in unnecessary biopsies that are stressful and physically painful for the patient and lead to an increase in the cost of treatment. There is a strong medical need for a DCE-MRI computer-aided diagnosis tool that would offer a reliable support to the physician's decision providing a high level of sensitivity and specificity. In our study we investigated the possibility of increasing differentiation between the malignant and the benign lesions with respect to the spatial variation of the temporal enhancements of three parametric maps, i.e., the initial enhancement (IE) map, the post-initial enhancement (PIE) map and the signal enhancement ratio (SER) map, by introducing additional methods along with the grey-level co-occurrence matrix, i.e., a second-order statistical method already applied for quantifying the spatiotemporal variations. We introduced the grey-level run-length matrix and the grey-level difference matrix, representing two additional, second-order statistical methods, and the circular Gabor as a frequency-domain-based method. Each of the additional methods is for the first time applied to the DCE-MRI data to differentiate between the malignant and the benign breast lesions. We applied the least-square minimum-distance classifier (LSMD), logistic regression and least-squares support vector machine (LS-SVM) classifiers on a total of 115 (78 malignant and 37 benign) breast DCE-MRI cases. The performances were evaluated using ten experiments of a ten-fold cross-validation. Our experimental analysis revealed the PIE map, together with the feature subset in which the discriminating ability of the co-occurrence features was increased by adding the newly introduced features, to be the most significant for differentiation between the malignant and the benign lesions. That diagnostic test - the aforementioned combination of parametric map and the feature subset achieved the sensitivity of 0.9193 which is statistically significantly higher compared to other diagnostic tests after ten-experiments of a ten-fold cross-validation and gave a statistically significantly higher specificity of 0.7819 for the fixed 95% sensitivity after the receiver operating characteristic (ROC) curve analysis. Combining the information from all the three parametric maps significantly increased the area under the ROC curve (AUC) of the aforementioned diagnostic test for the LSMD and logistic regression; however, not for the LS-SVM. The LSMD classifier yielded the highest area under the ROC curve when using the combined information, increasing the AUC from 0.9651 to 0.9755. Introducing new features to those of the grey-level co-occurrence matrix significantly increased the differentiation between the malignant and the benign breast lesions, thus resulting in a high sensitivity and improved specificity. Copyright © 2013 Elsevier B.V. All rights reserved.

  8. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.

  9. Supervised machine learning algorithms to diagnose stress for vehicle drivers based on physiological sensor signals.

    PubMed

    Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin

    2015-01-01

    Machine learning algorithms play an important role in computer science research. Recent advancement in sensor data collection in clinical sciences lead to a complex, heterogeneous data processing, and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, development of Knowledge-based systems to support clinicians in decision-making is important. However, it is necessary to perform experimental work to compare performances of different machine learning methods to help to select appropriate method for a specific characteristic of data sets. This paper compares classification performance of three popular machine learning methods i.e., case-based reasoning, neutral networks and support vector machine to diagnose stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms other two methods in terms of classification accuracy. Case-based reasoning has achieved 80% and 86% accuracy to classify stress using finger temperature and heart rate variability. On contrary, both neural network and support vector machine have achieved less than 80% accuracy by using both physiological signals.

  10. repRNA: a web server for generating various feature vectors of RNA sequences.

    PubMed

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2016-02-01

    With the rapid growth of RNA sequences generated in the postgenomic age, it is highly desired to develop a flexible method that can generate various kinds of vectors to represent these sequences by focusing on their different features. This is because nearly all the existing machine-learning methods, such as SVM (support vector machine) and KNN (k-nearest neighbor), can only handle vectors but not sequences. To meet the increasing demands and speed up the genome analyses, we have developed a new web server, called "representations of RNA sequences" (repRNA). Compared with the existing methods, repRNA is much more comprehensive, flexible and powerful, as reflected by the following facts: (1) it can generate 11 different modes of feature vectors for users to choose according to their investigation purposes; (2) it allows users to select the features from 22 built-in physicochemical properties and even those defined by users' own; (3) the resultant feature vectors and the secondary structures of the corresponding RNA sequences can be visualized. The repRNA web server is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repRNA/ .

  11. Multiscale asymmetric orthogonal wavelet kernel for linear programming support vector learning and nonlinear dynamic systems identification.

    PubMed

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2014-05-01

    Support vector regression for approximating nonlinear dynamic systems is more delicate than the approximation of indicator functions in support vector classification, particularly for systems that involve multitudes of time scales in their sampled data. The kernel used for support vector learning determines the class of functions from which a support vector machine can draw its solution, and the choice of kernel significantly influences the performance of a support vector machine. In this paper, to bridge the gap between wavelet multiresolution analysis and kernel learning, the closed-form orthogonal wavelet is exploited to construct new multiscale asymmetric orthogonal wavelet kernels for linear programming support vector learning. The closed-form multiscale orthogonal wavelet kernel provides a systematic framework to implement multiscale kernel learning via dyadic dilations and also enables us to represent complex nonlinear dynamics effectively. To demonstrate the superiority of the proposed multiscale wavelet kernel in identifying complex nonlinear dynamic systems, two case studies are presented that aim at building parallel models on benchmark datasets. The development of parallel models that address the long-term/mid-term prediction issue is more intricate and challenging than the identification of series-parallel models where only one-step ahead prediction is required. Simulation results illustrate the effectiveness of the proposed multiscale kernel learning.

  12. Arbitrary norm support vector machines.

    PubMed

    Huang, Kaizhu; Zheng, Danian; King, Irwin; Lyu, Michael R

    2009-02-01

    Support vector machines (SVM) are state-of-the-art classifiers. Typically L2-norm or L1-norm is adopted as a regularization term in SVMs, while other norm-based SVMs, for example, the L0-norm SVM or even the L(infinity)-norm SVM, are rarely seen in the literature. The major reason is that L0-norm describes a discontinuous and nonconvex term, leading to a combinatorially NP-hard optimization problem. In this letter, motivated by Bayesian learning, we propose a novel framework that can implement arbitrary norm-based SVMs in polynomial time. One significant feature of this framework is that only a sequence of sequential minimal optimization problems needs to be solved, thus making it practical in many real applications. The proposed framework is important in the sense that Bayesian priors can be efficiently plugged into most learning methods without knowing the explicit form. Hence, this builds a connection between Bayesian learning and the kernel machines. We derive the theoretical framework, demonstrate how our approach works on the L0-norm SVM as a typical example, and perform a series of experiments to validate its advantages. Experimental results on nine benchmark data sets are very encouraging. The implemented L0-norm is competitive with or even better than the standard L2-norm SVM in terms of accuracy but with a reduced number of support vectors, -9.46% of the number on average. When compared with another sparse model, the relevance vector machine, our proposed algorithm also demonstrates better sparse properties with a training speed over seven times faster.

  13. Computer-Aided Diagnosis for Breast Ultrasound Using Computerized BI-RADS Features and Machine Learning Methods.

    PubMed

    Shan, Juan; Alam, S Kaisar; Garra, Brian; Zhang, Yingtao; Ahmed, Tahira

    2016-04-01

    This work identifies effective computable features from the Breast Imaging Reporting and Data System (BI-RADS), to develop a computer-aided diagnosis (CAD) system for breast ultrasound. Computerized features corresponding to ultrasound BI-RADs categories were designed and tested using a database of 283 pathology-proven benign and malignant lesions. Features were selected based on classification performance using a "bottom-up" approach for different machine learning methods, including decision tree, artificial neural network, random forest and support vector machine. Using 10-fold cross-validation on the database of 283 cases, the highest area under the receiver operating characteristic (ROC) curve (AUC) was 0.84 from a support vector machine with 77.7% overall accuracy; the highest overall accuracy, 78.5%, was from a random forest with the AUC 0.83. Lesion margin and orientation were optimum features common to all of the different machine learning methods. These features can be used in CAD systems to help distinguish benign from worrisome lesions. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. All rights reserved.

  14. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    NASA Astrophysics Data System (ADS)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. Today, it becomes clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” is born to address the problem of automatically determining the polarity (Positive, negative, neutral,…) of textual opinions. People expressing themselves in a particular domain often use specific domain language expressions, thus, building a classifier, which performs well in different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain for sentiment classification when using machine learning techniques. In our study three popular machine learning techniques: Support Vector Machines (SVM), Naive Bayes and K nearest neighbors(KNN) were applied on datasets collected from different domains. Experimental results show that Support Vector Machines outperforms other classifiers in all domains, since it achieved at least 74.75% accuracy with a standard deviation of 4,08.

  15. Extended robust support vector machine based on financial risk minimization.

    PubMed

    Takeda, Akiko; Fujiwara, Shuhei; Kanamori, Takafumi

    2014-11-01

    Financial risk measures have been used recently in machine learning. For example, ν-support vector machine ν-SVM) minimizes the conditional value at risk (CVaR) of margin distribution. The measure is popular in finance because of the subadditivity property, but it is very sensitive to a few outliers in the tail of the distribution. We propose a new classification method, extended robust SVM (ER-SVM), which minimizes an intermediate risk measure between the CVaR and value at risk (VaR) by expecting that the resulting model becomes less sensitive than ν-SVM to outliers. We can regard ER-SVM as an extension of robust SVM, which uses a truncated hinge loss. Numerical experiments imply the ER-SVM's possibility of achieving a better prediction performance with proper parameter setting.

  16. Transportation Modes Classification Using Sensors on Smartphones.

    PubMed

    Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu

    2016-08-19

    This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.

  17. Transportation Modes Classification Using Sensors on Smartphones

    PubMed Central

    Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu

    2016-01-01

    This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dayman, Ken J; Ade, Brian J; Weber, Charles F

    High-dimensional, nonlinear function estimation using large datasets is a current area of interest in the machine learning community, and applications may be found throughout the analytical sciences, where ever-growing datasets are making more information available to the analyst. In this paper, we leverage the existing relevance vector machine, a sparse Bayesian version of the well-studied support vector machine, and expand the method to include integrated feature selection and automatic function shaping. These innovations produce an algorithm that is able to distinguish variables that are useful for making predictions of a response from variables that are unrelated or confusing. We testmore » the technology using synthetic data, conduct initial performance studies, and develop a model capable of making position-independent predictions of the coreaveraged burnup using a single specimen drawn randomly from a nuclear reactor core.« less

  19. Combining Relevance Vector Machines and exponential regression for bearing residual life estimation

    NASA Astrophysics Data System (ADS)

    Di Maio, Francesco; Tsui, Kwok Leung; Zio, Enrico

    2012-08-01

    In this paper we present a new procedure for estimating the bearing Residual Useful Life (RUL) by combining data-driven and model-based techniques. Respectively, we resort to (i) Relevance Vector Machines (RVMs) for selecting a low number of significant basis functions, called Relevant Vectors (RVs), and (ii) exponential regression to compute and continuously update residual life estimations. The combination of these techniques is developed with reference to partially degraded thrust ball bearings and tested on real world vibration-based degradation data. On the case study considered, the proposed procedure outperforms other model-based methods, with the added value of an adequate representation of the uncertainty associated to the estimates of the quantification of the credibility of the results by the Prognostic Horizon (PH) metric.

  20. Rotating electrical machines: Poynting flow

    NASA Astrophysics Data System (ADS)

    Donaghy-Spargo, C.

    2017-09-01

    This paper presents a complementary approach to the traditional Lorentz and Faraday approaches that are typically adopted in the classroom when teaching the fundamentals of electrical machines—motors and generators. The approach adopted is based upon the Poynting vector, which illustrates the ‘flow’ of electromagnetic energy. It is shown through simple vector analysis that the energy-flux density flow approach can provide insight into the operation of electrical machines and it is also shown that the results are in agreement with conventional Maxwell stress-based theory. The advantage of this approach is its complementary completion of the physical picture regarding the electromechanical energy conversion process—it is also a means of maintaining student interest in this subject and as an unconventional application of the Poynting vector during normal study of electromagnetism.

  1. LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.

    PubMed

    Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu

    2005-01-01

    Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.

  2. Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L

    Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologicmore » scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.« less

  3. Exploring the capabilities of support vector machines in detecting silent data corruptions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo

    As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. Here in this paper, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based onmore » different features. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% false positive rate for most cases. Our detectors incur low performance overhead, 5% on average, for all benchmarks studied in this work.« less

  4. Exploring the capabilities of support vector machines in detecting silent data corruptions

    DOE PAGES

    Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo; ...

    2018-02-01

    As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. Here in this paper, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based onmore » different features. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% false positive rate for most cases. Our detectors incur low performance overhead, 5% on average, for all benchmarks studied in this work.« less

  5. Clifford support vector machines for classification, regression, and recurrence.

    PubMed

    Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy

    2010-11-01

    This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.

  6. Experience with a vectorized general circulation weather model on Star-100

    NASA Technical Reports Server (NTRS)

    Soll, D. B.; Habra, N. R.; Russell, G. L.

    1977-01-01

    A version of an atmospheric general circulation model was vectorized to run on a CDC STAR 100. The numerical model was coded and run in two different vector languages, CDC and LRLTRAN. A factor of 10 speed improvement over an IBM 360/95 was realized. Efficient use of the STAR machine required some redesigning of algorithms and logic. This precludes the application of vectorizing compilers on the original scalar code to achieve the same results. Vector languages permit a more natural and efficient formulation for such numerical codes.

  7. Machine Learning for Biological Trajectory Classification Applications

    NASA Technical Reports Server (NTRS)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms axe compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.

  8. A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing

    2017-09-01

    Dynamic monitoring the behavior of a program is widely used to discriminate between benign program and malware. It is usually based on the dynamic characteristics of a program, such as API call sequence or API call frequency to judge. The key innovation of this paper is to consider the full Native API sequence and use the support vector machine to detect the shellcode. We also use the Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.

  9. Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.

    PubMed

    Chen, S; Samingan, A K; Hanzo, L

    2001-01-01

    The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.

  10. Estimation of perceptible water vapor of atmosphere using artificial neural network, support vector machine and multiple linear regression algorithm and their comparative study

    NASA Astrophysics Data System (ADS)

    Shastri, Niket; Pathak, Kamlesh

    2018-05-01

    The water vapor content in atmosphere plays very important role in climate. In this paper the application of GPS signal in meteorology is discussed, which is useful technique that is used to estimate the perceptible water vapor of atmosphere. In this paper various algorithms like artificial neural network, support vector machine and multiple linear regression are use to predict perceptible water vapor. The comparative studies in terms of root mean square error and mean absolute errors are also carried out for all the algorithms.

  11. Implementation of support vector machine for classification of speech marked hijaiyah letters based on Mel frequency cepstrum coefficient feature extraction

    NASA Astrophysics Data System (ADS)

    Adhi Pradana, Wisnu; Adiwijaya; Novia Wisesty, Untari

    2018-03-01

    Support Vector Machine or commonly called SVM is one method that can be used to process the classification of a data. SVM classifies data from 2 different classes with hyperplane. In this study, the system was built using SVM to develop Arabic Speech Recognition. In the development of the system, there are 2 kinds of speakers that have been tested that is dependent speakers and independent speakers. The results from this system is an accuracy of 85.32% for speaker dependent and 61.16% for independent speakers.

  12. An ultra low power feature extraction and classification system for wearable seizure detection.

    PubMed

    Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh

    2015-01-01

    In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198 feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among five different classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing average F-1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, power consumption, and the lowest latency when compared to the previous work.

  13. Recognition and Classification of Road Condition on the Basis of Friction Force by Using a Mobile Robot

    NASA Astrophysics Data System (ADS)

    Watanabe, Tatsuhito; Katsura, Seiichiro

    A person operating a mobile robot in a remote environment receives realistic visual feedback about the condition of the road on which the robot is moving. The categorization of the road condition is necessary to evaluate the conditions for safe and comfortable driving. For this purpose, the mobile robot should be capable of recognizing and classifying the condition of the road surfaces. This paper proposes a method for recognizing the type of road surfaces on the basis of the friction between the mobile robot and the road surfaces. This friction is estimated by a disturbance observer, and a support vector machine is used to classify the surfaces. The support vector machine identifies the type of the road surface using feature vector, which is determined using the arithmetic average and variance derived from the torque values. Further, these feature vectors are mapped onto a higher dimensional space by using a kernel function. The validity of the proposed method is confirmed by experimental results.

  14. Protein Kinase Classification with 2866 Hidden Markov Models and One Support Vector Machine

    NASA Technical Reports Server (NTRS)

    Weber, Ryan; New, Michael H.; Fonda, Mark (Technical Monitor)

    2002-01-01

    The main application considered in this paper is predicting true kinases from randomly permuted kinases that share the same length and amino acid distributions as the true kinases. Numerous methods already exist for this classification task, such as HMMs, motif-matchers, and sequence comparison algorithms. We build on some of these efforts by creating a vector from the output of thousands of structurally based HMMs, created offline with Pfam-A seed alignments using SAM-T99, which then must be combined into an overall classification for the protein. Then we use a Support Vector Machine for classifying this large ensemble Pfam-Vector, with a polynomial and chisquared kernel. In particular, the chi-squared kernel SVM performs better than the HMMs and better than the BLAST pairwise comparisons, when predicting true from false kinases in some respects, but no one algorithm is best for all purposes or in all instances so we consider the particular strengths and weaknesses of each.

  15. Feature selection using a one dimensional naïve Bayes’ classifier increases the accuracy of support vector machine classification of CDR3 repertoires

    PubMed Central

    Cinelli, Mattia; Sun, , Yuxin; Best, Katharine; Heather, James M.; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny

    2017-01-01

    Abstract Motivation: Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund’s adjuvant (CFA) or CFA alone. Results: The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases. Summary: The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund’s Adjuvant. Availability and implementation: The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term¼SRP075893. The Decombinator package is available at github.com/innate2adaptive/Decombinator. The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html. Contact: b.chain@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28073756

  16. Methods, systems and apparatus for synchronous current regulation of a five-phase machine

    DOEpatents

    Gallegos-Lopez, Gabriel; Perisic, Milun

    2012-10-09

    Methods, systems and apparatus are provided for controlling operation of and regulating current provided to a five-phase machine when one or more phases has experienced a fault or has failed. In one implementation, the disclosed embodiments can be used to synchronously regulate current in a vector controlled motor drive system that includes a five-phase AC machine, a five-phase inverter module coupled to the five-phase AC machine, and a synchronous current regulator.

  17. Application of Classification Models to Pharyngeal High-Resolution Manometry

    ERIC Educational Resources Information Center

    Mielens, Jason D.; Hoffman, Matthew R.; Ciucci, Michelle R.; McCulloch, Timothy M.; Jiang, Jack J.

    2012-01-01

    Purpose: The authors present 3 methods of performing pattern recognition on spatiotemporal plots produced by pharyngeal high-resolution manometry (HRM). Method: Classification models, including the artificial neural networks (ANNs) multilayer perceptron (MLP) and learning vector quantization (LVQ), as well as support vector machines (SVM), were…

  18. A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model.

    PubMed

    Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F

    2014-06-01

    To ensure appropriate soundscape management in urban environments, the urban-planning authorities need a range of tools that enable such a task to be performed. An essential step during the management of urban areas from a sound standpoint should be the evaluation of the soundscape in such an area. In this sense, it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step to evaluate it, providing a basis for designing or adapting it to match people's expectations as well. In this sense, this work proposes a model for automatic classification of urban soundscapes. This model is intended for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. Thus, this classification model is proposed to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing model classification. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). © 2013 Elsevier B.V. All rights reserved.

  19. Measurement of aspheric mirror by nanoprofiler using normal vector tracing

    NASA Astrophysics Data System (ADS)

    Kitayama, Takao; Shiraji, Hiroki; Yamamura, Kazuya; Endo, Katsuyoshi

    2016-09-01

    Aspheric or free-form optics with high accuracy are necessary in many fields such as third-generation synchrotron radiation and extreme-ultraviolet lithography. Therefore the demand of measurement method for aspherical or free-form surface with nanometer accuracy increases. Purpose of our study is to develop a non-contact measurement technology for aspheric or free-form surfaces directly with high repeatability. To achieve this purpose we have developed threedimensional Nanoprofiler which detects normal vectors of sample surface. The measurement principle is based on the straightness of laser light and the accurate motion of rotational goniometers. This machine consists of four rotational stages, one translational stage and optical head which has the quadrant photodiode (QPD) and laser source. In this measurement method, we conform the incident light beam to reflect the beam by controlling five stages and determine the normal vectors and the coordinates of the surface from signal of goniometers, translational stage and QPD. We can obtain three-dimensional figure from the normal vectors and their coordinates by surface reconstruction algorithm. To evaluate performance of this machine we measure a concave aspheric mirror with diameter of 150 mm. As a result we achieve to measure large area of 150mm diameter. And we observe influence of systematic errors which the machine has. Then we simulated the influence and subtracted it from measurement result.

  20. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    NASA Astrophysics Data System (ADS)

    Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.

    2017-02-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  1. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nishizuka, N.; Kubo, Y.; Den, M.

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutralmore » lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.« less

  2. Combining information from 3 anatomic regions in the diagnosis of glaucoma with time-domain optical coherence tomography.

    PubMed

    Wang, Mingwu; Lu, Ake Tzu-Hui; Varma, Rohit; Schuman, Joel S; Greenfield, David S; Huang, David

    2014-03-01

    To improve the diagnosis of glaucoma by combining time-domain optical coherence tomography (TD-OCT) measurements of the optic disc, circumpapillary retinal nerve fiber layer (RNFL), and macular retinal thickness. Ninety-six age-matched normal and 96 perimetric glaucoma participants were included in this observational, cross-sectional study. Or-logic, support vector machine, relevance vector machine, and linear discrimination function were used to analyze the performances of combined TD-OCT diagnostic variables. The area under the receiver-operating curve (AROC) was used to evaluate the diagnostic accuracy and to compare the diagnostic performance of single and combined anatomic variables. The best RNFL thickness variables were the inferior (AROC=0.900), overall (AROC=0.892), and superior quadrants (AROC=0.850). The best optic disc variables were horizontal integrated rim width (AROC=0.909), vertical integrated rim area (AROC=0.908), and cup/disc vertical ratio (AROC=0.890). All macular retinal thickness variables had AROCs of 0.829 or less. Combining the top 3 RNFL and optic disc variables in optimizing glaucoma diagnosis, support vector machine had the highest AROC, 0.954, followed by or-logic (AROC=0.946), linear discrimination function (AROC=0.946), and relevance vector machine (AROC=0.943). All combination diagnostic variables had significantly larger AROCs than any single diagnostic variable. There are no significant differences among the combination diagnostic indices. With TD-OCT, RNFL and optic disc variables had better diagnostic accuracy than macular retinal variables. Combining top RNFL and optic disc variables significantly improved diagnostic performance. Clinically, or-logic classification was the most practical analytical tool with sufficient accuracy to diagnose early glaucoma.

  3. Applying a machine learning model using a locally preserving projection based feature regeneration algorithm to predict breast cancer risk

    NASA Astrophysics Data System (ADS)

    Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin

    2018-03-01

    Both conventional and deep machine learning has been used to develop decision-support tools applied in medical imaging informatics. In order to take advantages of both conventional and deep learning approach, this study aims to investigate feasibility of applying a locally preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, initially computed 44 image features related to the bilateral mammographic tissue density asymmetry were extracted. Then, an LLP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighborhood (KNN) algorithm based machine learning classifier using the LPP-generated new feature vectors was developed to predict breast cancer risk. A testing dataset involving negative mammograms acquired from 500 women was used. Among them, 250 were positive and 250 remained negative in the next subsequent mammography screening. Applying to this dataset, LLP-generated feature vector reduced the number of features from 44 to 4. Using a leave-onecase-out validation method, area under ROC curve produced by the KNN classifier significantly increased from 0.62 to 0.68 (p < 0.05) and odds ratio was 4.60 with a 95% confidence interval of [3.16, 6.70]. Study demonstrated that this new LPP-based feature regeneration approach enabled to produce an optimal feature vector and yield improved performance in assisting to predict risk of women having breast cancer detected in the next subsequent mammography screening.

  4. A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.

    PubMed

    Yan, Kang K; Zhao, Hongyu; Pang, Herbert

    2017-12-06

    High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in quest of improving human health. Well-studied algorithms mostly deal with single data source, and cannot fully utilize the potential of these multi-omics data sources. In order to provide a holistic understanding of human health and diseases, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far, however, a comprehensive comparison of data integration algorithms for classification of binary traits is currently lacking. In this paper, we focus on two common classes of integration algorithms, graph-based that depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based that can generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measurements of classification accuracy and computation time. Seven different integration algorithms, including graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM) and Ada-boost relevance vector machine are compared and evaluated with hypertension and two cancer data sets in our study. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. The performance of graph-based algorithms has the advantage of being faster computationally. The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

  5. A multiple kernel support vector machine scheme for feature selection and rule extraction from gene expression data of cancer tissue.

    PubMed

    Chen, Zhenyu; Li, Jianping; Wei, Liwei

    2007-10-01

    Recently, gene expression profiling using microarray techniques has been shown as a promising tool to improve the diagnosis and treatment of cancer. Gene expression data contain high level of noise and the overwhelming number of genes relative to the number of available samples. It brings out a great challenge for machine learning and statistic techniques. Support vector machine (SVM) has been successfully used to classify gene expression data of cancer tissue. In the medical field, it is crucial to deliver the user a transparent decision process. How to explain the computed solutions and present the extracted knowledge becomes a main obstacle for SVM. A multiple kernel support vector machine (MK-SVM) scheme, consisting of feature selection, rule extraction and prediction modeling is proposed to improve the explanation capacity of SVM. In this scheme, we show that the feature selection problem can be translated into an ordinary multiple parameters learning problem. And a shrinkage approach: 1-norm based linear programming is proposed to obtain the sparse parameters and the corresponding selected features. We propose a novel rule extraction approach using the information provided by the separating hyperplane and support vectors to improve the generalization capacity and comprehensibility of rules and reduce the computational complexity. Two public gene expression datasets: leukemia dataset and colon tumor dataset are used to demonstrate the performance of this approach. Using the small number of selected genes, MK-SVM achieves encouraging classification accuracy: more than 90% for both two datasets. Moreover, very simple rules with linguist labels are extracted. The rule sets have high diagnostic power because of their good classification performance.

  6. Prediction task guided representation learning of medical codes in EHR.

    PubMed

    Cui, Liwen; Xie, Xiaolei; Shen, Zuojun

    2018-06-18

    There have been rapidly growing applications using machine learning models for predictive analytics in Electronic Health Records (EHR) to improve the quality of hospital services and the efficiency of healthcare resource utilization. A fundamental and crucial step in developing such models is to convert medical codes in EHR to feature vectors. These medical codes are used to represent diagnoses or procedures. Their vector representations have a tremendous impact on the performance of machine learning models. Recently, some researchers have utilized representation learning methods from Natural Language Processing (NLP) to learn vector representations of medical codes. However, most previous approaches are unsupervised, i.e. the generation of medical code vectors is independent from prediction tasks. Thus, the obtained feature vectors may be inappropriate for a specific prediction task. Moreover, unsupervised methods often require a lot of samples to obtain reliable results, but most practical problems have very limited patient samples. In this paper, we develop a new method called Prediction Task Guided Health Record Aggregation (PTGHRA), which aggregates health records guided by prediction tasks, to construct training corpus for various representation learning models. Compared with unsupervised approaches, representation learning models integrated with PTGHRA yield a significant improvement in predictive capability of generated medical code vectors, especially for limited training samples. Copyright © 2018. Published by Elsevier Inc.

  7. Machine learning to predict the occurrence of bisphosphonate-related osteonecrosis of the jaw associated with dental extraction: A preliminary report.

    PubMed

    Kim, Dong Wook; Kim, Hwiyoung; Nam, Woong; Kim, Hyung Jun; Cha, In-Ho

    2018-04-23

    The aim of this study was to build and validate five types of machine learning models that can predict the occurrence of BRONJ associated with dental extraction in patients taking bisphosphonates for the management of osteoporosis. A retrospective review of the medical records was conducted to obtain cases and controls for the study. Total 125 patients consisting of 41 cases and 84 controls were selected for the study. Five machine learning prediction algorithms including multivariable logistic regression model, decision tree, support vector machine, artificial neural network, and random forest were implemented. The outputs of these models were compared with each other and also with conventional methods, such as serum CTX level. Area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the results. The performance of machine learning models was significantly superior to conventional statistical methods and single predictors. The random forest model yielded the best performance (AUC = 0.973), followed by artificial neural network (AUC = 0.915), support vector machine (AUC = 0.882), logistic regression (AUC = 0.844), decision tree (AUC = 0.821), drug holiday alone (AUC = 0.810), and CTX level alone (AUC = 0.630). Machine learning methods showed superior performance in predicting BRONJ associated with dental extraction compared to conventional statistical methods using drug holiday and serum CTX level. Machine learning can thus be applied in a wide range of clinical studies. Copyright © 2017. Published by Elsevier Inc.

  8. Evaluation of PLS, LS-SVM, and LWR for quantitative spectroscopic analysis of soils

    USDA-ARS?s Scientific Manuscript database

    Soil testing requires the analysis of large numbers of samples in laboratory that are often time consuming and expensive. Mid-infrared spectroscopy (mid-IR) and near-infrared spectroscopy (NIRS) are fast, non-destructive, and inexpensive analytical methods that have been used for soil analysis, in l...

  9. Automated Scoring of Chinese Engineering Students' English Essays

    ERIC Educational Resources Information Center

    Liu, Ming; Wang, Yuqi; Xu, Weiwei; Liu, Li

    2017-01-01

    The number of Chinese engineering students has increased greatly since 1999. Rating the quality of these students' English essays has thus become time-consuming and challenging. This paper presents a novel automatic essay scoring algorithm called PSOSVR, based on a machine learning algorithm, Support Vector Machine for Regression (SVR), and a…

  10. Monitoring Hitting Load in Tennis Using Inertial Sensors and Machine Learning.

    PubMed

    Whiteside, David; Cant, Olivia; Connolly, Molly; Reid, Machar

    2017-10-01

    Quantifying external workload is fundamental to training prescription in sport. In tennis, global positioning data are imprecise and fail to capture hitting loads. The current gold standard (manual notation) is time intensive and often not possible given players' heavy travel schedules. To develop an automated stroke-classification system to help quantify hitting load in tennis. Nineteen athletes wore an inertial measurement unit (IMU) on their wrist during 66 video-recorded training sessions. Video footage was manually notated such that known shot type (serve, rally forehand, slice forehand, forehand volley, rally backhand, slice backhand, backhand volley, smash, or false positive) was associated with the corresponding IMU data for 28,582 shots. Six types of machine-learning models were then constructed to classify true shot type from the IMU signals. Across 10-fold cross-validation, a cubic-kernel support vector machine classified binned shots (overhead, forehand, or backhand) with an accuracy of 97.4%. A second cubic-kernel support vector machine achieved 93.2% accuracy when classifying all 9 shot types. With a view to monitoring external load, the combination of miniature inertial sensors and machine learning offers a practical and automated method of quantifying shot counts and discriminating shot types in elite tennis players.

  11. Implementation of a smartphone wireless accelerometer platform for establishing deep brain stimulation treatment efficacy of essential tremor with machine learning.

    PubMed

    LeMoyne, Robert; Tomycz, Nestor; Mastroianni, Timothy; McCandless, Cyrus; Cozza, Michael; Peduto, David

    2015-01-01

    Essential tremor (ET) is a highly prevalent movement disorder. Patients with ET exhibit a complex progressive and disabling tremor, and medical management often fails. Deep brain stimulation (DBS) has been successfully applied to this disorder, however there has been no quantifiable way to measure tremor severity or treatment efficacy in this patient population. The quantified amelioration of kinetic tremor via DBS is herein demonstrated through the application of a smartphone (iPhone) as a wireless accelerometer platform. The recorded acceleration signal can be obtained at a setting of the subject's convenience and conveyed by wireless transmission through the Internet for post-processing anywhere in the world. Further post-processing of the acceleration signal can be classified through a machine learning application, such as the support vector machine. Preliminary application of deep brain stimulation with a smartphone for acquisition of a feature set and machine learning for classification has been successfully applied. The support vector machine achieved 100% classification between deep brain stimulation in `on' and `off' mode based on the recording of an accelerometer signal through a smartphone as a wireless accelerometer platform.

  12. Machine parts recognition using a trinary associative memory

    NASA Technical Reports Server (NTRS)

    Awwal, Abdul Ahad S.; Karim, Mohammad A.; Liu, Hua-Kuang

    1989-01-01

    The convergence mechanism of vectors in Hopfield's neural network in relation to recognition of partially known patterns is studied in terms of both inner products and Hamming distance. It has been shown that Hamming distance should not always be used in determining the convergence of vectors. Instead, inner product weighting coefficients play a more dominant role in certain data representations for determining the convergence mechanism. A trinary neuron representation for associative memory is found to be more effective for associative recall. Applications of the trinary associative memory to reconstruct machine part images that are partially missing are demonstrated by means of computer simulation as examples of the usefulness of this approach.

  13. Machine Parts Recognition Using A Trinary Associative Memory

    NASA Astrophysics Data System (ADS)

    Awwal, Abdul A. S.; Karim, Mohammad A.; Liu, Hua-Kuang

    1989-05-01

    The convergence mechanism of vectors in Hopfield's neural network in relation to recognition of partially known patterns is studied in terms of both inner products and Hamming distance. It has been shown that Hamming distance should not always be used in determining the convergence of vectors. Instead, inner product weighting coefficients play a more dominant role in certain data representations for determining the convergence mechanism. A trinary neuron representation for associative memory is found to be more effective for associative recall. Applications of the trinary associative memory to reconstruct machine part images that are partially missing are demonstrated by means of computer simulation as examples of the usefulness of this approach.

  14. Research on Classification of Chinese Text Data Based on SVM

    NASA Astrophysics Data System (ADS)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data Mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. Support Vector Machine’ (SVM) classification method is a good classifier in machine learning research. This paper will study the classification effect based on the SVM method in the Chinese text data, and use the support vector machine method in the chinese text to achieve the classify chinese text, and to able to combination of academia and practical application.

  15. Objective research of auscultation signals in Traditional Chinese Medicine based on wavelet packet energy and support vector machine.

    PubMed

    Yan, Jianjun; Shen, Xiaojing; Wang, Yiqin; Li, Fufeng; Xia, Chunming; Guo, Rui; Chen, Chunfeng; Shen, Qingwei

    2010-01-01

    This study aims at utilising Wavelet Packet Transform (WPT) and Support Vector Machine (SVM) algorithm to make objective analysis and quantitative research for the auscultation in Traditional Chinese Medicine (TCM) diagnosis. First, Wavelet Packet Decomposition (WPD) at level 6 was employed to split more elaborate frequency bands of the auscultation signals. Then statistic analysis was made based on the extracted Wavelet Packet Energy (WPE) features from WPD coefficients. Furthermore, the pattern recognition was used to distinguish mixed subjects' statistical feature values of sample groups through SVM. Finally, the experimental results showed that the classification accuracies were at a high level.

  16. Electrocardiographic signals and swarm-based support vector machine for hypoglycemia detection.

    PubMed

    Nuryani, Nuryani; Ling, Steve S H; Nguyen, H T

    2012-04-01

    Cardiac arrhythmia relating to hypoglycemia is suggested as a cause of death in diabetic patients. This article introduces electrocardiographic (ECG) parameters for artificially induced hypoglycemia detection. In addition, a hybrid technique of swarm-based support vector machine (SVM) is introduced for hypoglycemia detection using the ECG parameters as inputs. In this technique, a particle swarm optimization (PSO) is proposed to optimize the SVM to detect hypoglycemia. In an experiment using medical data of patients with Type 1 diabetes, the introduced ECG parameters show significant contributions to the performance of the hypoglycemia detection and the proposed detection technique performs well in terms of sensitivity and specificity.

  17. Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)

    NASA Astrophysics Data System (ADS)

    Zhang, Qigui; Deng, Kai

    2016-12-01

    As portable digital camera and a camera phone comes more and more popular, and equally pressing is meeting the requirements of people to shoot at any time, to identify and storage handwritten character. In this paper, genetic algorithm(GA) and support vector machine(SVM)are used for identification of handwriting. Compare with parameters-optimized method, this technique overcomes two defects: first, it's easy to trap in the local optimum; second, finding the best parameters in the larger range will affects the efficiency of classification and prediction. As the experimental results suggest, GA-SVM has a higher recognition rate.

  18. Optimization of Support Vector Machine (SVM) for Object Classification

    NASA Technical Reports Server (NTRS)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  19. Control-group feature normalization for multivariate pattern analysis of structural MRI data using the support vector machine.

    PubMed

    Linn, Kristin A; Gaonkar, Bilwaj; Satterthwaite, Theodore D; Doshi, Jimit; Davatzikos, Christos; Shinohara, Russell T

    2016-05-15

    Normalization of feature vector values is a common practice in machine learning. Generally, each feature value is standardized to the unit hypercube or by normalizing to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or by other methods are sensitive to the specific normalization used on the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it utilizes the total variability of these features. This total variation is inevitably dependent on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high dimensional space. In this work we propose an alternate approach that uses an estimate of the control-group standard deviation to normalize features before training. We study our proposed approach in the context of group classification using structural MRI data. We show that control-based normalization leads to better reproducibility of estimated multivariate disease patterns and improves the classifier performance in many cases. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Breast cancer risk assessment and diagnosis model using fuzzy support vector machine based expert system

    NASA Astrophysics Data System (ADS)

    Dheeba, J.; Jaya, T.; Singh, N. Albert

    2017-09-01

    Classification of cancerous masses is a challenging task in many computerised detection systems. Cancerous masses are difficult to detect because these masses are obscured and subtle in mammograms. This paper investigates an intelligent classifier - fuzzy support vector machine (FSVM) applied to classify the tissues containing masses on mammograms for breast cancer diagnosis. The algorithm utilises texture features extracted using Laws texture energy measures and a FSVM to classify the suspicious masses. The new FSVM treats every feature as both normal and abnormal samples, but with different membership. By this way, the new FSVM have more generalisation ability to classify the masses in mammograms. The classifier analysed 219 clinical mammograms collected from breast cancer screening laboratory. The tests made on the real clinical mammograms shows that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and Laws texture features, the area under the Receiver operating characteristic curve reached .95, which corresponds to a sensitivity of 93.27% with a specificity of 87.17%. The results suggest that detecting masses using FSVM contribute to computer-aided detection of breast cancer and as a decision support system for radiologists.

  1. [Machine Learning-based Prediction of Seizure-inducing Action as an Adverse Drug Effect].

    PubMed

    Gao, Mengxuan; Sato, Motoshige; Ikegaya, Yuji

    2018-01-01

     During the preclinical research period of drug development, animal testing is widely used to help screen out a drug's dangerous side effects. However, it remains difficult to predict side effects within the central nervous system. Here, we introduce a machine learning-based in vitro system designed to detect seizure-inducing side effects before clinical trial. We recorded local field potentials from the CA1 alveus in acute mouse neocortico-hippocampal slices that were bath-perfused with each of 14 different drugs, and at 5 different concentrations of each drug. For each of these experimental conditions, we collected seizure-like neuronal activity and merged their waveforms as one graphic image, which was further converted into a feature vector using Caffe, an open framework for deep learning. In the space of the first two principal components, the support vector machine completely separated the vectors (i.e., doses of individual drugs) that induced seizure-like events, and identified diphenhydramine, enoxacin, strychnine and theophylline as "seizure-inducing" drugs, which have indeed been reported to induce seizures in clinical situations. Thus, this artificial intelligence-based classification may provide a new platform to pre-clinically detect seizure-inducing side effects of drugs.

  2. Machine learning-based prediction of adverse drug effects: An example of seizure-inducing compounds.

    PubMed

    Gao, Mengxuan; Igata, Hideyoshi; Takeuchi, Aoi; Sato, Kaoru; Ikegaya, Yuji

    2017-02-01

    Various biological factors have been implicated in convulsive seizures, involving side effects of drugs. For the preclinical safety assessment of drug development, it is difficult to predict seizure-inducing side effects. Here, we introduced a machine learning-based in vitro system designed to detect seizure-inducing side effects. We recorded local field potentials from the CA1 alveus in acute mouse neocortico-hippocampal slices, while 14 drugs were bath-perfused at 5 different concentrations each. For each experimental condition, we collected seizure-like neuronal activity and merged their waveforms as one graphic image, which was further converted into a feature vector using Caffe, an open framework for deep learning. In the space of the first two principal components, the support vector machine completely separated the vectors (i.e., doses of individual drugs) that induced seizure-like events and identified diphenhydramine, enoxacin, strychnine and theophylline as "seizure-inducing" drugs, which indeed were reported to induce seizures in clinical situations. Thus, this artificial intelligence-based classification may provide a new platform to detect the seizure-inducing side effects of preclinical drugs. Copyright © 2017 The Authors. Production and hosting by Elsevier B.V. All rights reserved.

  3. Prediction of Drug-Plasma Protein Binding Using Artificial Intelligence Based Algorithms.

    PubMed

    Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar

    2018-01-01

    Plasma protein binding (PPB) has vital importance in the characterization of drug distribution in the systemic circulation. Unfavorable PPB can pose a negative effect on clinical development of promising drug candidates. The drug distribution properties should be considered at the initial phases of the drug design and development. Therefore, PPB prediction models are receiving an increased attention. In the current study, we present a systematic approach using Support vector machine, Artificial neural network, k- nearest neighbor, Probabilistic neural network, Partial least square and Linear discriminant analysis to relate various in vitro and in silico molecular descriptors to a diverse dataset of 736 drugs/drug-like compounds. The overall accuracy of Support vector machine with Radial basis function kernel came out to be comparatively better than the rest of the applied algorithms. The training set accuracy, validation set accuracy, precision, sensitivity, specificity and F1 score for the Suprort vector machine was found to be 89.73%, 89.97%, 92.56%, 87.26%, 91.97% and 0.898, respectively. This model can potentially be useful in screening of relevant drug candidates at the preliminary stages of drug design and development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  4. SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

    PubMed

    Pirooznia, Mehdi; Deng, Youping

    2006-12-12

    Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.

  5. Rainfall-induced Landslide Susceptibility assessment at the Longnan county

    NASA Astrophysics Data System (ADS)

    Hong, Haoyuan; Zhang, Ying

    2017-04-01

    Landslides are a serious disaster in Longnan county, China. Therefore landslide susceptibility assessment is useful tool for government or decision making. The main objective of this study is to investigate and compare the frequency ratio, support vector machines, and logistic regression. The Longnan county (Jiangxi province, China) was selected as the case study. First, the landslide inventory map with 354 landslide locations was constructed. Then landslide locations were then randomly divided into a ratio of 70/30 for the training and validating the models. Second, fourteen landslide conditioning factors were prepared such as slope, aspect, altitude, topographic wetness index (TWI), stream power index (SPI), sediment transport index (STI), plan curvature, lithology, distance to faults, distance to rivers, distance to roads, land use, normalized difference vegetation index (NDVI), and rainfall. Using the frequency ratio, support vector machines, and logistic regression, a total of three landslide susceptibility models were constructed. Finally, the overall performance of the resulting models was assessed and compared using the Receiver operating characteristic (ROC) curve technique. The result showed that the support vector machines model is the best model in the study area. The success rate is 88.39 %; and prediction rate is 84.06 %.

  6. Human action recognition with group lasso regularized-support vector machine

    NASA Astrophysics Data System (ADS)

    Luo, Huiwu; Lu, Huanzhang; Wu, Yabei; Zhao, Fei

    2016-05-01

    The bag-of-visual-words (BOVW) and Fisher kernel are two popular models in human action recognition, and support vector machine (SVM) is the most commonly used classifier for the two models. We show two kinds of group structures in the feature representation constructed by BOVW and Fisher kernel, respectively, since the structural information of feature representation can be seen as a prior for the classifier and can improve the performance of the classifier, which has been verified in several areas. However, the standard SVM employs L2-norm regularization in its learning procedure, which penalizes each variable individually and cannot express the structural information of feature representation. We replace the L2-norm regularization with group lasso regularization in standard SVM, and a group lasso regularized-support vector machine (GLRSVM) is proposed. Then, we embed the group structural information of feature representation into GLRSVM. Finally, we introduce an algorithm to solve the optimization problem of GLRSVM by alternating directions method of multipliers. The experiments evaluated on KTH, YouTube, and Hollywood2 datasets show that our method achieves promising results and improves the state-of-the-art methods on KTH and YouTube datasets.

  7. Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image

    PubMed Central

    Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei

    2013-01-01

    Currently, remote sensing technologies were widely employed in the dynamic monitoring of the land. This paper presented an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM) by basing on ETM+ remote sensing image. This algorithm is applied to extract various types of lands of the city Da’an in northern China. Two multi-category strategies, namely “one-against-one” and “one-against-rest” for this algorithm were described in detail and then compared. A fuzzy membership function was presented to reduce the effects of noises or outliers on the data samples. The approaches of feature extraction, feature selection, and several key parameter settings were also given. Numerous experiments were carried out to evaluate its performances including various accuracies (overall accuracies and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to the other three classifiers including the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM) under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016

  8. Relevance Vector Machine Learning for Neonate Pain Intensity Assessment Using Digital Imaging

    PubMed Central

    Gholami, Behnood; Tannenbaum, Allen R.

    2011-01-01

    Pain assessment in patients who are unable to verbally communicate is a challenging problem. The fundamental limitations in pain assessment in neonates stem from subjective assessment criteria, rather than quantifiable and measurable data. This often results in poor quality and inconsistent treatment of patient pain management. Recent advancements in pattern recognition techniques using relevance vector machine (RVM) learning techniques can assist medical staff in assessing pain by constantly monitoring the patient and providing the clinician with quantifiable data for pain management. The RVM classification technique is a Bayesian extension of the support vector machine (SVM) algorithm, which achieves comparable performance to SVM while providing posterior probabilities for class memberships and a sparser model. If classes represent “pure” facial expressions (i.e., extreme expressions that an observer can identify with a high degree of confidence), then the posterior probability of the membership of some intermediate facial expression to a class can provide an estimate of the intensity of such an expression. In this paper, we use the RVM classification technique to distinguish pain from nonpain in neonates as well as assess their pain intensity levels. We also correlate our results with the pain intensity assessed by expert and nonexpert human examiners. PMID:20172803

  9. A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method.

    PubMed

    Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin

    2016-03-04

    Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of GNSS any anti-interference measures, interference monitoring for GNSS is extremely essential and necessary. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified by using GPS L1 C/A code signal and being compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on millisecond (ms) level and the monitoring time is on microsecond (μs) level, which make the proposed approach usable in practical interference monitoring applications.

  10. Statistical downscaling of GCM simulations to streamflow using relevance vector machine

    NASA Astrophysics Data System (ADS)

    Ghosh, Subimal; Mujumdar, P. P.

    2008-01-01

    General circulation models (GCMs), the climate models often used in assessing the impact of climate change, operate on a coarse scale and thus the simulation results obtained from GCMs are not particularly useful in a comparatively smaller river basin scale hydrology. The article presents a methodology of statistical downscaling based on sparse Bayesian learning and Relevance Vector Machine (RVM) to model streamflow at river basin scale for monsoon period (June, July, August, September) using GCM simulated climatic variables. NCEP/NCAR reanalysis data have been used for training the model to establish a statistical relationship between streamflow and climatic variables. The relationship thus obtained is used to project the future streamflow from GCM simulations. The statistical methodology involves principal component analysis, fuzzy clustering and RVM. Different kernel functions are used for comparison purpose. The model is applied to Mahanadi river basin in India. The results obtained using RVM are compared with those of state-of-the-art Support Vector Machine (SVM) to present the advantages of RVMs over SVMs. A decreasing trend is observed for monsoon streamflow of Mahanadi due to high surface warming in future, with the CCSR/NIES GCM and B2 scenario.

  11. A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method

    PubMed Central

    Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin

    2016-01-01

    Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of GNSS any anti-interference measures, interference monitoring for GNSS is extremely essential and necessary. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified by using GPS L1 C/A code signal and being compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on millisecond (ms) level and the monitoring time is on microsecond (μs) level, which make the proposed approach usable in practical interference monitoring applications. PMID:26959020

  12. Method and apparatus for improving the quality and efficiency of ultrashort-pulse laser machining

    DOEpatents

    Stuart, Brent C.; Nguyen, Hoang T.; Perry, Michael D.

    2001-01-01

    A method and apparatus for improving the quality and efficiency of machining of materials with laser pulse durations shorter than 100 picoseconds by orienting and maintaining the polarization of the laser light such that the electric field vector is perpendicular relative to the edges of the material being processed. Its use is any machining operation requiring remote delivery and/or high precision with minimal collateral dames.

  13. Automatic recognition of vector and parallel operations in a higher level language

    NASA Technical Reports Server (NTRS)

    Schneck, P. B.

    1971-01-01

    A compiler for recognizing statements of a FORTRAN program which are suited for fast execution on a parallel or pipeline machine such as Illiac-4, Star or ASC is described. The technique employs interval analysis to provide flow information to the vector/parallel recognizer. Where profitable the compiler changes scalar variables to subscripted variables. The output of the compiler is an extension to FORTRAN which shows parallel and vector operations explicitly.

  14. Studies of machinable ceramics for dental applications. 1. Color analysis.

    PubMed

    Taira, M; Wakasa, K; Yamaki, M; Tanaka, N; Shintani, H

    1989-12-01

    Machinable ceramics that can be cut and even lathed have recently been developed in industry. As a first step in evaluating the feasibility of such ceramics in dentistry, eight machinable ceramics were examined for color using the Vita shade guide and a chroma-meter reflectance instrument. We discovered that the studied machinable ceramics varied significantly from the Vita shade guide by the color difference vector, delta E. These machinable ceramics appeared very white and strongly opaque due to their high brightness (L*) values. For intra-oral applications, we expect that L* values of machinable ceramics will be reduced by modification of their microstructures, including their matrix and dispersed phases, while their excellent machinability due to the cleavage of dispersed crystals should be retained.

  15. Multidirectional Scanning Model, MUSCLE, to Vectorize Raster Images with Straight Lines

    PubMed Central

    Karas, Ismail Rakip; Bayram, Bulent; Batuk, Fatmagul; Akay, Abdullah Emin; Baz, Ibrahim

    2008-01-01

    This paper presents a new model, MUSCLE (Multidirectional Scanning for Line Extraction), for automatic vectorization of raster images with straight lines. The algorithm of the model implements the line thinning and the simple neighborhood methods to perform vectorization. The model allows users to define specified criteria which are crucial for acquiring the vectorization process. In this model, various raster images can be vectorized such as township plans, maps, architectural drawings, and machine plans. The algorithm of the model was developed by implementing an appropriate computer programming and tested on a basic application. Results, verified by using two well known vectorization programs (WinTopo and Scan2CAD), indicated that the model can successfully vectorize the specified raster data quickly and accurately. PMID:27879843

  16. Multidirectional Scanning Model, MUSCLE, to Vectorize Raster Images with Straight Lines.

    PubMed

    Karas, Ismail Rakip; Bayram, Bulent; Batuk, Fatmagul; Akay, Abdullah Emin; Baz, Ibrahim

    2008-04-15

    This paper presents a new model, MUSCLE (Multidirectional Scanning for Line Extraction), for automatic vectorization of raster images with straight lines. The algorithm of the model implements the line thinning and the simple neighborhood methods to perform vectorization. The model allows users to define specified criteria which are crucial for acquiring the vectorization process. In this model, various raster images can be vectorized such as township plans, maps, architectural drawings, and machine plans. The algorithm of the model was developed by implementing an appropriate computer programming and tested on a basic application. Results, verified by using two well known vectorization programs (WinTopo and Scan2CAD), indicated that the model can successfully vectorize the specified raster data quickly and accurately.

  17. Deep learning of support vector machines with class probability output networks.

    PubMed

    Kim, Sangwook; Yu, Zhibin; Kil, Rhee Man; Lee, Minho

    2015-04-01

    Deep learning methods endeavor to learn features automatically at multiple levels and allow systems to learn complex functions mapping from the input space to the output space for the given data. The ability to learn powerful features automatically is increasingly important as the volume of data and range of applications of machine learning methods continues to grow. This paper proposes a new deep architecture that uses support vector machines (SVMs) with class probability output networks (CPONs) to provide better generalization power for pattern classification problems. As a result, deep features are extracted without additional feature engineering steps, using multiple layers of the SVM classifiers with CPONs. The proposed structure closely approaches the ideal Bayes classifier as the number of layers increases. Using a simulation of classification problems, the effectiveness of the proposed method is demonstrated. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. New fuzzy support vector machine for the class imbalance problem in medical datasets classification.

    PubMed

    Gu, Xiaoqing; Ni, Tongguang; Wang, Hongyuan

    2014-01-01

    In medical datasets classification, support vector machine (SVM) is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show the outperformed or comparable effectiveness of FSVM-CIP.

  19. Support vector machines-based fault diagnosis for turbo-pump rotor

    NASA Astrophysics Data System (ADS)

    Yuan, Sheng-Fa; Chu, Fu-Lei

    2006-05-01

    Most artificial intelligence methods used in fault diagnosis are based on empirical risk minimisation principle and have poor generalisation when fault samples are few. Support vector machines (SVM) is a new general machine-learning tool based on structural risk minimisation principle that exhibits good generalisation even when fault samples are few. Fault diagnosis based on SVM is discussed. Since basic SVM is originally designed for two-class classification, while most of fault diagnosis problems are multi-class cases, a new multi-class classification of SVM named 'one to others' algorithm is presented to solve the multi-class recognition problems. It is a binary tree classifier composed of several two-class classifiers organised by fault priority, which is simple, and has little repeated training amount, and the rate of training and recognition is expedited. The effectiveness of the method is verified by the application to the fault diagnosis for turbo pump rotor.

  20. Predicting Flavonoid UGT Regioselectivity

    PubMed Central

    Jackson, Rhydon; Knisley, Debra; McIntosh, Cecilia; Pfeiffer, Phillip

    2011-01-01

    Machine learning was applied to a challenging and biologically significant protein classification problem: the prediction of avonoid UGT acceptor regioselectivity from primary sequence. Novel indices characterizing graphical models of residues were proposed and found to be widely distributed among existing amino acid indices and to cluster residues appropriately. UGT subsequences biochemically linked to regioselectivity were modeled as sets of index sequences. Several learning techniques incorporating these UGT models were compared with classifications based on standard sequence alignment scores. These techniques included an application of time series distance functions to protein classification. Time series distances defined on the index sequences were used in nearest neighbor and support vector machine classifiers. Additionally, Bayesian neural network classifiers were applied to the index sequences. The experiments identified improvements over the nearest neighbor and support vector machine classifications relying on standard alignment similarity scores, as well as strong correlations between specific subsequences and regioselectivities. PMID:21747849

  1. Scorebox extraction from mobile sports videos using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Kim, Wonjun; Park, Jimin; Kim, Changick

    2008-08-01

    Scorebox plays an important role in understanding contents of sports videos. However, the tiny scorebox may give the small-display-viewers uncomfortable experience in grasping the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed and top three ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games and experimental results show the efficiency and robustness of our proposed method.

  2. Learning atoms for materials discovery.

    PubMed

    Zhou, Quan; Tang, Peizhe; Liu, Shenxiu; Pan, Jinbo; Yan, Qimin; Zhang, Shou-Cheng

    2018-06-26

    Exciting advances have been made in artificial intelligence (AI) during recent decades. Among them, applications of machine learning (ML) and deep learning techniques brought human-competitive performances in various tasks of fields, including image recognition, speech recognition, and natural language understanding. Even in Go, the ancient game of profound complexity, the AI player has already beat human world champions convincingly with and without learning from the human. In this work, we show that our unsupervised machines (Atom2Vec) can learn the basic properties of atoms by themselves from the extensive database of known compounds and materials. These learned properties are represented in terms of high-dimensional vectors, and clustering of atoms in vector space classifies them into meaningful groups consistent with human knowledge. We use the atom vectors as basic input units for neural networks and other ML models designed and trained to predict materials properties, which demonstrate significant accuracy. Copyright © 2018 the Author(s). Published by PNAS.

  3. Product demand forecasts using wavelet kernel support vector machine and particle swarm optimization in manufacture system

    NASA Astrophysics Data System (ADS)

    Wu, Qi

    2010-03-01

    Demand forecasts play a crucial role in supply chain management. The future demand for a certain product is the basis for the respective replenishment systems. Aiming at demand series with small samples, seasonal character, nonlinearity, randomicity and fuzziness, the existing support vector kernel does not approach the random curve of the sales time series in the space (quadratic continuous integral space). In this paper, we present a hybrid intelligent system combining the wavelet kernel support vector machine and particle swarm optimization for demand forecasting. The results of application in car sale series forecasting show that the forecasting approach based on the hybrid PSOWv-SVM model is effective and feasible, the comparison between the method proposed in this paper and other ones is also given, which proves that this method is, for the discussed example, better than hybrid PSOv-SVM and other traditional methods.

  4. Discontinuity Detection in the Shield Metal Arc Welding Process

    PubMed Central

    Cocota, José Alberto Naves; Garcia, Gabriel Carvalho; da Costa, Adilson Rodrigues; de Lima, Milton Sérgio Fernandes; Rocha, Filipe Augusto Santos; Freitas, Gustavo Medeiros

    2017-01-01

    This work proposes a new methodology for the detection of discontinuities in the weld bead applied in Shielded Metal Arc Welding (SMAW) processes. The detection system is based on two sensors—a microphone and piezoelectric—that acquire acoustic emissions generated during the welding. The feature vectors extracted from the sensor dataset are used to construct classifier models. The approaches based on Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers are able to identify with a high accuracy the three proposed weld bead classes: desirable weld bead, shrinkage cavity and burn through discontinuities. Experimental results illustrate the system’s high accuracy, greater than 90% for each class. A novel Hierarchical Support Vector Machine (HSVM) structure is proposed to make feasible the use of this system in industrial environments. This approach presented 96.6% overall accuracy. Given the simplicity of the equipment involved, this system can be applied in the metal transformation industries. PMID:28489045

  5. Discontinuity Detection in the Shield Metal Arc Welding Process.

    PubMed

    Cocota, José Alberto Naves; Garcia, Gabriel Carvalho; da Costa, Adilson Rodrigues; de Lima, Milton Sérgio Fernandes; Rocha, Filipe Augusto Santos; Freitas, Gustavo Medeiros

    2017-05-10

    This work proposes a new methodology for the detection of discontinuities in the weld bead applied in Shielded Metal Arc Welding (SMAW) processes. The detection system is based on two sensors-a microphone and piezoelectric-that acquire acoustic emissions generated during the welding. The feature vectors extracted from the sensor dataset are used to construct classifier models. The approaches based on Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers are able to identify with a high accuracy the three proposed weld bead classes: desirable weld bead, shrinkage cavity and burn through discontinuities. Experimental results illustrate the system's high accuracy, greater than 90% for each class. A novel Hierarchical Support Vector Machine (HSVM) structure is proposed to make feasible the use of this system in industrial environments. This approach presented 96.6% overall accuracy. Given the simplicity of the equipment involved, this system can be applied in the metal transformation industries.

  6. Molecular Symmetry in Ab Initio Calculations

    NASA Astrophysics Data System (ADS)

    Madhavan, P. V.; Written, J. L.

    1987-05-01

    A scheme is presented for the construction of the Fock matrix in LCAO-SCF calculations and for the transformation of basis integrals to LCAO-MO integrals that can utilize several symmetry unique lists of integrals corresponding to different symmetry groups. The algorithm is fully compatible with vector processing machines and is especially suited for parallel processing machines.

  7. Evaluation of the discrete vortex wake cross flow model using vector computers. Part 2: User's manual for DIVORCE

    NASA Technical Reports Server (NTRS)

    Deffenbaugh, F. D.; Vitz, J. F.

    1979-01-01

    The users manual for the Discrete Vortex Cross flow Evaluator (DIVORCE) computer program is presented. DIVORCE was developed in FORTRAN 4 for the DCD 6600 and CDC 7600 machines. Optimal calls to a NASA vector subroutine package are provided for use with the CDC 7600.

  8. Output-only modal parameter estimator of linear time-varying structural systems based on vector TAR model and least squares support vector machine

    NASA Astrophysics Data System (ADS)

    Zhou, Si-Da; Ma, Yuan-Chen; Liu, Li; Kang, Jie; Ma, Zhi-Sai; Yu, Lei

    2018-01-01

    Identification of time-varying modal parameters contributes to the structural health monitoring, fault detection, vibration control, etc. of the operational time-varying structural systems. However, it is a challenging task because there is not more information for the identification of the time-varying systems than that of the time-invariant systems. This paper presents a vector time-dependent autoregressive model and least squares support vector machine based modal parameter estimator for linear time-varying structural systems in case of output-only measurements. To reduce the computational cost, a Wendland's compactly supported radial basis function is used to achieve the sparsity of the Gram matrix. A Gamma-test-based non-parametric approach of selecting the regularization factor is adapted for the proposed estimator to replace the time-consuming n-fold cross validation. A series of numerical examples have illustrated the advantages of the proposed modal parameter estimator on the suppression of the overestimate and the short data. A laboratory experiment has further validated the proposed estimator.

  9. Support vector machine incremental learning triggered by wrongly predicted samples

    NASA Astrophysics Data System (ADS)

    Tang, Ting-long; Guan, Qiu; Wu, Yi-rong

    2018-05-01

    According to the classic Karush-Kuhn-Tucker (KKT) theorem, at every step of incremental support vector machine (SVM) learning, the newly adding sample which violates the KKT conditions will be a new support vector (SV) and migrate the old samples between SV set and non-support vector (NSV) set, and at the same time the learning model should be updated based on the SVs. However, it is not exactly clear at this moment that which of the old samples would change between SVs and NSVs. Additionally, the learning model will be unnecessarily updated, which will not greatly increase its accuracy but decrease the training speed. Therefore, how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning. In this work, a new algorithm is proposed to select candidate SVs and use the wrongly predicted sample to trigger the incremental processing simultaneously. Experimental results show that the proposed algorithm can achieve good performance with high efficiency, high speed and good accuracy.

  10. Prediction of hourly PM2.5 using a space-time support vector regression model

    NASA Astrophysics Data System (ADS)

    Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

    2018-05-01

    Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.

  11. Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression

    NASA Astrophysics Data System (ADS)

    Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.

    2010-01-01

    In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function test is carried out using the spirometer and support vector regression analysis. Pulmonary function data are measured with flow volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel function with four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases and the average prediction accuracy for normal subjects was higher than that of abnormal subjects. Accuracy in prediction was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, it appears that this method of assessment is useful in diagnosing the pulmonary abnormalities with incomplete data and data with poor recording.

  12. Study on vibration characteristics and fault diagnosis method of oil-immersed flat wave reactor in Arctic area converter station

    NASA Astrophysics Data System (ADS)

    Lai, Wenqing; Wang, Yuandong; Li, Wenpeng; Sun, Guang; Qu, Guomin; Cui, Shigang; Li, Mengke; Wang, Yongqiang

    2017-10-01

    Based on long term vibration monitoring of the No.2 oil-immersed fat wave reactor in the ±500kV converter station in East Mongolia, the vibration signals in normal state and in core loose fault state were saved. Through the time-frequency analysis of the signals, the vibration characteristics of the core loose fault were obtained, and a fault diagnosis method based on the dual tree complex wavelet (DT-CWT) and support vector machine (SVM) was proposed. The vibration signals were analyzed by DT-CWT, and the energy entropy of the vibration signals were taken as the feature vector; the support vector machine was used to train and test the feature vector, and the accurate identification of the core loose fault of the flat wave reactor was realized. Through the identification of many groups of normal and core loose fault state vibration signals, the diagnostic accuracy of the result reached 97.36%. The effectiveness and accuracy of the method in the fault diagnosis of the flat wave reactor core is verified.

  13. Modeling Dengue vector population using remotely sensed data and machine learning.

    PubMed

    Scavuzzo, Juan M; Trucco, Francisco; Espinosa, Manuel; Tauro, Carolina B; Abril, Marcelo; Scavuzzo, Carlos M; Frery, Alejandro C

    2018-05-16

    Mosquitoes are vectors of many human diseases. In particular, Aedes ægypti (Linnaeus) is the main vector for Chikungunya, Dengue, and Zika viruses in Latin America and it represents a global threat. Public health policies that aim at combating this vector require dependable and timely information, which is usually expensive to obtain with field campaigns. For this reason, several efforts have been done to use remote sensing due to its reduced cost. The present work includes the temporal modeling of the oviposition activity (measured weekly on 50 ovitraps in a north Argentinean city) of Aedes ægypti (Linnaeus), based on time series of data extracted from operational earth observation satellite images. We use are NDVI, NDWI, LST night, LST day and TRMM-GPM rain from 2012 to 2016 as predictive variables. In contrast to previous works which use linear models, we employ Machine Learning techniques using completely accessible open source toolkits. These models have the advantages of being non-parametric and capable of describing nonlinear relationships between variables. Specifically, in addition to two linear approaches, we assess a support vector machine, an artificial neural networks, a K-nearest neighbors and a decision tree regressor. Considerations are made on parameter tuning and the validation and training approach. The results are compared to linear models used in previous works with similar data sets for generating temporal predictive models. These new tools perform better than linear approaches, in particular nearest neighbor regression (KNNR) performs the best. These results provide better alternatives to be implemented operatively on the Argentine geospatial risk system that is running since 2012. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. [Identification of green tea brand based on hyperspectra imaging technology].

    PubMed

    Zhang, Hai-Liang; Liu, Xiao-Li; Zhu, Feng-Le; He, Yong

    2014-05-01

    Hyperspectral imaging technology was developed to identify different brand famous green tea based on PCA information and image information fusion. First 512 spectral images of six brands of famous green tea in the 380 approximately 1 023 nm wavelength range were collected and principal component analysis (PCA) was performed with the goal of selecting two characteristic bands (545 and 611 nm) that could potentially be used for classification system. Then, 12 gray level co-occurrence matrix (GLCM) features (i. e., mean, covariance, homogeneity, energy, contrast, correlation, entropy, inverse gap, contrast, difference from the second-order and autocorrelation) based on the statistical moment were extracted from each characteristic band image. Finally, integration of the 12 texture features and three PCA spectral characteristics for each green tea sample were extracted as the input of LS-SVM. Experimental results showed that discriminating rate was 100% in the prediction set. The receiver operating characteristic curve (ROC) assessment methods were used to evaluate the LS-SVM classification algorithm. Overall results sufficiently demonstrate that hyperspectral imaging technology can be used to perform classification of green tea.

  15. Development of a sugar-binding residue prediction system from protein sequences using support vector machine.

    PubMed

    Banno, Masaki; Komiyama, Yusuke; Cao, Wei; Oku, Yuya; Ueki, Kokoro; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro

    2017-02-01

    Several methods have been proposed for protein-sugar binding site prediction using machine learning algorithms. However, they are not effective to learn various properties of binding site residues caused by various interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. By using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acid sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor which combines the results of the two predictors. We showed that when a sugar is known to be an acidic sugar, the acidic sugar binding predictor achieves the best performance, and showed that when a sugar is known to be a nonacidic sugar or is not known to be either of the two classes, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. Support vector machine was used as a machine learning algorithm and the position-specific scoring matrix created by the position-specific iterative basic local alignment search tool was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have launched our system, as an open source freeware tool on the GitHub repository (https://doi.org/10.5281/zenodo.61513). Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  16. Channelized relevance vector machine as a numerical observer for cardiac perfusion defect detection task

    NASA Astrophysics Data System (ADS)

    Kalayeh, Mahdi M.; Marin, Thibault; Pretorius, P. Hendrik; Wernick, Miles N.; Yang, Yongyi; Brankov, Jovan G.

    2011-03-01

    In this paper, we present a numerical observer for image quality assessment, aiming to predict human observer accuracy in a cardiac perfusion defect detection task for single-photon emission computed tomography (SPECT). In medical imaging, image quality should be assessed by evaluating the human observer accuracy for a specific diagnostic task. This approach is known as task-based assessment. Such evaluations are important for optimizing and testing imaging devices and algorithms. Unfortunately, human observer studies with expert readers are costly and time-demanding. To address this problem, numerical observers have been developed as a surrogate for human readers to predict human diagnostic performance. The channelized Hotelling observer (CHO) with internal noise model has been found to predict human performance well in some situations, but does not always generalize well to unseen data. We have argued in the past that finding a model to predict human observers could be viewed as a machine learning problem. Following this approach, in this paper we propose a channelized relevance vector machine (CRVM) to predict human diagnostic scores in a detection task. We have previously used channelized support vector machines (CSVM) to predict human scores and have shown that this approach offers better and more robust predictions than the classical CHO method. The comparison of the proposed CRVM with our previously introduced CSVM method suggests that CRVM can achieve similar generalization accuracy, while dramatically reducing model complexity and computation time.

  17. Improvements on ν-Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Saigal, Pooja; Chandra, Suresh

    2016-07-01

    In this paper, we propose two novel binary classifiers termed as "Improvements on ν-Twin Support Vector Machine: Iν-TWSVM and Iν-TWSVM (Fast)" that are motivated by ν-Twin Support Vector Machine (ν-TWSVM). Similar to ν-TWSVM, Iν-TWSVM determines two nonparallel hyperplanes such that they are closer to their respective classes and are at least ρ distance away from the other class. The significant advantage of Iν-TWSVM over ν-TWSVM is that Iν-TWSVM solves one smaller-sized Quadratic Programming Problem (QPP) and one Unconstrained Minimization Problem (UMP); as compared to solving two related QPPs in ν-TWSVM. Further, Iν-TWSVM (Fast) avoids solving a smaller sized QPP and transforms it as a unimodal function, which can be solved using line search methods and similar to Iν-TWSVM, the other problem is solved as a UMP. Due to their novel formulation, the proposed classifiers are faster than ν-TWSVM and have comparable generalization ability. Iν-TWSVM also implements structural risk minimization (SRM) principle by introducing a regularization term, along with minimizing the empirical risk. The other properties of Iν-TWSVM, related to support vectors (SVs), are similar to that of ν-TWSVM. To test the efficacy of the proposed method, experiments have been conducted on a wide range of UCI and a skewed variation of NDC datasets. We have also given the application of Iν-TWSVM as a binary classifier for pixel classification of color images. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. CAT-PUMA: CME Arrival Time Prediction Using Machine learning Algorithms

    NASA Astrophysics Data System (ADS)

    Liu, Jiajia; Ye, Yudong; Shen, Chenglong; Wang, Yuming; Erdélyi, Robert

    2018-04-01

    CAT-PUMA (CME Arrival Time Prediction Using Machine learning Algorithms) quickly and accurately predicts the arrival of Coronal Mass Ejections (CMEs) of CME arrival time. The software was trained via detailed analysis of CME features and solar wind parameters using 182 previously observed geo-effective partial-/full-halo CMEs and uses algorithms of the Support Vector Machine (SVM) to make its predictions, which can be made within minutes of providing the necessary input parameters of a CME.

  19. Speckle-learning-based object recognition through scattering media.

    PubMed

    Ando, Takamasa; Horisaki, Ryoichi; Tanida, Jun

    2015-12-28

    We experimentally demonstrated object recognition through scattering media based on direct machine learning of a number of speckle intensity images. In the experiments, speckle intensity images of amplitude or phase objects on a spatial light modulator between scattering plates were captured by a camera. We used the support vector machine for binary classification of the captured speckle intensity images of face and non-face data. The experimental results showed that speckles are sufficient for machine learning.

  20. A Language-Independent Approach to Automatic Text Difficulty Assessment for Second-Language Learners

    DTIC Science & Technology

    2013-08-01

    best-suited for regression. Our baseline uses z-normalized shallow length features and TF -LOG weighted vectors on bag-of-words for Arabic, Dari...length features and TF -LOG weighted vectors on bag-of-words for Arabic, Dari, English and Pashto. We compare Support Vector Machines and the Margin...football, whereas they are much less common in documents about opera). We used TF -LOG weighted word frequencies on bag-of-words for each document

  1. Relevance Vector Machine and Support Vector Machine Classifier Analysis of Scanning Laser Polarimetry Retinal Nerve Fiber Layer Measurements

    PubMed Central

    Bowd, Christopher; Medeiros, Felipe A.; Zhang, Zuohua; Zangwill, Linda M.; Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.; Weinreb, Robert N.; Goldbaum, Michael H.

    2010-01-01

    Purpose To classify healthy and glaucomatous eyes using relevance vector machine (RVM) and support vector machine (SVM) learning classifiers trained on retinal nerve fiber layer (RNFL) thickness measurements obtained by scanning laser polarimetry (SLP). Methods Seventy-two eyes of 72 healthy control subjects (average age = 64.3 ± 8.8 years, visual field mean deviation =−0.71 ± 1.2 dB) and 92 eyes of 92 patients with glaucoma (average age = 66.9 ± 8.9 years, visual field mean deviation =−5.32 ± 4.0 dB) were imaged with SLP with variable corneal compensation (GDx VCC; Laser Diagnostic Technologies, San Diego, CA). RVM and SVM learning classifiers were trained and tested on SLP-determined RNFL thickness measurements from 14 standard parameters and 64 sectors (approximately 5.6° each) obtained in the circumpapillary area under the instrument-defined measurement ellipse (total 78 parameters). Tenfold cross-validation was used to train and test RVM and SVM classifiers on unique subsets of the full 164-eye data set and areas under the receiver operating characteristic (AUROC) curve for the classification of eyes in the test set were generated. AUROC curve results from RVM and SVM were compared to those for 14 SLP software-generated global and regional RNFL thickness parameters. Also reported was the AUROC curve for the GDx VCC software-generated nerve fiber indicator (NFI). Results The AUROC curves for RVM and SVM were 0.90 and 0.91, respectively, and increased to 0.93 and 0.94 when the training sets were optimized with sequential forward and backward selection (resulting in reduced dimensional data sets). AUROC curves for optimized RVM and SVM were significantly larger than those for all individual SLP parameters. The AUROC curve for the NFI was 0.87. Conclusions Results from RVM and SVM trained on SLP RNFL thickness measurements are similar and provide accurate classification of glaucomatous and healthy eyes. RVM may be preferable to SVM, because it provides a Bayesian-derived probability of glaucoma as an output. These results suggest that these machine learning classifiers show good potential for glaucoma diagnosis. PMID:15790898

  2. Improving protein–protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model

    PubMed Central

    An, Ji‐Yong; Meng, Fan‐Rong; Chen, Xing; Yan, Gui‐Ying; Hu, Ji‐Pu

    2016-01-01

    Abstract Predicting protein–protein interactions (PPIs) is a challenging task and essential to construct the protein interaction networks, which is important for facilitating our understanding of the mechanisms of biological systems. Although a number of high‐throughput technologies have been proposed to predict PPIs, there are unavoidable shortcomings, including high cost, time intensity, and inherently high false positive rates. For these reasons, many computational methods have been proposed for predicting PPIs. However, the problem is still far from being solved. In this article, we propose a novel computational method called RVM‐BiGP that combines the relevance vector machine (RVM) model and Bi‐gram Probabilities (BiGP) for PPIs detection from protein sequences. The major improvement includes (1) Protein sequences are represented using the Bi‐gram probabilities (BiGP) feature representation on a Position Specific Scoring Matrix (PSSM), in which the protein evolutionary information is contained; (2) For reducing the influence of noise, the Principal Component Analysis (PCA) method is used to reduce the dimension of BiGP vector; (3) The powerful and robust Relevance Vector Machine (RVM) algorithm is used for classification. Five‐fold cross‐validation experiments executed on yeast and Helicobacter pylori datasets, which achieved very high accuracies of 94.57 and 90.57%, respectively. Experimental results are significantly better than previous methods. To further evaluate the proposed method, we compare it with the state‐of‐the‐art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM‐BiGP method is significantly better than the SVM‐based method. In addition, we achieved 97.15% accuracy on imbalance yeast dataset, which is higher than that of balance yeast dataset. The promising experimental results show the efficiency and robust of the proposed method, which can be an automatic decision support tool for future proteomics research. For facilitating extensive studies for future proteomics research, we developed a freely available web server called RVM‐BiGP‐PPIs in Hypertext Preprocessor (PHP) for predicting PPIs. The web server including source code and the datasets are available at http://219.219.62.123:8888/BiGP/. PMID:27452983

  3. Speech sound classification and detection of articulation disorders with support vector machines and wavelets.

    PubMed

    Georgoulas, George; Georgopoulos, Voula C; Stylios, Chrysostomos D

    2006-01-01

    This paper proposes a novel integrated methodology to extract features and classify speech sounds with intent to detect the possible existence of a speech articulation disorder in a speaker. Articulation, in effect, is the specific and characteristic way that an individual produces the speech sounds. A methodology to process the speech signal, extract features and finally classify the signal and detect articulation problems in a speaker is presented. The use of support vector machines (SVMs), for the classification of speech sounds and detection of articulation disorders is introduced. The proposed method is implemented on a data set where different sets of features and different schemes of SVMs are tested leading to satisfactory performance.

  4. HYBRID NEURAL NETWORK AND SUPPORT VECTOR MACHINE METHOD FOR OPTIMIZATION

    NASA Technical Reports Server (NTRS)

    Rai, Man Mohan (Inventor)

    2005-01-01

    System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.

  5. Hybrid Neural Network and Support Vector Machine Method for Optimization

    NASA Technical Reports Server (NTRS)

    Rai, Man Mohan (Inventor)

    2007-01-01

    System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.

  6. Prediction of mutagenic toxicity by combination of Recursive Partitioning and Support Vector Machines.

    PubMed

    Liao, Quan; Yao, Jianhua; Yuan, Shengang

    2007-05-01

    The study of prediction of toxicity is very important and necessary because measurement of toxicity is typically time-consuming and expensive. In this paper, Recursive Partitioning (RP) method was used to select descriptors. RP and Support Vector Machines (SVM) were used to construct structure-toxicity relationship models, RP model and SVM model, respectively. The performances of the two models are different. The prediction accuracies of the RP model are 80.2% for mutagenic compounds in MDL's toxicity database, 83.4% for compounds in CMC and 84.9% for agrochemicals in in-house database respectively. Those of SVM model are 81.4%, 87.0% and 87.3% respectively.

  7. StruLocPred: structure-based protein subcellular localisation prediction using multi-class support vector machine.

    PubMed

    Zhou, Wengang; Dickerson, Julie A

    2012-01-01

    Knowledge of protein subcellular locations can help decipher a protein's biological function. This work proposes new features: sequence-based: Hybrid Amino Acid Pair (HAAP) and two structure-based: Secondary Structural Element Composition (SSEC) and solvent accessibility state frequency. A multi-class Support Vector Machine is developed to predict the locations. Testing on two established data sets yields better prediction accuracies than the best available systems. Comparisons with existing methods show comparable results to ESLPred2. When StruLocPred is applied to the entire Arabidopsis proteome, over 77% of proteins with known locations match the prediction results. An implementation of this system is at http://wgzhou.ece. iastate.edu/StruLocPred/.

  8. Identification of cigarette smoke inhalations from wearable sensor data using a Support Vector Machine classifier.

    PubMed

    Lopez-Meyer, Paulo; Tiffany, Stephen; Sazonov, Edward

    2012-01-01

    This study presents a subject-independent model for detection of smoke inhalations from wearable sensors capturing characteristic hand-to-mouth gestures and changes in breathing patterns during cigarette smoking. Wearable sensors were used to detect the proximity of the hand to the mouth and to acquire the respiratory patterns. The waveforms of sensor signals were used as features to build a Support Vector Machine classification model. Across a data set of 20 enrolled participants, precision of correct identification of smoke inhalations was found to be >87%, and a resulting recall >80%. These results suggest that it is possible to analyze smoking behavior by means of a wearable and non-invasive sensor system.

  9. An Auto-flag Method of Radio Visibility Data Based on Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Dai, Hui-mei; Mei, Ying; Wang, Wei; Deng, Hui; Wang, Feng

    2017-01-01

    The Mingantu Ultrawide Spectral Radioheliograph (MUSER) has entered a test observation stage. After the construction of the data acquisition and storage system, it is urgent to automatically flag and eliminate the abnormal visibility data so as to improve the imaging quality. In this paper, according to the observational records, we create a credible visibility set, and further obtain the corresponding flag model of visibility data by using the support vector machine (SVM) technique. The results show that the SVM is a robust approach to flag the MUSER visibility data, and can attain an accuracy of about 86%. Meanwhile, this method will not be affected by solar activities, such as flare eruptions.

  10. Detection of Dendritic Spines Using Wavelet Packet Entropy and Fuzzy Support Vector Machine.

    PubMed

    Wang, Shuihua; Li, Yang; Shao, Ying; Cattani, Carlo; Zhang, Yudong; Du, Sidan

    2017-01-01

    The morphology of dendritic spines is highly correlated with the neuron function. Therefore, it is of positive influence for the research of the dendritic spines. However, it is tried to manually label the spine types for statistical analysis. In this work, we proposed an approach based on the combination of wavelet contour analysis for the backbone detection, wavelet packet entropy, and fuzzy support vector machine for the spine classification. The experiments show that this approach is promising. The average detection accuracy of "MushRoom" achieves 97.3%, "Stubby" achieves 94.6%, and "Thin" achieves 97.2%. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  11. Prediction on sunspot activity based on fuzzy information granulation and support vector machine

    NASA Astrophysics Data System (ADS)

    Peng, Lingling; Yan, Haisheng; Yang, Zhigang

    2018-04-01

    In order to analyze the range of sunspots, a combined prediction method of forecasting the fluctuation range of sunspots based on fuzzy information granulation (FIG) and support vector machine (SVM) was put forward. Firstly, employing the FIG to granulate sample data and extract va)alid information of each window, namely the minimum value, the general average value and the maximum value of each window. Secondly, forecasting model is built respectively with SVM and then cross method is used to optimize these parameters. Finally, the fluctuation range of sunspots is forecasted with the optimized SVM model. Case study demonstrates that the model have high accuracy and can effectively predict the fluctuation of sunspots.

  12. Source localization in an ocean waveguide using supervised machine learning.

    PubMed

    Niu, Haiqiang; Reeves, Emma; Gerstoft, Peter

    2017-09-01

    Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix and used as the input for three machine learning methods: feed-forward neural networks (FNN), support vector machines (SVM), and random forests (RF). The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF, and conventional matched-field processing and demonstrate the potential of machine learning for underwater source localization.

  13. Experimental Realization of a Quantum Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Li, Zhaokai; Liu, Xiaomei; Xu, Nanyang; Du, Jiangfeng

    2015-04-01

    The fundamental principle of artificial intelligence is the ability of machines to learn from previous experience and do future work accordingly. In the age of big data, classical learning machines often require huge computational resources in many practical cases. Quantum machine learning algorithms, on the other hand, could be exponentially faster than their classical counterparts by utilizing quantum parallelism. Here, we demonstrate a quantum machine learning algorithm to implement handwriting recognition on a four-qubit NMR test bench. The quantum machine learns standard character fonts and then recognizes handwritten characters from a set with two candidates. Because of the wide spread importance of artificial intelligence and its tremendous consumption of computational resources, quantum speedup would be extremely attractive against the challenges of big data.

  14. Resident Space Object Characterization and Behavior Understanding via Machine Learning and Ontology-based Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Furfaro, R.; Linares, R.; Gaylor, D.; Jah, M.; Walls, R.

    2016-09-01

    In this paper, we present an end-to-end approach that employs machine learning techniques and Ontology-based Bayesian Networks (BN) to characterize the behavior of resident space objects. State-of-the-Art machine learning architectures (e.g. Extreme Learning Machines, Convolutional Deep Networks) are trained on physical models to learn the Resident Space Object (RSO) features in the vectorized energy and momentum states and parameters. The mapping from measurements to vectorized energy and momentum states and parameters enables behavior characterization via clustering in the features space and subsequent RSO classification. Additionally, Space Object Behavioral Ontologies (SOBO) are employed to define and capture the domain knowledge-base (KB) and BNs are constructed from the SOBO in a semi-automatic fashion to execute probabilistic reasoning over conclusions drawn from trained classifiers and/or directly from processed data. Such an approach enables integrating machine learning classifiers and probabilistic reasoning to support higher-level decision making for space domain awareness applications. The innovation here is to use these methods (which have enjoyed great success in other domains) in synergy so that it enables a "from data to discovery" paradigm by facilitating the linkage and fusion of large and disparate sources of information via a Big Data Science and Analytics framework.

  15. Merged or monolithic? Using machine-learning to reconstruct the dynamical history of simulated star clusters

    NASA Astrophysics Data System (ADS)

    Pasquato, Mario; Chung, Chul

    2016-05-01

    Context. Machine-learning (ML) solves problems by learning patterns from data with limited or no human guidance. In astronomy, ML is mainly applied to large observational datasets, e.g. for morphological galaxy classification. Aims: We apply ML to gravitational N-body simulations of star clusters that are either formed by merging two progenitors or evolved in isolation, planning to later identify globular clusters (GCs) that may have a history of merging from observational data. Methods: We create mock-observations from simulated GCs, from which we measure a set of parameters (also called features in the machine-learning field). After carrying out dimensionality reduction on the feature space, the resulting datapoints are fed in to various classification algorithms. Using repeated random subsampling validation, we check whether the groups identified by the algorithms correspond to the underlying physical distinction between mergers and monolithically evolved simulations. Results: The three algorithms we considered (C5.0 trees, k-nearest neighbour, and support-vector machines) all achieve a test misclassification rate of about 10% without parameter tuning, with support-vector machines slightly outperforming the others. The first principal component of feature space correlates with cluster concentration. If we exclude it from the regression, the performance of the algorithms is only slightly reduced.

  16. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interests in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each regions of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interests for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.

  17. A comparative analysis of support vector machines and extreme learning machines.

    PubMed

    Liu, Xueyi; Gao, Chuanhou; Li, Ping

    2012-09-01

    The theory of extreme learning machines (ELMs) has recently become increasingly popular. As a new learning algorithm for single-hidden-layer feed-forward neural networks, an ELM offers the advantages of low computational cost, good generalization ability, and ease of implementation. Hence the comparison and model selection between ELMs and other kinds of state-of-the-art machine learning approaches has become significant and has attracted many research efforts. This paper performs a comparative analysis of the basic ELMs and support vector machines (SVMs) from two viewpoints that are different from previous works: one is the Vapnik-Chervonenkis (VC) dimension, and the other is their performance under different training sample sizes. It is shown that the VC dimension of an ELM is equal to the number of hidden nodes of the ELM with probability one. Additionally, their generalization ability and computational complexity are exhibited with changing training sample size. ELMs have weaker generalization ability than SVMs for small sample but can generalize as well as SVMs for large sample. Remarkably, great superiority in computational speed especially for large-scale sample problems is found in ELMs. The results obtained can provide insight into the essential relationship between them, and can also serve as complementary knowledge for their past experimental and theoretical comparisons. Copyright © 2012 Elsevier Ltd. All rights reserved.

  18. Spiking Neural P Systems With Rules on Synapses Working in Maximum Spiking Strategy.

    PubMed

    Tao Song; Linqiang Pan

    2015-06-01

    Spiking neural P systems (called SN P systems for short) are a class of parallel and distributed neural-like computation models inspired by the way the neurons process information and communicate with each other by means of impulses or spikes. In this work, we introduce a new variant of SN P systems, called SN P systems with rules on synapses working in maximum spiking strategy, and investigate the computation power of the systems as both number and vector generators. Specifically, we prove that i) if no limit is imposed on the number of spikes in any neuron during any computation, such systems can generate the sets of Turing computable natural numbers and the sets of vectors of positive integers computed by k-output register machine; ii) if an upper bound is imposed on the number of spikes in each neuron during any computation, such systems can characterize semi-linear sets of natural numbers as number generating devices; as vector generating devices, such systems can only characterize the family of sets of vectors computed by sequential monotonic counter machine, which is strictly included in family of semi-linear sets of vectors. This gives a positive answer to the problem formulated in Song et al., Theor. Comput. Sci., vol. 529, pp. 82-95, 2014.

  19. Synthetic Space Vector Modulation

    DTIC Science & Technology

    2013-06-01

    especially batteries without fancy controls. Inherently, DC machine commutation is environmentally sensitive and maintenance intensive at well as...reliable DC power supplies especially batteries without fancy controls. Inherently, DC machine commutation is environmentally sensitive and maintenance...Drives and Energy Systems, New Delhi, India , 20-23 December, 2010. [12] PIC18F2331/2431/4331/4431 datasheet DS39616B, Microchip Technology Inc

  20. Automated Creation of Labeled Pointcloud Datasets in Support of Machine-Learning Based Perception

    DTIC Science & Technology

    2017-12-01

    computationally intensive 3D vector math and took more than ten seconds to segment a single LIDAR frame from the HDL-32e with the Dell XPS15 9650’s Intel...Core i7 CPU. Depth Clustering avoids the computationally intensive 3D vector math of Euclidean Clustering-based DON segmentation and, instead

  1. Subpixel urban land cover estimation: comparing cubist, random forests, and support vector regression

    Treesearch

    Jeffrey T. Walton

    2008-01-01

    Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...

  2. Divide and Recombine for Large Complex Data

    DTIC Science & Technology

    2017-12-01

    Empirical Methods in Natural Language Processing , October 2014 Keywords Enter keywords for the publication. URL Enter the URL...low-latency data processing systems. Declarative Languages for Interactive Visualization: The Reactive Vega Stack Another thread of XDATA research...for array processing operations embedded in the R programming language . Vector virtual machines work well for long vectors. One of the most

  3. Parallel-vector out-of-core equation solver for computational mechanics

    NASA Technical Reports Server (NTRS)

    Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

    1993-01-01

    A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/ output (I/O) time is reduced by using the a synchronous BUFFER IN and BUFFER OUT, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance the performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.

  4. Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

    PubMed Central

    Yuan, Hua; Huang, Jianping; Cao, Chenzhong

    2009-01-01

    Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers. PMID:19742136

  5. Pharmaceutical Raw Material Identification Using Miniature Near-Infrared (MicroNIR) Spectroscopy and Supervised Pattern Recognition Using Support Vector Machine

    PubMed Central

    Hsiung, Chang; Pederson, Christopher G.; Zou, Peng; Smith, Valton; von Gunten, Marc; O’Brien, Nada A.

    2016-01-01

    Near-infrared spectroscopy as a rapid and non-destructive analytical technique offers great advantages for pharmaceutical raw material identification (RMID) to fulfill the quality and safety requirements in pharmaceutical industry. In this study, we demonstrated the use of portable miniature near-infrared (MicroNIR) spectrometers for NIR-based pharmaceutical RMID and solved two challenges in this area, model transferability and large-scale classification, with the aid of support vector machine (SVM) modeling. We used a set of 19 pharmaceutical compounds including various active pharmaceutical ingredients (APIs) and excipients and six MicroNIR spectrometers to test model transferability. For the test of large-scale classification, we used another set of 253 pharmaceutical compounds comprised of both chemically and physically different APIs and excipients. We compared SVM with conventional chemometric modeling techniques, including soft independent modeling of class analogy, partial least squares discriminant analysis, linear discriminant analysis, and quadratic discriminant analysis. Support vector machine modeling using a linear kernel, especially when combined with a hierarchical scheme, exhibited excellent performance in both model transferability and large-scale classification. Hence, ultra-compact, portable and robust MicroNIR spectrometers coupled with SVM modeling can make on-site and in situ pharmaceutical RMID for large-volume applications highly achievable. PMID:27029624

  6. Face recognition using total margin-based adaptive fuzzy support vector machines.

    PubMed

    Liu, Yi-Hung; Chen, Yen-Ting

    2007-01-01

    This paper presents a new classifier called total margin-based adaptive fuzzy support vector machines (TAF-SVM) that deals with several problems that may occur in support vector machines (SVMs) when applied to the face recognition. The proposed TAF-SVM not only solves the overfitting problem resulted from the outlier with the approach of fuzzification of the penalty, but also corrects the skew of the optimal separating hyperplane due to the very imbalanced data sets by using different cost algorithm. In addition, by introducing the total margin algorithm to replace the conventional soft margin algorithm, a lower generalization error bound can be obtained. Those three functions are embodied into the traditional SVM so that the TAF-SVM is proposed and reformulated in both linear and nonlinear cases. By using two databases, the Chung Yuan Christian University (CYCU) multiview and the facial recognition technology (FERET) face databases, and using the kernel Fisher's discriminant analysis (KFDA) algorithm to extract discriminating face features, experimental results show that the proposed TAF-SVM is superior to SVM in terms of the face-recognition accuracy. The results also indicate that the proposed TAF-SVM can achieve smaller error variances than SVM over a number of tests such that better recognition stability can be obtained.

  7. Detection of surface cracking in steel pipes based on vibration data using a multi-class support vector machine classifier

    NASA Astrophysics Data System (ADS)

    Mustapha, S.; Braytee, A.; Ye, L.

    2017-04-01

    In this study, we focused at the development and verification of a robust framework for surface crack detection in steel pipes using measured vibration responses; with the presence of multiple progressive damage occurring in different locations within the structure. Feature selection, dimensionality reduction, and multi-class support vector machine were established for this purpose. Nine damage cases, at different locations, orientations and length, were introduced into the pipe structure. The pipe was impacted 300 times using an impact hammer, after each damage case, the vibration data were collected using 3 PZT wafers which were installed on the outer surface of the pipe. At first, damage sensitive features were extracted using the frequency response function approach followed by recursive feature elimination for dimensionality reduction. Then, a multi-class support vector machine learning algorithm was employed to train the data and generate a statistical model. Once the model is established, decision values and distances from the hyper-plane were generated for the new collected data using the trained model. This process was repeated on the data collected from each sensor. Overall, using a single sensor for training and testing led to a very high accuracy reaching 98% in the assessment of the 9 damage cases used in this study.

  8. Detection of Hard Exudates in Colour Fundus Images Using Fuzzy Support Vector Machine-Based Expert System.

    PubMed

    Jaya, T; Dheeba, J; Singh, N Albert

    2015-12-01

    Diabetic retinopathy is a major cause of vision loss in diabetic patients. Currently, there is a need for making decisions using intelligent computer algorithms when screening a large volume of data. This paper presents an expert decision-making system designed using a fuzzy support vector machine (FSVM) classifier to detect hard exudates in fundus images. The optic discs in the colour fundus images are segmented to avoid false alarms using morphological operations and based on circular Hough transform. To discriminate between the exudates and the non-exudates pixels, colour and texture features are extracted from the images. These features are given as input to the FSVM classifier. The classifier analysed 200 retinal images collected from diabetic retinopathy screening programmes. The tests made on the retinal images show that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and features sets, the area under the receiver operating characteristic curve reached 0.9606, which corresponds to a sensitivity of 94.1% with a specificity of 90.0%. The results suggest that detecting hard exudates using FSVM contribute to computer-assisted detection of diabetic retinopathy and as a decision support system for ophthalmologists.

  9. Predicting hydrofacies and hydraulic conductivity from direct-push data using a data-driven relevance vector machine approach: Motivations, algorithms, and application

    NASA Astrophysics Data System (ADS)

    Paradis, Daniel; Lefebvre, René; Gloaguen, Erwan; Rivera, Alfonso

    2015-01-01

    The spatial heterogeneity of hydraulic conductivity (K) exerts a major control on groundwater flow and solute transport. The heterogeneous spatial distribution of K can be imaged using indirect geophysical data as long as reliable relations exist to link geophysical data to K. This paper presents a nonparametric learning machine approach to predict aquifer K from cone penetrometer tests (CPT) coupled with a soil moisture and resistivity probe (SMR) using relevance vector machines (RVMs). The learning machine approach is demonstrated with an application to a heterogeneous unconsolidated littoral aquifer in a 12 km2 subwatershed, where relations between K and multiparameters CPT/SMR soundings appear complex. Our approach involved fuzzy clustering to define hydrofacies (HF) on the basis of CPT/SMR and K data prior to the training of RVMs for HFs recognition and K prediction on the basis of CPT/SMR data alone. The learning machine was built from a colocated training data set representative of the study area that includes K data from slug tests and CPT/SMR data up-scaled at a common vertical resolution of 15 cm with K data. After training, the predictive capabilities of the learning machine were assessed through cross validation with data withheld from the training data set and with K data from flowmeter tests not used during the training process. Results show that HF and K predictions from the learning machine are consistent with hydraulic tests. The combined use of CPT/SMR data and RVM-based learning machine proved to be powerful and efficient for the characterization of high-resolution K heterogeneity for unconsolidated aquifers.

  10. Machine-learning in grading of gliomas based on multi-parametric magnetic resonance imaging at 3T.

    PubMed

    Citak-Er, Fusun; Firat, Zeynep; Kovanlikaya, Ilhami; Ture, Ugur; Ozturk-Isik, Esin

    2018-06-15

    The objective of this study was to assess the contribution of multi-parametric (mp) magnetic resonance imaging (MRI) quantitative features in the machine learning-based grading of gliomas with a multi-region-of-interests approach. Forty-three patients who were newly diagnosed as having a glioma were included in this study. The patients were scanned prior to any therapy using a standard brain tumor magnetic resonance (MR) imaging protocol that included T1 and T2-weighted, diffusion-weighted, diffusion tensor, MR perfusion and MR spectroscopic imaging. Three different regions-of-interest were drawn for each subject to encompass tumor, immediate tumor periphery, and distant peritumoral edema/normal. The normalized mp-MRI features were used to build machine-learning models for differentiating low-grade gliomas (WHO grades I and II) from high grades (WHO grades III and IV). In order to assess the contribution of regional mp-MRI quantitative features to the classification models, a support vector machine-based recursive feature elimination method was applied prior to classification. A machine-learning model based on support vector machine algorithm with linear kernel achieved an accuracy of 93.0%, a specificity of 86.7%, and a sensitivity of 96.4% for the grading of gliomas using ten-fold cross validation based on the proposed subset of the mp-MRI features. In this study, machine-learning based on multiregional and multi-parametric MRI data has proven to be an important tool in grading glial tumors accurately even in this limited patient population. Future studies are needed to investigate the use of machine learning algorithms for brain tumor classification in a larger patient cohort. Copyright © 2018. Published by Elsevier Ltd.

  11. A machine learning approach to galaxy-LSS classification - I. Imprints on halo merger trees

    NASA Astrophysics Data System (ADS)

    Hui, Jianan; Aragon, Miguel; Cui, Xinping; Flegal, James M.

    2018-04-01

    The cosmic web plays a major role in the formation and evolution of galaxies and defines, to a large extent, their properties. However, the relation between galaxies and environment is still not well understood. Here, we present a machine learning approach to study imprints of environmental effects on the mass assembly of haloes. We present a galaxy-LSS machine learning classifier based on galaxy properties sensitive to the environment. We then use the classifier to assess the relevance of each property. Correlations between galaxy properties and their cosmic environment can be used to predict galaxy membership to void/wall or filament/cluster with an accuracy of 93 per cent. Our study unveils environmental information encoded in properties of haloes not normally considered directly dependent on the cosmic environment such as merger history and complexity. Understanding the physical mechanism by which the cosmic web is imprinted in a halo can lead to significant improvements in galaxy formation models. This is accomplished by extracting features from galaxy properties and merger trees, computing feature scores for each feature and then applying support vector machine (SVM) to different feature sets. To this end, we have discovered that the shape and depth of the merger tree, formation time, and density of the galaxy are strongly associated with the cosmic environment. We describe a significant improvement in the original classification algorithm by performing LU decomposition of the distance matrix computed by the feature vectors and then using the output of the decomposition as input vectors for SVM.

  12. Real-data comparison of data mining methods in prediction of diabetes in iran.

    PubMed

    Tapak, Lily; Mahjub, Hossein; Hamidi, Omid; Poorolajal, Jalal

    2013-09-01

    Diabetes is one of the most common non-communicable diseases in developing countries. Early screening and diagnosis play an important role in effective prevention strategies. This study compared two traditional classification methods (logistic regression and Fisher linear discriminant analysis) and four machine-learning classifiers (neural networks, support vector machines, fuzzy c-mean, and random forests) to classify persons with and without diabetes. The data set used in this study included 6,500 subjects from the Iranian national non-communicable diseases risk factors surveillance obtained through a cross-sectional survey. The obtained sample was based on cluster sampling of the Iran population which was conducted in 2005-2009 to assess the prevalence of major non-communicable disease risk factors. Ten risk factors that are commonly associated with diabetes were selected to compare the performance of six classifiers in terms of sensitivity, specificity, total accuracy, and area under the receiver operating characteristic (ROC) curve criteria. Support vector machines showed the highest total accuracy (0.986) as well as area under the ROC (0.979). Also, this method showed high specificity (1.000) and sensitivity (0.820). All other methods produced total accuracy of more than 85%, but for all methods, the sensitivity values were very low (less than 0.350). The results of this study indicate that, in terms of sensitivity, specificity, and overall classification accuracy, the support vector machine model ranks first among all the classifiers tested in the prediction of diabetes. Therefore, this approach is a promising classifier for predicting diabetes, and it should be further investigated for the prediction of other diseases.

  13. Condition Assessment of Foundation Piles and Utility Poles Based on Guided Wave Propagation Using a Network of Tactile Transducers and Support Vector Machines

    PubMed Central

    Yu, Yang; Niederleithinger, Ernst; Li, Jianchun; Wiggenhauser, Herbert

    2017-01-01

    This paper presents a novel non-destructive testing and health monitoring system using a network of tactile transducers and accelerometers for the condition assessment and damage classification of foundation piles and utility poles. While in traditional pile integrity testing an impact hammer with broadband frequency excitation is typically used, the proposed testing system utilizes an innovative excitation system based on a network of tactile transducers to induce controlled narrow-band frequency stress waves. Thereby, the simultaneous excitation of multiple stress wave types and modes is avoided (or at least reduced), and targeted wave forms can be generated. The new testing system enables the testing and monitoring of foundation piles and utility poles where the top is inaccessible, making the new testing system suitable, for example, for the condition assessment of pile structures with obstructed heads and of poles with live wires. For system validation, the new system was experimentally tested on nine timber and concrete poles that were inflicted with several types of damage. The tactile transducers were excited with continuous sine wave signals of 1 kHz frequency. Support vector machines were employed together with advanced signal processing algorithms to distinguish recorded stress wave signals from pole structures with different types of damage. The results show that using fast Fourier transform signals, combined with principal component analysis as the input feature vector for support vector machine (SVM) classifiers with different kernel functions, can achieve damage classification with accuracies of 92.5% ± 7.5%. PMID:29258274

  14. Ecological footprint model using the support vector machine technique.

    PubMed

    Ma, Haibo; Chang, Wenjuan; Cui, Guangbai

    2012-01-01

    The per capita ecological footprint (EF) is one of the most widely recognized measures of environmental sustainability. It aims to quantify the Earth's biological resources required to support human activity. In this paper, we summarize relevant previous literature, and present five factors that influence per capita EF. These factors are: National gross domestic product (GDP), urbanization (independent of economic development), distribution of income (measured by the Gini coefficient), export dependence (measured by the percentage of exports to total GDP), and service intensity (measured by the percentage of service to total GDP). A new ecological footprint model based on a support vector machine (SVM), which is a machine-learning method based on the structural risk minimization principle from statistical learning theory was conducted to calculate the per capita EF of 24 nations using data from 123 nations. The calculation accuracy was measured by average absolute error and average relative error. They were 0.004883 and 0.351078% respectively. Our results demonstrate that the EF model based on SVM has good calculation performance.

  15. Machine Learning Methods for Attack Detection in the Smart Grid.

    PubMed

    Ozay, Mete; Esnaola, Inaki; Yarman Vural, Fatos Tunay; Kulkarni, Sanjeev R; Poor, H Vincent

    2016-08-01

    Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semisupervised) are employed with decision- and feature-level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than attack detection algorithms that employ state vector estimation methods in the proposed attack detection framework.

  16. Gender classification of running subjects using full-body kinematics

    NASA Astrophysics Data System (ADS)

    Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.

    2016-05-01

    This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, and nonlinear support vector machines, and decision-stump with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.

  17. Rare events modeling with support vector machine: Application to forecasting large-amplitude geomagnetic substorms and extreme events in financial markets.

    NASA Astrophysics Data System (ADS)

    Gavrishchaka, V. V.; Ganguli, S. B.

    2001-12-01

    Reliable forecasting of rare events in a complex dynamical system is a challenging problem that is important for many practical applications. Due to the nature of rare events, data set available for construction of the statistical and/or machine learning model is often very limited and incomplete. Therefore many widely used approaches including such robust algorithms as neural networks can easily become inadequate for rare events prediction. Moreover in many practical cases models with high-dimensional inputs are required. This limits applications of the existing rare event modeling techniques (e.g., extreme value theory) that focus on univariate cases. These approaches are not easily extended to multivariate cases. Support vector machine (SVM) is a machine learning system that can provide an optimal generalization using very limited and incomplete training data sets and can efficiently handle high-dimensional data. These features may allow to use SVM to model rare events in some applications. We have applied SVM-based system to the problem of large-amplitude substorm prediction and extreme event forecasting in stock and currency exchange markets. Encouraging preliminary results will be presented and other possible applications of the system will be discussed.

  18. Biomarkers of Eating Disorders Using Support Vector Machine Analysis of Structural Neuroimaging Data: Preliminary Results

    PubMed Central

    Cerasa, Antonio; Castiglioni, Isabella; Salvatore, Christian; Funaro, Angela; Martino, Iolanda; Alfano, Stefania; Donzuso, Giulia; Perrotta, Paolo; Gioia, Maria Cecilia; Gilardi, Maria Carla; Quattrone, Aldo

    2015-01-01

    Presently, there are no valid biomarkers to identify individuals with eating disorders (ED). The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM) technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa) were compared against 17 body mass index-matched healthy controls (HC). Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice. PMID:26648660

  19. CNN universal machine as classificaton platform: an art-like clustering algorithm.

    PubMed

    Bálya, David

    2003-12-01

    Fast and robust classification of feature vectors is a crucial task in a number of real-time systems. A cellular neural/nonlinear network universal machine (CNN-UM) can be very efficient as a feature detector. The next step is to post-process the results for object recognition. This paper shows how a robust classification scheme based on adaptive resonance theory (ART) can be mapped to the CNN-UM. Moreover, this mapping is general enough to include different types of feed-forward neural networks. The designed analogic CNN algorithm is capable of classifying the extracted feature vectors keeping the advantages of the ART networks, such as robust, plastic and fault-tolerant behaviors. An analogic algorithm is presented for unsupervised classification with tunable sensitivity and automatic new class creation. The algorithm is extended for supervised classification. The presented binary feature vector classification is implemented on the existing standard CNN-UM chips for fast classification. The experimental evaluation shows promising performance after 100% accuracy on the training set.

  20. Efficient boundary hunting via vector quantization

    NASA Astrophysics Data System (ADS)

    Diamantini, Claudia; Panti, Maurizio

    2001-03-01

    A great amount of information about a classification problem is contained in those instances falling near the decision boundary. This intuition dates back to the earliest studies in pattern recognition, and in the more recent adaptive approaches to the so called boundary hunting, such as the work of Aha et alii on Instance Based Learning and the work of Vapnik et alii on Support Vector Machines. The last work is of particular interest, since theoretical and experimental results ensure the accuracy of boundary reconstruction. However, its optimization approach has heavy computational and memory requirements, which limits its application on huge amounts of data. In the paper we describe an alternative approach to boundary hunting based on adaptive labeled quantization architectures. The adaptation is performed by a stochastic gradient algorithm for the minimization of the error probability. Error probability minimization guarantees the accurate approximation of the optimal decision boundary, while the use of a stochastic gradient algorithm defines an efficient method to reach such approximation. In the paper comparisons to Support Vector Machines are considered.

  1. Universal Parameter Measurement and Sensorless Vector Control of Induction and Permanent Magnet Synchronous Motors

    NASA Astrophysics Data System (ADS)

    Yamamoto, Shu; Ara, Takahiro

    Recently, induction motors (IMs) and permanent-magnet synchronous motors (PMSMs) have been used in various industrial drive systems. The features of the hardware device used for controlling the adjustable-speed drive in these motors are almost identical. Despite this, different techniques are generally used for parameter measurement and speed-sensorless control of these motors. If the same technique can be used for parameter measurement and sensorless control, a highly versatile adjustable-speed-drive system can be realized. In this paper, the authors describe a new universal sensorless control technique for both IMs and PMSMs (including salient pole and nonsalient pole machines). A mathematical model applicable for IMs and PMSMs is discussed. Using this model, the authors derive the proposed universal sensorless vector control algorithm on the basis of estimation of the stator flux linkage vector. All the electrical motor parameters are determined by a unified test procedure. The proposed method is implemented on three test machines. The actual driving test results demonstrate the validity of the proposed method.

  2. Support vector machine firefly algorithm based optimization of lens system.

    PubMed

    Shamshirband, Shahaboddin; Petković, Dalibor; Pavlović, Nenad T; Ch, Sudheer; Altameem, Torki A; Gani, Abdullah

    2015-01-01

    Lens system design is an important factor in image quality. The main aspect of the lens system design methodology is the optimization procedure. Since optimization is a complex, nonlinear task, soft computing optimization algorithms can be used. There are many tools that can be employed to measure optical performance, but the spot diagram is the most useful. The spot diagram gives an indication of the image of a point object. In this paper, the spot size radius is considered an optimization criterion. Intelligent soft computing scheme support vector machines (SVMs) coupled with the firefly algorithm (FFA) are implemented. The performance of the proposed estimators is confirmed with the simulation results. The result of the proposed SVM-FFA model has been compared with support vector regression (SVR), artificial neural networks, and generic programming methods. The results show that the SVM-FFA model performs more accurately than the other methodologies. Therefore, SVM-FFA can be used as an efficient soft computing technique in the optimization of lens system designs.

  3. Color image segmentation with support vector machines: applications to road signs detection.

    PubMed

    Cyganek, Bogusław

    2008-08-01

    In this paper we propose efficient color segmentation method which is based on the Support Vector Machine classifier operating in a one-class mode. The method has been developed especially for the road signs recognition system, although it can be used in other applications. The main advantage of the proposed method comes from the fact that the segmentation of characteristic colors is performed not in the original but in the higher dimensional feature space. By this a better data encapsulation with a linear hypersphere can be usually achieved. Moreover, the classifier does not try to capture the whole distribution of the input data which is often difficult to achieve. Instead, the characteristic data samples, called support vectors, are selected which allow construction of the tightest hypersphere that encloses majority of the input data. Then classification of a test data simply consists in a measurement of its distance to a centre of the found hypersphere. The experimental results show high accuracy and speed of the proposed method.

  4. The Performance of the NAS HSPs in 1st Half of 1994

    NASA Technical Reports Server (NTRS)

    Bergeron, Robert J.; Walter, Howard (Technical Monitor)

    1995-01-01

    During the first six months of 1994, the NAS (National Airspace System) 16-CPU Y-MP C90 Von Neumann (VN) delivered an average throughput of 4.045 GFLOPS while the ACSF (Aeronautics Consolidated Supercomputer Facility) 8-CPU Y-MP C90 Eagle averaged 1.658 GFLOPS. The VN rate represents a machine efficiency of 26.3% whereas the Eagle rate corresponds to a machine efficiency of 21.6%. VN displayed a greater efficiency than Eagle primarily because the stronger workload demand for its CPU cycles allowed it to devote more time to user programs and less time to idle. An additional factor increasing VN efficiency was the ability of the UNICOS 8.0 Operating System to deliver a larger fraction of CPU time to user programs. Although measurements indicate increasing vector length for both workloads, insufficient vector lengths continue to hinder HSP (High Speed Processor) performance. To improve HSP performance, NAS should continue to encourage the HSP users to modify their codes to increase program vector length.

  5. Sparse Solutions for Single Class SVMs: A Bi-Criterion Approach

    NASA Technical Reports Server (NTRS)

    Das, Santanu; Oza, Nikunj C.

    2011-01-01

    In this paper we propose an innovative learning algorithm - a variation of One-class nu Support Vector Machines (SVMs) learning algorithm to produce sparser solutions with much reduced computational complexities. The proposed technique returns an approximate solution, nearly as good as the solution set obtained by the classical approach, by minimizing the original risk function along with a regularization term. We introduce a bi-criterion optimization that helps guide the search towards the optimal set in much reduced time. The outcome of the proposed learning technique was compared with the benchmark one-class Support Vector machines algorithm which more often leads to solutions with redundant support vectors. Through out the analysis, the problem size for both optimization routines was kept consistent. We have tested the proposed algorithm on a variety of data sources under different conditions to demonstrate the effectiveness. In all cases the proposed algorithm closely preserves the accuracy of standard one-class nu SVMs while reducing both training time and test time by several factors.

  6. Automatic sleep staging using multi-dimensional feature extraction and multi-kernel fuzzy support vector machine.

    PubMed

    Zhang, Yanjun; Zhang, Xiangmin; Liu, Wenhui; Luo, Yuxi; Yu, Enjia; Zou, Keju; Liu, Xiaoliang

    2014-01-01

    This paper employed the clinical Polysomnographic (PSG) data, mainly including all-night Electroencephalogram (EEG), Electrooculogram (EOG) and Electromyogram (EMG) signals of subjects, and adopted the American Academy of Sleep Medicine (AASM) clinical staging manual as standards to realize automatic sleep staging. Authors extracted eighteen different features of EEG, EOG and EMG in time domains and frequency domains to construct the vectors according to the existing literatures as well as clinical experience. By adopting sleep samples self-learning, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM) were learned and the multi-kernel FSVM (MK-FSVM) was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.

  7. Extraction and classification of 3D objects from volumetric CT data

    NASA Astrophysics Data System (ADS)

    Song, Samuel M.; Kwon, Junghyun; Ely, Austin; Enyeart, John; Johnson, Chad; Lee, Jongkyu; Kim, Namho; Boyd, Douglas P.

    2016-05-01

    We propose an Automatic Threat Detection (ATD) algorithm for Explosive Detection System (EDS) using our multistage Segmentation Carving (SC) followed by Support Vector Machine (SVM) classifier. The multi-stage Segmentation and Carving (SC) step extracts all suspect 3-D objects. The feature vector is then constructed for all extracted objects and the feature vector is classified by the Support Vector Machine (SVM) previously learned using a set of ground truth threat and benign objects. The learned SVM classifier has shown to be effective in classification of different types of threat materials. The proposed ATD algorithm robustly deals with CT data that are prone to artifacts due to scatter, beam hardening as well as other systematic idiosyncrasies of the CT data. Furthermore, the proposed ATD algorithm is amenable for including newly emerging threat materials as well as for accommodating data from newly developing sensor technologies. Efficacy of the proposed ATD algorithm with the SVM classifier is demonstrated by the Receiver Operating Characteristics (ROC) curve that relates Probability of Detection (PD) as a function of Probability of False Alarm (PFA). The tests performed using CT data of passenger bags shows excellent performance characteristics.

  8. Sparse kernel methods for high-dimensional survival data.

    PubMed

    Evers, Ludger; Messow, Claudia-Martina

    2008-07-15

    Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques however are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model however yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where-akin to support vector classification-the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.

  9. Vector processing efficiency of plasma MHD codes by use of the FACOM 230-75 APU

    NASA Astrophysics Data System (ADS)

    Matsuura, T.; Tanaka, Y.; Naraoka, K.; Takizuka, T.; Tsunematsu, T.; Tokuda, S.; Azumi, M.; Kurita, G.; Takeda, T.

    1982-06-01

    In the framework of pipelined vector architecture, the efficiency of vector processing is assessed with respect to plasma MHD codes in nuclear fusion research. By using a vector processor, the FACOM 230-75 APU, the limit of the enhancement factor due to parallelism of current vector machines is examined for three numerical codes based on a fluid model. Reasonable speed-up factors of approximately 6,6 and 4 times faster than the highly optimized scalar version are obtained for ERATO (linear stability code), AEOLUS-R1 (nonlinear stability code) and APOLLO (1-1/2D transport code), respectively. Problems of the pipelined vector processors are discussed from the viewpoint of restructuring, optimization and choice of algorithms. In conclusion, the important concept of "concurrency within pipelined parallelism" is emphasized.

  10. UArizona at the CLEF eRisk 2017 Pilot Task: Linear and Recurrent Models for Early Depression Detection

    PubMed Central

    Sadeque, Farig; Xu, Dongfang; Bethard, Steven

    2017-01-01

    The 2017 CLEF eRisk pilot task focuses on automatically detecting depression as early as possible from a users’ posts to Reddit. In this paper we present the techniques employed for the University of Arizona team’s participation in this early risk detection shared task. We leveraged external information beyond the small training set, including a preexisting depression lexicon and concepts from the Unified Medical Language System as features. For prediction, we used both sequential (recurrent neural network) and non-sequential (support vector machine) models. Our models perform decently on the test data, and the recurrent neural models perform better than the non-sequential support vector machines while using the same feature sets. PMID:29075167

  11. Analysis of miRNA expression profile based on SVM algorithm

    NASA Astrophysics Data System (ADS)

    Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian

    2018-05-01

    Based on mirna expression spectrum data set, a new data mining algorithm - tSVM - KNN (t statistic with support vector machine - k nearest neighbor) is proposed. the idea of the algorithm is: firstly, the feature selection of the data set is carried out by the unified measurement method; Secondly, SVM - KNN algorithm, which combines support vector machine (SVM) and k - nearest neighbor (k - nearest neighbor) is used as classifier. Simulation results show that SVM - KNN algorithm has better classification ability than SVM and KNN alone. Tsvm - KNN algorithm only needs 5 mirnas to obtain 96.08 % classification accuracy in terms of the number of mirna " tags" and recognition accuracy. compared with similar algorithms, tsvm - KNN algorithm has obvious advantages.

  12. Application of Support Vector Machine to Forex Monitoring

    NASA Astrophysics Data System (ADS)

    Kamruzzaman, Joarder; Sarker, Ruhul A.

    Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.

  13. Automatic EEG artifact removal: a weighted support vector machine approach with error correction.

    PubMed

    Shao, Shi-Yun; Shen, Kai-Quan; Ong, Chong Jin; Wilder-Smith, Einar P V; Li, Xiao-Ping

    2009-02-01

    An automatic electroencephalogram (EEG) artifact removal method is presented in this paper. Compared to past methods, it has two unique features: 1) a weighted version of support vector machine formulation that handles the inherent unbalanced nature of component classification and 2) the ability to accommodate structural information typically found in component classification. The advantages of the proposed method are demonstrated on real-life EEG recordings with comparisons made to several benchmark methods. Results show that the proposed method is preferable to the other methods in the context of artifact removal by achieving a better tradeoff between removing artifacts and preserving inherent brain activities. Qualitative evaluation of the reconstructed EEG epochs also demonstrates that after artifact removal inherent brain activities are largely preserved.

  14. Bayesian anomaly detection in monitoring data applying relevance vector machine

    NASA Astrophysics Data System (ADS)

    Saito, Tomoo

    2011-04-01

    A method for automatically classifying the monitoring data into two categories, normal and anomaly, is developed in order to remove anomalous data included in the enormous amount of monitoring data, applying the relevance vector machine (RVM) to a probabilistic discriminative model with basis functions and their weight parameters whose posterior PDF (probabilistic density function) conditional on the learning data set is given by Bayes' theorem. The proposed framework is applied to actual monitoring data sets containing some anomalous data collected at two buildings in Tokyo, Japan, which shows that the trained models discriminate anomalous data from normal data very clearly, giving high probabilities of being normal to normal data and low probabilities of being normal to anomalous data.

  15. Rapid authentication of adulteration of olive oil by near-infrared spectroscopy using support vector machines

    NASA Astrophysics Data System (ADS)

    Wu, Jingzhu; Dong, Jingjing; Dong, Wenfei; Chen, Yan; Liu, Cuiling

    2016-10-01

    A classification method of support vector machines with linear kernel was employed to authenticate genuine olive oil based on near-infrared spectroscopy. There were three types of adulteration of olive oil experimented in the study. The adulterated oil was respectively soybean oil, rapeseed oil and the mixture of soybean and rapeseed oil. The average recognition rate of second experiment was more than 90% and that of the third experiment was reach to 100%. The results showed the method had good performance in classifying genuine olive oil and the adulteration with small variation range of adulterated concentration and it was a promising and rapid technique for the detection of oil adulteration and fraud in the food industry.

  16. Classifying low-grade and high-grade bladder cancer using label-free serum surface-enhanced Raman spectroscopy and support vector machine

    NASA Astrophysics Data System (ADS)

    Zhang, Yanjiao; Lai, Xiaoping; Zeng, Qiuyao; Li, Linfang; Lin, Lin; Li, Shaoxin; Liu, Zhiming; Su, Chengkang; Qi, Minni; Guo, Zhouyi

    2018-03-01

    This study aims to classify low-grade and high-grade bladder cancer (BC) patients using serum surface-enhanced Raman scattering (SERS) spectra and support vector machine (SVM) algorithms. Serum SERS spectra are acquired from 88 serum samples with silver nanoparticles as the SERS-active substrate. Diagnostic accuracies of 96.4% and 95.4% are obtained when differentiating the serum SERS spectra of all BC patients versus normal subjects and low-grade versus high-grade BC patients, respectively, with optimal SVM classifier models. This study demonstrates that the serum SERS technique combined with SVM has great potential to noninvasively detect and classify high-grade and low-grade BC patients.

  17. Support vector machine and mel frequency Cepstral coefficient based algorithm for hand gestures and bidirectional speech to text device

    NASA Astrophysics Data System (ADS)

    Balbin, Jessie R.; Padilla, Dionis A.; Fausto, Janette C.; Vergara, Ernesto M.; Garcia, Ramon G.; Delos Angeles, Bethsedea Joy S.; Dizon, Neil John A.; Mardo, Mark Kevin N.

    2017-02-01

    This research is about translating series of hand gesture to form a word and produce its equivalent sound on how it is read and said in Filipino accent using Support Vector Machine and Mel Frequency Cepstral Coefficient analysis. The concept is to detect Filipino speech input and translate the spoken words to their text form in Filipino. This study is trying to help the Filipino deaf community to impart their thoughts through the use of hand gestures and be able to communicate to people who do not know how to read hand gestures. This also helps literate deaf to simply read the spoken words relayed to them using the Filipino speech to text system.

  18. Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Lawi, Armin; Sya'Rani Machrizzandi, M.

    2018-03-01

    Facial expression is one of behavior characteristics of human-being. The use of biometrics technology system with facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of facial expression analysis system are face detection, face image extraction, facial classification and facial expressions recognition. This paper uses Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used for the classification process of facial expression. The result of MELS-SVM model obtained from our 185 different expression images of 10 persons showed high accuracy level of 99.998% using RBF kernel.

  19. Financial Distress Prediction using Linear Discriminant Analysis and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Santoso, Noviyanti; Wibowo, Wahyu

    2018-03-01

    A financial difficulty is the early stages before the bankruptcy. Bankruptcies caused by the financial distress can be seen from the financial statements of the company. The ability to predict financial distress became an important research topic because it can provide early warning for the company. In addition, predicting financial distress is also beneficial for investors and creditors. This research will be made the prediction model of financial distress at industrial companies in Indonesia by comparing the performance of Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) combined with variable selection technique. The result of this research is prediction model based on hybrid Stepwise-SVM obtains better balance among fitting ability, generalization ability and model stability than the other models.

  20. Evaluation of a panel of 28 biomarkers for the non-invasive diagnosis of endometriosis.

    PubMed

    Vodolazkaia, A; El-Aalamat, Y; Popovic, D; Mihalyi, A; Bossuyt, X; Kyama, C M; Fassbender, A; Bokor, A; Schols, D; Huskens, D; Meuleman, C; Peeraer, K; Tomassetti, C; Gevaert, O; Waelkens, E; Kasran, A; De Moor, B; D'Hooghe, T M

    2012-09-01

    At present, the only way to conclusively diagnose endometriosis is laparoscopic inspection, preferably with histological confirmation. This contributes to the delay in the diagnosis of endometriosis which is 6-11 years. So far non-invasive diagnostic approaches such as ultrasound (US), MRI or blood tests do not have sufficient diagnostic power. Our aim was to develop and validate a non-invasive diagnostic test with a high sensitivity (80% or more) for symptomatic endometriosis patients, without US evidence of endometriosis, since this is the group most in need of a non-invasive test. A total of 28 inflammatory and non-inflammatory plasma biomarkers were measured in 353 EDTA plasma samples collected at surgery from 121 controls without endometriosis at laparoscopy and from 232 women with endometriosis (minimal-mild n = 148; moderate-severe n = 84), including 175 women without preoperative US evidence of endometriosis. Surgery was done during menstrual (n = 83), follicular (n = 135) and luteal (n = 135) phases of the menstrual cycle. For analysis, the data were randomly divided into an independent training (n = 235) and a test (n = 118) data set. Statistical analysis was done using univariate and multivariate (logistic regression and least squares support vector machines (LS-SVM) approaches in training- and test data set separately to validate our findings. In the training set, two models of four biomarkers (Model 1: annexin V, VEGF, CA-125 and glycodelin; Model 2: annexin V, VEGF, CA-125 and sICAM-1) analysed in plasma, obtained during the menstrual phase, could predict US-negative endometriosis with a high sensitivity (81-90%) and an acceptable specificity (68-81%). The same two models predicted US-negative endometriosis in the independent validation test set with a high sensitivity (82%) and an acceptable specificity (63-75%). In plasma samples obtained during menstruation, multivariate analysis of four biomarkers (annexin V, VEGF, CA-125 and sICAM-1/or glycodelin) enabled the diagnosis of endometriosis undetectable by US with a sensitivity of 81-90% and a specificity of 63-81% in independent training- and test data set. The next step is to apply these models for preoperative prediction of endometriosis in an independent set of patients with infertility and/or pain without US evidence of endometriosis, scheduled for laparoscopy.

  1. [Detection of Hawthorn Fruit Defects Using Hyperspectral Imaging].

    PubMed

    Liu, De-hua; Zhang, Shu-juan; Wang, Bin; Yu, Ke-qiang; Zhao, Yan-ru; He, Yong

    2015-11-01

    Hyperspectral imaging technology covered the range of 380-1000 nm was employed to detect defects (bruise and insect damage) of hawthorn fruit. A total of 134 samples were collected, which included damage fruit of 46, pest fruit of 30, injure and pest fruit of 10 and intact fruit of 48. Because calyx · s⁻¹ tem-end and bruise/insect damage regions offered a similar appearance characteristic in RGB images, which could produce easily confusion between them. Hence, five types of defects including bruise, insect damage, sound, calyx, and stem-end were collected from 230 hawthorn fruits. After acquiring hyperspectral images of hawthorn fruits, the spectral data were extracted from region of interest (ROI). Then, several pretreatment methods of standard normalized variate (SNV), savitzky golay (SG), median filter (MF) and multiplicative scatter correction (MSC) were used and partial least squares method(PLS) model was carried out to obtain the better performance. Accordingly to their results, SNV pretreatment methods assessed by PLS was viewed as best pretreatment method. Lastly, SNV was chosen as the pretreatment method. Spectral features of five different regions were combined with Regression coefficients(RCs) of partial least squares-discriminant analysis (PLS-DA) model was used to identify the important wavelengths and ten wavebands at 483, 563, 645, 671, 686, 722, 777, 819, 837 and 942 nm were selected from all of the wavebands. Using Kennard-Stone algorithm, all kinds of samples were randomly divided into training set (173) and test set (57) according to the proportion of 3:1. And then, least squares-support vector machine (LS-SVM) discriminate model was established by using the selected wavebands. The results showed that the discriminate accuracy of the method was 91.23%. In the other hand, images at ten important wavebands were executed to Principal component analysis (PCA). Using "Sobel" operator and region growing algrorithm "Regiongrow", the edge and defect feature of 86 Hawthorn could be recognized. Lastly, the detect precision of bruised, insect damage and two-defect samples is 95.65%, 86.67% and 100%, respectively. This investigation demonstrated that hyperspectral imaging technology could detect the defects of bruise, insect damage, calyx, and stem-end in hawthorn fruit in qualitative analysis and feature detection which provided a theoretical reference for the defects nondestructive detection of hawthorn fruit.

  2. Ship localization in Santa Barbara Channel using machine learning classifiers.

    PubMed

    Niu, Haiqiang; Ozanich, Emma; Gerstoft, Peter

    2017-11-01

    Machine learning classifiers are shown to outperform conventional matched field processing for a deep water (600 m depth) ocean acoustic-based ship range estimation problem in the Santa Barbara Channel Experiment when limited environmental information is known. Recordings of three different ships of opportunity on a vertical array were used as training and test data for the feed-forward neural network and support vector machine classifiers, demonstrating the feasibility of machine learning methods to locate unseen sources. The classifiers perform well up to 10 km range whereas the conventional matched field processing fails at about 4 km range without accurate environmental information.

  3. The reflection of evolving bearing faults in the stator current's extended park vector approach for induction machines

    NASA Astrophysics Data System (ADS)

    Corne, Bram; Vervisch, Bram; Derammelaere, Stijn; Knockaert, Jos; Desmet, Jan

    2018-07-01

    Stator current analysis has the potential of becoming the most cost-effective condition monitoring technology regarding electric rotating machinery. Since both electrical and mechanical faults are detected by inexpensive and robust current-sensors, measuring current is advantageous on other techniques such as vibration, acoustic or temperature analysis. However, this technology is struggling to breach into the market of condition monitoring as the electrical interpretation of mechanical machine-problems is highly complicated. Recently, the authors built a test-rig which facilitates the emulation of several representative mechanical faults on an 11 kW induction machine with high accuracy and reproducibility. Operating this test-rig, the stator current of the induction machine under test can be analyzed while mechanical faults are emulated. Furthermore, while emulating, the fault-severity can be manipulated adaptively under controllable environmental conditions. This creates the opportunity of examining the relation between the magnitude of the well-known current fault components and the corresponding fault-severity. This paper presents the emulation of evolving bearing faults and their reflection in the Extended Park Vector Approach for the 11 kW induction machine under test. The results confirm the strong relation between the bearing faults and the stator current fault components in both identification and fault-severity. Conclusively, stator current analysis increases reliability in the application as a complete, robust, on-line condition monitoring technology.

  4. A machine learning model with human cognitive biases capable of learning from small and biased datasets.

    PubMed

    Taniguchi, Hidetaka; Sato, Hiroshi; Shirakawa, Tomohiro

    2018-05-09

    Human learners can generalize a new concept from a small number of samples. In contrast, conventional machine learning methods require large amounts of data to address the same types of problems. Humans have cognitive biases that promote fast learning. Here, we developed a method to reduce the gap between human beings and machines in this type of inference by utilizing cognitive biases. We implemented a human cognitive model into machine learning algorithms and compared their performance with the currently most popular methods, naïve Bayes, support vector machine, neural networks, logistic regression and random forests. We focused on the task of spam classification, which has been studied for a long time in the field of machine learning and often requires a large amount of data to obtain high accuracy. Our models achieved superior performance with small and biased samples in comparison with other representative machine learning methods.

  5. Scaling Support Vector Machines On Modern HPC Platforms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    You, Yang; Fu, Haohuan; Song, Shuaiwen

    2015-02-01

    We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.

  6. On the use of feature selection to improve the detection of sea oil spills in SAR images

    NASA Astrophysics Data System (ADS)

    Mera, David; Bolon-Canedo, Veronica; Cotos, J. M.; Alonso-Betanzos, Amparo

    2017-03-01

    Fast and effective oil spill detection systems are crucial to ensure a proper response to environmental emergencies caused by hydrocarbon pollution on the ocean's surface. Typically, these systems uncover not only oil spills, but also a high number of look-alikes. The feature extraction is a critical and computationally intensive phase where each detected dark spot is independently examined. Traditionally, detection systems use an arbitrary set of features to discriminate between oil spills and look-alikes phenomena. However, Feature Selection (FS) methods based on Machine Learning (ML) have proved to be very useful in real domains for enhancing the generalization capabilities of the classifiers, while discarding the existing irrelevant features. In this work, we present a generic and systematic approach, based on FS methods, for choosing a concise and relevant set of features to improve the oil spill detection systems. We have compared five FS methods: Correlation-based feature selection (CFS), Consistency-based filter, Information Gain, ReliefF and Recursive Feature Elimination for Support Vector Machine (SVM-RFE). They were applied on a 141-input vector composed of features from a collection of outstanding studies. Selected features were validated via a Support Vector Machine (SVM) classifier and the results were compared with previous works. Test experiments revealed that the classifier trained with the 6-input feature vector proposed by SVM-RFE achieved the best accuracy and Cohen's kappa coefficient (87.1% and 74.06% respectively). This is a smaller feature combination with similar or even better classification accuracy than previous works. The presented finding allows to speed up the feature extraction phase without reducing the classifier accuracy. Experiments also confirmed the significance of the geometrical features since 75.0% of the different features selected by the applied FS methods as well as 66.67% of the proposed 6-input feature vector belong to this category.

  7. Progressive Classification Using Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user can halt this reclassification process at any point, thereby obtaining the best possible result for a given amount of computation time. Alternatively, the results can be displayed as they are generated, providing the user with real-time feedback about the current accuracy of classification.

  8. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043

  9. Improving protein-protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model.

    PubMed

    An, Ji-Yong; Meng, Fan-Rong; You, Zhu-Hong; Chen, Xing; Yan, Gui-Ying; Hu, Ji-Pu

    2016-10-01

    Predicting protein-protein interactions (PPIs) is a challenging task and essential to construct the protein interaction networks, which is important for facilitating our understanding of the mechanisms of biological systems. Although a number of high-throughput technologies have been proposed to predict PPIs, there are unavoidable shortcomings, including high cost, time intensity, and inherently high false positive rates. For these reasons, many computational methods have been proposed for predicting PPIs. However, the problem is still far from being solved. In this article, we propose a novel computational method called RVM-BiGP that combines the relevance vector machine (RVM) model and Bi-gram Probabilities (BiGP) for PPIs detection from protein sequences. The major improvement includes (1) Protein sequences are represented using the Bi-gram probabilities (BiGP) feature representation on a Position Specific Scoring Matrix (PSSM), in which the protein evolutionary information is contained; (2) For reducing the influence of noise, the Principal Component Analysis (PCA) method is used to reduce the dimension of BiGP vector; (3) The powerful and robust Relevance Vector Machine (RVM) algorithm is used for classification. Five-fold cross-validation experiments executed on yeast and Helicobacter pylori datasets, which achieved very high accuracies of 94.57 and 90.57%, respectively. Experimental results are significantly better than previous methods. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-BiGP method is significantly better than the SVM-based method. In addition, we achieved 97.15% accuracy on imbalance yeast dataset, which is higher than that of balance yeast dataset. The promising experimental results show the efficiency and robust of the proposed method, which can be an automatic decision support tool for future proteomics research. For facilitating extensive studies for future proteomics research, we developed a freely available web server called RVM-BiGP-PPIs in Hypertext Preprocessor (PHP) for predicting PPIs. The web server including source code and the datasets are available at http://219.219.62.123:8888/BiGP/. © 2016 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  10. Recognition of acute lymphoblastic leukemia cells in microscopic images using k-means clustering and support vector machine classifier.

    PubMed

    Amin, Morteza Moradi; Kermani, Saeed; Talebi, Ardeshir; Oghli, Mostafa Ghelich

    2015-01-01

    Acute lymphoblastic leukemia is the most common form of pediatric cancer which is categorized into three L1, L2, and L3 and could be detected through screening of blood and bone marrow smears by pathologists. Due to being time-consuming and tediousness of the procedure, a computer-based system is acquired for convenient detection of Acute lymphoblastic leukemia. Microscopic images are acquired from blood and bone marrow smears of patients with Acute lymphoblastic leukemia and normal cases. After applying image preprocessing, cells nuclei are segmented by k-means algorithm. Then geometric and statistical features are extracted from nuclei and finally these cells are classified to cancerous and noncancerous cells by means of support vector machine classifier with 10-fold cross validation. These cells are also classified into their sub-types by multi-Support vector machine classifier. Classifier is evaluated by these parameters: Sensitivity, specificity, and accuracy which values for cancerous and noncancerous cells 98%, 95%, and 97%, respectively. These parameters are also used for evaluation of cell sub-types which values in mean 84.3%, 97.3%, and 95.6%, respectively. The results show that proposed algorithm could achieve an acceptable performance for the diagnosis of Acute lymphoblastic leukemia and its sub-types and can be used as an assistant diagnostic tool for pathologists.

  11. Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

    NASA Astrophysics Data System (ADS)

    Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

    2017-07-01

    In order to improve the reliability of the spectrum feature extracted by wavelet transform, a method combining wavelet transform (WT) with bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. Besides, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel and rapid non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples of five different kinds of pesticide residues on the surface of lettuce were obtained using Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending), Savitzky-Golay coupled with Standard normalized variable detrending (SG-SNV detrending) were used to preprocess the raw spectra, respectively. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. Moreover, WTC were selected by WT. The results showed that the accuracy of training set, calibration set and the prediction set of the best optimal classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish diagnostic model of different kinds of pesticide residues on lettuce leaves.

  12. Optical Coherence Tomography Machine Learning Classifiers for Glaucoma Detection: A Preliminary Study

    PubMed Central

    Burgansky-Eliash, Zvia; Wollstein, Gadi; Chu, Tianjiao; Ramsey, Joseph D.; Glymour, Clark; Noecker, Robert J.; Ishikawa, Hiroshi; Schuman, Joel S.

    2007-01-01

    Purpose Machine-learning classifiers are trained computerized systems with the ability to detect the relationship between multiple input parameters and a diagnosis. The present study investigated whether the use of machine-learning classifiers improves optical coherence tomography (OCT) glaucoma detection. Methods Forty-seven patients with glaucoma (47 eyes) and 42 healthy subjects (42 eyes) were included in this cross-sectional study. Of the glaucoma patients, 27 had early disease (visual field mean deviation [MD] ≥ −6 dB) and 20 had advanced glaucoma (MD < −6 dB). Machine-learning classifiers were trained to discriminate between glaucomatous and healthy eyes using parameters derived from OCT output. The classifiers were trained with all 38 parameters as well as with only 8 parameters that correlated best with the visual field MD. Five classifiers were tested: linear discriminant analysis, support vector machine, recursive partitioning and regression tree, generalized linear model, and generalized additive model. For the last two classifiers, a backward feature selection was used to find the minimal number of parameters that resulted in the best and most simple prediction. The cross-validated receiver operating characteristic (ROC) curve and accuracies were calculated. Results The largest area under the ROC curve (AROC) for glaucoma detection was achieved with the support vector machine using eight parameters (0.981). The sensitivity at 80% and 95% specificity was 97.9% and 92.5%, respectively. This classifier also performed best when judged by cross-validated accuracy (0.966). The best classification between early glaucoma and advanced glaucoma was obtained with the generalized additive model using only three parameters (AROC = 0.854). Conclusions Automated machine classifiers of OCT data might be useful for enhancing the utility of this technology for detecting glaucomatous abnormality. PMID:16249492

  13. Classifying injury narratives of large administrative databases for surveillance-A practical approach combining machine learning ensembles and human review.

    PubMed

    Marucci-Wellman, Helen R; Corns, Helen L; Lehto, Mark R

    2017-01-01

    Injury narratives are now available real time and include useful information for injury surveillance and prevention. However, manual classification of the cause or events leading to injury found in large batches of narratives, such as workers compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naïve Bayes, Single word and Bi-gram models, Support Vector Machine and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness event leading to injury classifications for a large workers compensation database. These algorithms are known to do well classifying narrative text and are fairly easy to implement with off-the-shelf software packages such as Python. We propose human-machine learning ensemble approaches which maximize the power and accuracy of the algorithms for machine-assigned codes and allow for strategic filtering of rare, emerging or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier vs. agreement between algorithms. Regularized Logistic Regression (LR) was the best performing algorithm alone. Using this algorithm and filtering out the bottom 30% of predictions for manual review resulted in high accuracy (overall sensitivity/positive predictive value of 0.89) of the final machine-human coded dataset. The best pairings of algorithms included Naïve Bayes with Support Vector Machine whereby the triple ensemble NB SW =NB BI-GRAM =SVM had very high performance (0.93 overall sensitivity/positive predictive value and high accuracy (i.e. high sensitivity and positive predictive values)) across both large and small categories leaving 41% of the narratives for manual review. Integrating LR into this ensemble mix improved performance only slightly. For large administrative datasets we propose incorporation of methods based on human-machine pairings such as we have done here, utilizing readily-available off-the-shelf machine learning techniques and resulting in only a fraction of narratives that require manual review. Human-machine ensemble methods are likely to improve performance over total manual coding. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.

  14. Mechanical Fault Diagnosis of High Voltage Circuit Breakers Based on Variational Mode Decomposition and Multi-Layer Classifier.

    PubMed

    Huang, Nantian; Chen, Huaijin; Cai, Guowei; Fang, Lihua; Wang, Yuqiang

    2016-11-10

    Mechanical fault diagnosis of high-voltage circuit breakers (HVCBs) based on vibration signal analysis is one of the most significant issues in improving the reliability and reducing the outage cost for power systems. The limitation of training samples and types of machine faults in HVCBs causes the existing mechanical fault diagnostic methods to recognize new types of machine faults easily without training samples as either a normal condition or a wrong fault type. A new mechanical fault diagnosis method for HVCBs based on variational mode decomposition (VMD) and multi-layer classifier (MLC) is proposed to improve the accuracy of fault diagnosis. First, HVCB vibration signals during operation are measured using an acceleration sensor. Second, a VMD algorithm is used to decompose the vibration signals into several intrinsic mode functions (IMFs). The IMF matrix is divided into submatrices to compute the local singular values (LSV). The maximum singular values of each submatrix are selected as the feature vectors for fault diagnosis. Finally, a MLC composed of two one-class support vector machines (OCSVMs) and a support vector machine (SVM) is constructed to identify the fault type. Two layers of independent OCSVM are adopted to distinguish normal or fault conditions with known or unknown fault types, respectively. On this basis, SVM recognizes the specific fault type. Real diagnostic experiments are conducted with a real SF₆ HVCB with normal and fault states. Three different faults (i.e., jam fault of the iron core, looseness of the base screw, and poor lubrication of the connecting lever) are simulated in a field experiment on a real HVCB to test the feasibility of the proposed method. Results show that the classification accuracy of the new method is superior to other traditional methods.

  15. Mechanical Fault Diagnosis of High Voltage Circuit Breakers Based on Variational Mode Decomposition and Multi-Layer Classifier

    PubMed Central

    Huang, Nantian; Chen, Huaijin; Cai, Guowei; Fang, Lihua; Wang, Yuqiang

    2016-01-01

    Mechanical fault diagnosis of high-voltage circuit breakers (HVCBs) based on vibration signal analysis is one of the most significant issues in improving the reliability and reducing the outage cost for power systems. The limitation of training samples and types of machine faults in HVCBs causes the existing mechanical fault diagnostic methods to recognize new types of machine faults easily without training samples as either a normal condition or a wrong fault type. A new mechanical fault diagnosis method for HVCBs based on variational mode decomposition (VMD) and multi-layer classifier (MLC) is proposed to improve the accuracy of fault diagnosis. First, HVCB vibration signals during operation are measured using an acceleration sensor. Second, a VMD algorithm is used to decompose the vibration signals into several intrinsic mode functions (IMFs). The IMF matrix is divided into submatrices to compute the local singular values (LSV). The maximum singular values of each submatrix are selected as the feature vectors for fault diagnosis. Finally, a MLC composed of two one-class support vector machines (OCSVMs) and a support vector machine (SVM) is constructed to identify the fault type. Two layers of independent OCSVM are adopted to distinguish normal or fault conditions with known or unknown fault types, respectively. On this basis, SVM recognizes the specific fault type. Real diagnostic experiments are conducted with a real SF6 HVCB with normal and fault states. Three different faults (i.e., jam fault of the iron core, looseness of the base screw, and poor lubrication of the connecting lever) are simulated in a field experiment on a real HVCB to test the feasibility of the proposed method. Results show that the classification accuracy of the new method is superior to other traditional methods. PMID:27834902

  16. Detection of Alzheimer's Disease by Three-Dimensional Displacement Field Estimation in Structural Magnetic Resonance Imaging.

    PubMed

    Wang, Shuihua; Zhang, Yudong; Liu, Ge; Phillips, Preetha; Yuan, Ti-Fei

    2016-01-01

    Within the past decade, computer scientists have developed many methods using computer vision and machine learning techniques to detect Alzheimer's disease (AD) in its early stages. However, some of these methods are unable to achieve excellent detection accuracy, and several other methods are unable to locate AD-related regions. Hence, our goal was to develop a novel AD brain detection method. In this study, our method was based on the three-dimensional (3D) displacement-field (DF) estimation between subjects in the healthy elder control group and AD group. The 3D-DF was treated with AD-related features. The three feature selection measures were used in the Bhattacharyya distance, Student's t-test, and Welch's t-test (WTT). Two non-parallel support vector machines, i.e., generalized eigenvalue proximal support vector machine and twin support vector machine (TSVM), were then used for classification. A 50 × 10-fold cross validation was implemented for statistical analysis. The results showed that "3D-DF+WTT+TSVM" achieved the best performance, with an accuracy of 93.05 ± 2.18, a sensitivity of 92.57 ± 3.80, a specificity of 93.18 ± 3.35, and a precision of 79.51 ± 2.86. This method also exceled in 13 state-of-the-art approaches. Additionally, we were able to detect 17 regions related to AD by using the pure computer-vision technique. These regions include sub-gyral, inferior parietal lobule, precuneus, angular gyrus, lingual gyrus, supramarginal gyrus, postcentral gyrus, third ventricle, superior parietal lobule, thalamus, middle temporal gyrus, precentral gyrus, superior temporal gyrus, superior occipital gyrus, cingulate gyrus, culmen, and insula. These regions were reported in recent publications. The 3D-DF is effective in AD subject and related region detection.

  17. Highly accurate prediction of protein self-interactions by incorporating the average block and PSSM information into the general PseAAC.

    PubMed

    Zhai, Jing-Xuan; Cao, Tian-Jie; An, Ji-Yong; Bian, Yong-Tao

    2017-11-07

    It is a challenging task for fundamental research whether proteins can interact with their partners. Protein self-interaction (SIP) is a special case of PPIs, which plays a key role in the regulation of cellular functions. Due to the limitations of experimental self-interaction identification, it is very important to develop an effective biological tool for predicting SIPs based on protein sequences. In the study, we developed a novel computational method called RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) for detecting SIPs from protein sequences. Firstly, Average Blocks (AB) feature extraction method is employed to represent protein sequences on a Position Specific Scoring Matrix (PSSM). Secondly, Principal Component Analysis (PCA) method is used to reduce the dimension of AB vector for reducing the influence of noise. Then, by employing the Relevance Vector Machine (RVM) algorithm, the performance of RVM-AB is assessed and compared with the state-of-the-art support vector machine (SVM) classifier and other exiting methods on yeast and human datasets respectively. Using the fivefold test experiment, RVM-AB model achieved very high accuracies of 93.01% and 97.72% on yeast and human datasets respectively, which are significantly better than the method based on SVM classifier and other previous methods. The experimental results proved that the RVM-AB prediction model is efficient and robust. It can be an automatic decision support tool for detecting SIPs. For facilitating extensive studies for future proteomics research, the RVMAB server is freely available for academic use at http://219.219.62.123:8888/SIP_AB. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Pulmonary Nodule Recognition Based on Multiple Kernel Learning Support Vector Machine-PSO

    PubMed Central

    Zhu, Zhichuan; Zhao, Qingdong; Liu, Liwei; Zhang, Lijuan

    2018-01-01

    Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the algorithm of Multiple Kernel Learning Support Vector Machine (MKL-SVM) has achieved good results therein. Based on grid search, however, the MKL-SVM algorithm needs long optimization time in course of parameter optimization; also its identification accuracy depends on the fineness of grid. In the paper, swarm intelligence is introduced and the Particle Swarm Optimization (PSO) is combined with MKL-SVM algorithm to be MKL-SVM-PSO algorithm so as to realize global optimization of parameters rapidly. In order to obtain the global optimal solution, different inertia weights such as constant inertia weight, linear inertia weight, and nonlinear inertia weight are applied to pulmonary nodules recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only 1/7 of the training time of the MKL-SVM grid search algorithm, achieving better recognition effect. Moreover, Euclidean norm of normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. Through statistical analysis of the average of 20 times operation results with different inertial weights, it can be seen that the dynamic inertial weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. In the dynamic inertial weight algorithm, the parameter optimization time of nonlinear inertia weight is shorter; the average fitness value after convergence is much closer to the optimal fitness value, which is better than the linear inertial weight. Besides, a better nonlinear inertial weight is verified. PMID:29853983

  19. Pulmonary Nodule Recognition Based on Multiple Kernel Learning Support Vector Machine-PSO.

    PubMed

    Li, Yang; Zhu, Zhichuan; Hou, Alin; Zhao, Qingdong; Liu, Liwei; Zhang, Lijuan

    2018-01-01

    Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the algorithm of Multiple Kernel Learning Support Vector Machine (MKL-SVM) has achieved good results therein. Based on grid search, however, the MKL-SVM algorithm needs long optimization time in course of parameter optimization; also its identification accuracy depends on the fineness of grid. In the paper, swarm intelligence is introduced and the Particle Swarm Optimization (PSO) is combined with MKL-SVM algorithm to be MKL-SVM-PSO algorithm so as to realize global optimization of parameters rapidly. In order to obtain the global optimal solution, different inertia weights such as constant inertia weight, linear inertia weight, and nonlinear inertia weight are applied to pulmonary nodules recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only 1/7 of the training time of the MKL-SVM grid search algorithm, achieving better recognition effect. Moreover, Euclidean norm of normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. Through statistical analysis of the average of 20 times operation results with different inertial weights, it can be seen that the dynamic inertial weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. In the dynamic inertial weight algorithm, the parameter optimization time of nonlinear inertia weight is shorter; the average fitness value after convergence is much closer to the optimal fitness value, which is better than the linear inertial weight. Besides, a better nonlinear inertial weight is verified.

  20. Improved Online Support Vector Machines Spam Filtering Using String Kernels

    NASA Astrophysics Data System (ADS)

    Amayri, Ola; Bouguila, Nizar

    A major bottleneck in electronic communications is the enormous dissemination of spam emails. Developing of suitable filters that can adequately capture those emails and achieve high performance rate become a main concern. Support vector machines (SVMs) have made a large contribution to the development of spam email filtering. Based on SVMs, the crucial problems in email classification are feature mapping of input emails and the choice of the kernels. In this paper, we present thorough investigation of several distance-based kernels and propose the use of string kernels and prove its efficiency in blocking spam emails. We detail a feature mapping variants in text classification (TC) that yield improved performance for the standard SVMs in filtering task. Furthermore, to cope for realtime scenarios we propose an online active framework for spam filtering.

  1. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    NASA Astrophysics Data System (ADS)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.

  2. MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    You, Yang; Song, Shuaiwen; Fu, Haohuan

    2014-08-16

    Support Vector Machine (SVM) has been widely used in data-mining and Big Data applications as modern commercial databases start to attach an increasing importance to the analytic capabilities. In recent years, SVM was adapted to the field of High Performance Computing for power/performance prediction, auto-tuning, and runtime scheduling. However, even at the risk of losing prediction accuracy due to insufficient runtime information, researchers can only afford to apply offline model training to avoid significant runtime training overhead. To address the challenges above, we designed and implemented MICSVM, a highly efficient parallel SVM for x86 based multi-core and many core architectures,more » such as the Intel Ivy Bridge CPUs and Intel Xeon Phi coprocessor (MIC).« less

  3. Detection of License Plate using Sliding Window, Histogram of Oriented Gradient, and Support Vector Machines Method

    NASA Astrophysics Data System (ADS)

    Astawa, INGA; Gusti Ngurah Bagus Caturbawa, I.; Made Sajayasa, I.; Dwi Suta Atmaja, I. Made Ari

    2018-01-01

    The license plate recognition usually used as part of system such as parking system. License plate detection considered as the most important step in the license plate recognition system. We propose methods that can be used to detect the vehicle plate on mobile phone. In this paper, we used Sliding Window, Histogram of Oriented Gradient (HOG), and Support Vector Machines (SVM) method to license plate detection so it will increase the detection level even though the image is not in a good quality. The image proceed by Sliding Window method in order to find plate position. Feature extraction in every window movement had been done by HOG and SVM method. Good result had shown in this research, which is 96% of accuracy.

  4. Ischemic stroke lesion segmentation in multi-spectral MR images with support vector machine classifiers

    NASA Astrophysics Data System (ADS)

    Maier, Oskar; Wilms, Matthias; von der Gablentz, Janina; Krämer, Ulrike; Handels, Heinz

    2014-03-01

    Automatic segmentation of ischemic stroke lesions in magnetic resonance (MR) images is important in clinical practice and for neuroscientific trials. The key problem is to detect largely inhomogeneous regions of varying sizes, shapes and locations. We present a stroke lesion segmentation method based on local features extracted from multi-spectral MR data that are selected to model a human observer's discrimination criteria. A support vector machine classifier is trained on expert-segmented examples and then used to classify formerly unseen images. Leave-one-out cross validation on eight datasets with lesions of varying appearances is performed, showing our method to compare favourably with other published approaches in terms of accuracy and robustness. Furthermore, we compare a number of feature selectors and closely examine each feature's and MR sequence's contribution.

  5. Online Artifact Removal for Brain-Computer Interfaces Using Support Vector Machines and Blind Source Separation

    PubMed Central

    Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang

    2007-01-01

    We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method. PMID:18288259

  6. Online artifact removal for brain-computer interfaces using support vector machines and blind source separation.

    PubMed

    Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang

    2007-01-01

    We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method.

  7. A Scatter-Based Prototype Framework and Multi-Class Extension of Support Vector Machines

    PubMed Central

    Jenssen, Robert; Kloft, Marius; Zien, Alexander; Sonnenburg, Sören; Müller, Klaus-Robert

    2012-01-01

    We provide a novel interpretation of the dual of support vector machines (SVMs) in terms of scatter with respect to class prototypes and their mean. As a key contribution, we extend this framework to multiple classes, providing a new joint Scatter SVM algorithm, at the level of its binary counterpart in the number of optimization variables. This enables us to implement computationally efficient solvers based on sequential minimal and chunking optimization. As a further contribution, the primal problem formulation is developed in terms of regularized risk minimization and the hinge loss, revealing the score function to be used in the actual classification of test patterns. We investigate Scatter SVM properties related to generalization ability, computational efficiency, sparsity and sensitivity maps, and report promising results. PMID:23118845

  8. Integrating image quality in 2nu-SVM biometric match score fusion.

    PubMed

    Vatsa, Mayank; Singh, Richa; Noore, Afzel

    2007-10-01

    This paper proposes an intelligent 2nu-support vector machine based match score fusion algorithm to improve the performance of face and iris recognition by integrating the quality of images. The proposed algorithm applies redundant discrete wavelet transform to evaluate the underlying linear and non-linear features present in the image. A composite quality score is computed to determine the extent of smoothness, sharpness, noise, and other pertinent features present in each subband of the image. The match score and the corresponding quality score of an image are fused using 2nu-support vector machine to improve the verification performance. The proposed algorithm is experimentally validated using the FERET face database and the CASIA iris database. The verification performance and statistical evaluation show that the proposed algorithm outperforms existing fusion algorithms.

  9. Software tool for data mining and its applications

    NASA Astrophysics Data System (ADS)

    Yang, Jie; Ye, Chenzhou; Chen, Nianyi

    2002-03-01

    A software tool for data mining is introduced, which integrates pattern recognition (PCA, Fisher, clustering, hyperenvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, Hyper Envelop, support vector machine, visualization. The principle and knowledge representation of some function models of data mining are described. The software tool of data mining is realized by Visual C++ under Windows 2000. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The software tool of data mining has satisfactorily applied in the prediction of regularities of the formation of ternary intermetallic compounds in alloy systems, and diagnosis of brain glioma.

  10. Segmentation of mosaicism in cervicographic images using support vector machines

    NASA Astrophysics Data System (ADS)

    Xue, Zhiyun; Long, L. Rodney; Antani, Sameer; Jeronimo, Jose; Thoma, George R.

    2009-02-01

    The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating a large digital repository of cervicographic images for the study of uterine cervix cancer prevention. One of the research goals is to automatically detect diagnostic bio-markers in these images. Reliable bio-marker segmentation in large biomedical image collections is a challenging task due to the large variation in image appearance. Methods described in this paper focus on segmenting mosaicism, which is an important vascular feature used to visually assess the degree of cervical intraepithelial neoplasia. The proposed approach uses support vector machines (SVM) trained on a ground truth dataset annotated by medical experts (which circumvents the need for vascular structure extraction). We have evaluated the performance of the proposed algorithm and experimentally demonstrated its feasibility.

  11. Improved Saturated Hydraulic Conductivity Pedotransfer Functions Using Machine Learning Methods

    NASA Astrophysics Data System (ADS)

    Araya, S. N.; Ghezzehei, T. A.

    2017-12-01

    Saturated hydraulic conductivity (Ks) is one of the fundamental hydraulic properties of soils. Its measurement, however, is cumbersome and instead pedotransfer functions (PTFs) are often used to estimate it. Despite a lot of progress over the years, generic PTFs that estimate hydraulic conductivity generally don't have a good performance. We develop significantly improved PTFs by applying state of the art machine learning techniques coupled with high-performance computing on a large database of over 20,000 soils—USKSAT and the Florida Soil Characterization databases. We compared the performance of four machine learning algorithms (k-nearest neighbors, gradient boosted model, support vector machine, and relevance vector machine) and evaluated the relative importance of several soil properties in explaining Ks. An attempt is also made to better account for soil structural properties; we evaluated the importance of variables derived from transformations of soil water retention characteristics and other soil properties. The gradient boosted models gave the best performance with root mean square errors less than 0.7 and mean errors in the order of 0.01 on a log scale of Ks [cm/h]. The effective particle size, D10, was found to be the single most important predictor. Other important predictors included percent clay, bulk density, organic carbon percent, coefficient of uniformity and values derived from water retention characteristics. Model performances were consistently better for Ks values greater than 10 cm/h. This study maximizes the extraction of information from a large database to develop generic machine learning based PTFs to estimate Ks. The study also evaluates the importance of various soil properties and their transformations in explaining Ks.

  12. Methods, systems and apparatus for adjusting duty cycle of pulse width modulated (PWM) waveforms

    DOEpatents

    Gallegos-Lopez, Gabriel; Kinoshita, Michael H; Ransom, Ray M; Perisic, Milun

    2013-05-21

    Embodiments of the present invention relate to methods, systems and apparatus for controlling operation of a multi-phase machine in a vector controlled motor drive system when the multi-phase machine operates in an overmodulation region. The disclosed embodiments provide a mechanism for adjusting a duty cycle of PWM waveforms so that the correct phase voltage command signals are applied at the angle transitions. This can reduce variations/errors in the phase voltage command signals applied to the multi-phase machine so that phase current may be properly regulated thus reducing current/torque oscillation, which can in turn improve machine efficiency and performance, as well as utilization of the DC voltage source.

  13. Classification of large-sized hyperspectral imagery using fast machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Xia, Junshi; Yokoya, Naoto; Iwasaki, Akira

    2017-07-01

    We present a framework of fast machine learning algorithms in the context of large-sized hyperspectral images classification from the theoretical to a practical viewpoint. In particular, we assess the performance of random forest (RF), rotation forest (RoF), and extreme learning machine (ELM) and the ensembles of RF and ELM. These classifiers are applied to two large-sized hyperspectral images and compared to the support vector machines. To give the quantitative analysis, we pay attention to comparing these methods when working with high input dimensions and a limited/sufficient training set. Moreover, other important issues such as the computational cost and robustness against the noise are also discussed.

  14. Contributions a l'etude et a l'application industrielle de la machine asynchrone

    NASA Astrophysics Data System (ADS)

    Ouhrouche, Mohand-Ameziane

    The work presented in this thesis, done in the Electrical Drives Laboratory of Electrical and Computer Engineering Department, deals with the industrial applications of a three-phase induction machine (electrical drives and electricity generation). This thesis, characterized by its multidisciplinary content, has two major parts. The first one deals with the on-line and off-line parametric identification of the induction machine model necessary to achieve accurate vector control strategy. The second part, which is a resume of a research work sponsored by Hydro-Quebec, deals with the application of an induction machine in Asynchronous Non Utility Generators units (ANUG). As it is shown in the following, major scientific contributions are made in both two parts. In the first part of our research work, we propose a new speed sensorless vector control strategy for an induction machine, which is adaptive to the rotor resistance variations. The proposed control strategy is based on the Extended Kalman Filter approach and a decoupling controller which takes into account the rotor resistance variations. The consideration of coupled electrical and mechanical modes leads to a fifth order nonlinear model of the induction machine. The load torque is taken as a function of the rotor angular speed. The Extended Kalman Filter, based on the process's nonlinear (bilinear) model, estimate simultaneously the rotor resistance, angular speed and the flux vector from the startup to the steady state equilibrium point. The machine-converter-control system is implemented in MATLAB/SIMULINK environment and the obtained results confirm the robustness of the proposed scheme. As in the electrical drives erea, the induction machine is now widely used by small to medium power Non Utility Generator units (NUG) to produce electricity. In Quebec, these NUGs units are integrated into the Hydro-Quebec 25 kV distribution system via transformer which exhibit nonlinear characteristics. We have shown by using the ElectroMagnetic Program (EMTP) that, in some islanding scenarios, i.e. that the NUG unit is disconnected from the power grid, in addition to frequency variations, appearence of high an abnormal overvoltages, ferroresonance should occur. As a consequence, normal protective devices could fail to securely operate, which could cause serious damages to the equipment and the maintenance staff. This result, established for the first time , can be useful to improve the reliability of the NUGs units and is considered important by the power engineering community. This has led to a publication in the John Wiley & Sons Encyclopedia of Electrical and Electronics Engineering which will be available in February 1999 ( http://www.engr.wisc.edu/ ece/ece).

  15. Implementation of a parallel unstructured Euler solver on the CM-5

    NASA Technical Reports Server (NTRS)

    Morano, Eric; Mavriplis, D. J.

    1995-01-01

    An efficient unstructured 3D Euler solver is parallelized on a Thinking Machine Corporation Connection Machine 5, distributed memory computer with vectoring capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.

  16. Agent Based Computing Machine

    DTIC Science & Technology

    2005-12-09

    decision making logic that respond to the environment (concentration of operands - the state vector), and bias or "mood" as established by its history of...mentioned in the chart, there is no need for file management in a ABC Machine. Information is distributed, no history is maintained. The instruction set... Postgresql ) for collection of cluster samples/snapshots over intervals of time. An prototypical example of an XML file to configure and launch the ABC

  17. Supervised Time Series Event Detector for Building Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-04-13

    A machine learning based approach is developed to detect events that have rarely been seen in the historical data. The data can include building energy consumption, sensor data, environmental data and any data that may affect the building's energy consumption. The algorithm is a modified nonlinear Bayesian support vector machine, which examines daily energy consumption profile, detect the days with abnormal events, and diagnose the cause of the events.

  18. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    NASA Technical Reports Server (NTRS)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis to distributed computing. The intent is first to provide some guidance and directions in the rapidly increasing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors, as well as high massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience made at the NAS facility at NASA Ames Research Center. Over the last five years at NAS massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside with traditional vector supercomputers such as the Cray Y-MP and C90.

  19. Use seismic colored inversion and power law committee machine based on imperial competitive algorithm for improving porosity prediction in a heterogeneous reservoir

    NASA Astrophysics Data System (ADS)

    Ansari, Hamid Reza

    2014-09-01

    In this paper we propose a new method for predicting rock porosity based on a combination of several artificial intelligence systems. The method focuses on one of the Iranian carbonate fields in the Persian Gulf. Because there is strong heterogeneity in carbonate formations, estimation of rock properties experiences more challenge than sandstone. For this purpose, seismic colored inversion (SCI) and a new approach of committee machine are used in order to improve porosity estimation. The study comprises three major steps. First, a series of sample-based attributes is calculated from 3D seismic volume. Acoustic impedance is an important attribute that is obtained by the SCI method in this study. Second, porosity log is predicted from seismic attributes using common intelligent computation systems including: probabilistic neural network (PNN), radial basis function network (RBFN), multi-layer feed forward network (MLFN), ε-support vector regression (ε-SVR) and adaptive neuro-fuzzy inference system (ANFIS). Finally, a power law committee machine (PLCM) is constructed based on imperial competitive algorithm (ICA) to combine the results of all previous predictions in a single solution. This technique is called PLCM-ICA in this paper. The results show that PLCM-ICA model improved the results of neural networks, support vector machine and neuro-fuzzy system.

  20. On the sparseness of 1-norm support vector machines.

    PubMed

    Zhang, Li; Zhou, Weida

    2010-04-01

    There is some empirical evidence available showing that 1-norm Support Vector Machines (1-norm SVMs) have good sparseness; however, both how good sparseness 1-norm SVMs can reach and whether they have a sparser representation than that of standard SVMs are not clear. In this paper we take into account the sparseness of 1-norm SVMs. Two upper bounds on the number of nonzero coefficients in the decision function of 1-norm SVMs are presented. First, the number of nonzero coefficients in 1-norm SVMs is at most equal to the number of only the exact support vectors lying on the +1 and -1 discriminating surfaces, while that in standard SVMs is equal to the number of support vectors, which implies that 1-norm SVMs have better sparseness than that of standard SVMs. Second, the number of nonzero coefficients is at most equal to the rank of the sample matrix. A brief review of the geometry of linear programming and the primal steepest edge pricing simplex method are given, which allows us to provide the proof of the two upper bounds and evaluate their tightness by experiments. Experimental results on toy data sets and the UCI data sets illustrate our analysis. Copyright 2009 Elsevier Ltd. All rights reserved.

Top