methods support vector: Topics by Science.gov

Sample records for methods support vector

Research on bearing fault diagnosis of large machinery based on mathematical morphology

NASA Astrophysics Data System (ADS)

Wang, Yu

2018-04-01

To study the automatic diagnosis of faults in large machinery based on the support vector machine (SVM), an SVM is used to classify and identify four common faults of large machinery. The extracted feature vectors are used as input, and they are trained and identified with a multi-class classification method. The optimal SVM parameters are searched by trial and error and by cross validation. The SVM is then compared with a BP neural network. The results show that the SVM takes less time and achieves higher classification accuracy, making it more suitable for fault diagnosis in large machinery. It is therefore concluded that SVM training is fast and its performance is good.
A Two-Layer Least Squares Support Vector Machine Approach to Credit Risk Assessment

NASA Astrophysics Data System (ADS)

Liu, Jingli; Li, Jianping; Xu, Weixuan; Shi, Yong

Least squares support vector machine (LS-SVM) is a revised version of support vector machine (SVM) and has proved to be a useful tool for pattern recognition. LS-SVM has excellent generalization performance and low computational cost. In this paper, we propose a new method called two-layer least squares support vector machine, which combines kernel principal component analysis (KPCA) and a linear programming form of the least squares support vector machine. With this method, sparseness and robustness are obtained when handling high-dimensional, large-scale databases. A U.S. commercial credit card database is used to test the efficiency of our method, and the results proved satisfactory.
A new method for the prediction of chatter stability lobes based on dynamic cutting force simulation model and support vector machine

NASA Astrophysics Data System (ADS)

Peng, Chong; Wang, Lun; Liao, T. Warren

2015-10-01

Currently, chatter has become the critical factor in hindering machining quality and productivity in machining processes. To avoid cutting chatter, a new method based on dynamic cutting force simulation model and support vector machine (SVM) is presented for the prediction of chatter stability lobes. The cutting force is selected as the monitoring signal, and the wavelet energy entropy theory is used to extract the feature vectors. A support vector machine is constructed using the MATLAB LIBSVM toolbox for pattern classification based on the feature vectors derived from the experimental cutting data. Then combining with the dynamic cutting force simulation model, the stability lobes diagram (SLD) can be estimated. Finally, the predicted results are compared with existing methods such as zero-order analytical (ZOA) and semi-discretization (SD) method as well as actual cutting experimental results to confirm the validity of this new method.
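The feature-extraction step in this record (wavelet energy entropy of the cutting force signal) can be sketched as follows. This is a minimal illustration under stated assumptions: the Haar wavelet, the three decomposition levels, and the names `haar_step` and `wavelet_energy_entropy` are choices made here, not details taken from the paper, which used the MATLAB LIBSVM toolbox for the classification stage.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_energy_entropy(signal, levels=3):
    """Return (band energies, Shannon entropy of the relative band energies).

    Assumes len(signal) is divisible by 2**levels and the signal is not all zeros.
    """
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)          # detail band at this level
    bands.append(approx)              # final approximation band
    energies = [sum(v * v for v in band) for band in bands]
    total = sum(energies)
    probs = [e / total for e in energies if e > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return energies, entropy

# Hypothetical 8-sample force signal
energies, entropy = wavelet_energy_entropy([4, 2, 5, 5, 1, 3, 0, 2], levels=3)
```

The entropy is low when the signal energy is concentrated in one frequency band and higher when it is spread across bands, which is what makes it usable as a feature for a classifier.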
Currency crisis indication by using ensembles of support vector machine classifiers

NASA Astrophysics Data System (ADS)

Ramli, Nor Azuana; Ismail, Mohd Tahir; Wooi, Hooy Chee

2014-07-01

Many methods have been tried in the analysis of currency crises. However, not all of them provide accurate indications. This paper introduces an ensemble of Support Vector Machine classifiers, which has not previously been applied in currency crisis analysis, with the aim of increasing indication accuracy. The performance of the proposed ensemble classifiers is measured using percentage of accuracy, root mean squared error (RMSE), area under the Receiver Operating Characteristic (ROC) curve and Type II error. The performance of the ensemble of Support Vector Machine classifiers is compared with that of a single Support Vector Machine classifier, and both classifiers are tested on a data set from 27 countries with 12 macroeconomic indicators for each country. Our analyses show that the ensemble of Support Vector Machine classifiers outperforms the single Support Vector Machine classifier at indicating a currency crisis across a range of standard measures for comparing classifier performance.
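The four measures named in this record can be computed from labels and classifier scores roughly as follows. This is a minimal sketch: the function name `evaluate`, the toy data, the 0.5 decision threshold, and the choice to compute RMSE against 0/1 labels are assumptions made for illustration, not details from the paper.

```python
import math

def evaluate(y_true, y_pred, scores):
    """Return (accuracy, RMSE, Type II error rate, ROC AUC) for labels in {0, 1}.

    Assumes both classes are present in y_true.
    """
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # RMSE between the predicted score and the 0/1 label
    rmse = math.sqrt(sum((s - t) ** 2 for t, s in zip(y_true, scores)) / n)
    # Type II error: fraction of true positives predicted as negative
    pos_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    type_ii = sum(p == 0 for p in pos_preds) / len(pos_preds)
    # AUC: probability that a random positive scores above a random negative
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))
    return accuracy, rmse, type_ii, auc

# Hypothetical labels (1 = crisis) and classifier scores
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
accuracy, rmse, type_ii, auc = evaluate(y_true, y_pred, scores)
```

Reporting Type II error separately matters in this setting because a missed crisis (false negative) is usually the costlier mistake, and overall accuracy can hide it.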
Multiclass Reduced-Set Support Vector Machines

NASA Technical Reports Server (NTRS)

Tang, Benyang; Mazzoni, Dominic

2006-01-01

There are well-established methods for reducing the number of support vectors in a trained binary support vector machine, often with minimal impact on accuracy. We show how reduced-set methods can be applied to multiclass SVMs made up of several binary SVMs, with significantly better results than reducing each binary SVM independently. Our approach is based on Burges' approach that constructs each reduced-set vector as the pre-image of a vector in kernel space, but we extend this by recomputing the SVM weights and bias optimally using the original SVM objective function. This leads to greater accuracy for a binary reduced-set SVM, and also allows vectors to be 'shared' between multiple binary SVMs for greater multiclass accuracy with fewer reduced-set vectors. We also propose computing pre-images using differential evolution, which we have found to be more robust than gradient descent alone. We show experimental results on a variety of problems and find that this new approach is consistently better than previous multiclass reduced-set methods, sometimes with a dramatic difference.
A Power Transformers Fault Diagnosis Model Based on Three DGA Ratios and PSO Optimization SVM

NASA Astrophysics Data System (ADS)

Ma, Hongzhe; Zhang, Wei; Wu, Rongrong; Yang, Chunyan

2018-03-01

To make up for the shortcomings of existing transformer fault diagnosis methods in dissolved gas-in-oil analysis (DGA) feature selection and parameter optimization, a transformer fault diagnosis model based on three DGA ratios and a support vector machine (SVM) optimized by particle swarm optimization (PSO) is proposed. The support vector machine is extended to a nonlinear, multi-class SVM, particle swarm optimization is used to optimize the multi-class SVM model, and transformer fault diagnosis is carried out in combination with the cross validation principle. The fault diagnosis results show that the average accuracy of the proposed method is better than that of the standard support vector machine and the genetic algorithm support vector machine, demonstrating that the proposed method can effectively improve the accuracy of transformer fault diagnosis.
Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.

PubMed

Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang

2010-05-07

Lysine acetylation is an essentially reversible and highly regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined acetylation lysine sites are extracted from the Swiss-Prot database and the scientific literature. Experimental results show that an ensemble of support vector machine classifiers outperforms a single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting acetylation lysine sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation site prediction available at http://www.aporc.org/EnsemblePail/. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Dual linear structured support vector machine tracking method via scale correlation filter

NASA Astrophysics Data System (ADS)

Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen

2018-01-01

Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.
Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

PubMed

Held, Elizabeth; Cape, Joshua; Tintle, Nathan

2016-01-01

Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
Support vector machine applied to predict the zoonotic potential of E. coli O157 cattle isolates

USDA-ARS's Scientific Manuscript database

Methods based on sequence data analysis facilitate the tracking of disease outbreaks, allow relationships between strains to be reconstructed and virulence factors to be identified. However, these methods are used postfactum after an outbreak has happened. Here, we show that support vector machine a...
Research on intrusion detection based on Kohonen network and support vector machine

NASA Astrophysics Data System (ADS)

Shuai, Chunyan; Yang, Hengcheng; Gong, Zeweiyi

2018-05-01

When a support vector machine (SVM) is applied directly to a network intrusion detection system, detection accuracy is low and detection time is long. Optimization of the SVM parameters can greatly improve the detection accuracy, but it cannot be applied to high-speed networks because of the long detection time. A method based on Kohonen neural network feature selection is proposed to reduce the optimization time of the SVM parameters. First, the weights of the KDD99 network intrusion data are calculated by the Kohonen network and features are selected by weight. Then, after feature selection is completed, a genetic algorithm (GA) and the grid search method are used to find appropriate parameters, and the data are classified by support vector machines. Comparative experiments show that feature selection reduces the parameter optimization time with little influence on classification accuracy. The experiments suggest that the support vector machine can be used in network intrusion detection systems and can reduce the missing rate.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

PubMed

Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

2013-01-01

The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
Testing of the Support Vector Machine for Binary-Class Classification

NASA Technical Reports Server (NTRS)

Scholten, Matthew

2011-01-01

The Support Vector Machine is a powerful algorithm, useful in classifying data into species. The Support Vector Machines implemented in this research were used as classifiers for the final stage in a Multistage Autonomous Target Recognition system. A single kernel SVM known as SVMlight, and a modified version known as a Support Vector Machine with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.
Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines

PubMed Central

2010-01-01

Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. 
Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480
Scattering transform and LSPTSVM based fault diagnosis of rotating machinery

NASA Astrophysics Data System (ADS)

Ma, Shangjun; Cheng, Bo; Shang, Zhaowei; Liu, Geng

2018-05-01

This paper proposes an algorithm for fault diagnosis of rotating machinery to overcome the shortcomings of classical techniques which are noise sensitive in feature extraction and time consuming for training. Based on the scattering transform and the least squares recursive projection twin support vector machine (LSPTSVM), the method has the advantages of high efficiency and insensitivity for noise signal. Using the energy of the scattering coefficients in each sub-band, the features of the vibration signals are obtained. Then, an LSPTSVM classifier is used for fault diagnosis. The new method is compared with other common methods including the proximal support vector machine, the standard support vector machine and multi-scale theory by using fault data for two systems, a motor bearing and a gear box. The results show that the new method proposed in this study is more effective for fault diagnosis of rotating machinery.
Fast support vector data descriptions for novelty detection.

PubMed

Liu, Yi-Hung; Liu, Yan-Chen; Chen, Yen-Jen

2010-08-01

Support vector data description (SVDD) has become a very attractive kernel method due to its good results in many novelty detection problems. However, the decision function of SVDD is expressed in terms of the kernel expansion, which results in a run-time complexity linear in the number of support vectors. For applications where fast real-time response is needed, how to speed up the decision function is crucial. This paper aims at dealing with the issue of reducing the testing time complexity of SVDD. A method called fast SVDD (F-SVDD) is proposed. Unlike the traditional methods which all try to compress a kernel expansion into one with fewer terms, the proposed F-SVDD directly finds the preimage of a feature vector, and then uses a simple relationship between this feature vector and the SVDD sphere center to re-express the center with a single vector. The decision function of F-SVDD contains only one kernel term, and thus the decision boundary of F-SVDD is only spherical in the original space. Hence, the run-time complexity of the F-SVDD decision function is no longer linear in the support vectors, but is a constant, no matter how large the training set size is. In this paper, we also propose a novel direct preimage-finding method, which is noniterative and involves no free parameters. The unique preimage can be obtained in real time by the proposed direct method without taking trial-and-error. For demonstration, several real-world data sets and a large-scale data set, the extended MIT face data set, are used in experiments. In addition, a practical industry example regarding liquid crystal display micro-defect inspection is also used to compare the applicability of SVDD and our proposed F-SVDD when faced with mass data input. The results are very encouraging.
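The run-time argument in this record can be made concrete with the standard SVDD distance test. The sketch below assumes a Gaussian kernel and writes the sphere center as a weighted sum of support vectors, so each query costs one kernel evaluation per support vector. The second function shows what a one-term test looks like if the center is re-expressed as `gamma` times the image of a single vector; `gamma`, `preimage`, and all function names are hypothetical, and the paper's actual relationship between the preimage and the center is not reproduced here.

```python
import math

def rbf(x, y, sigma=1.0):
    """Gaussian kernel between two points given as sequences of floats."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def svdd_sq_distance(z, support_vectors, alphas, sigma=1.0):
    """Squared feature-space distance from z to the center sum_i alpha_i * phi(x_i).

    The cross term needs one kernel evaluation per support vector for every query.
    """
    cross = sum(a * rbf(x, z, sigma) for a, x in zip(alphas, support_vectors))
    # This double sum does not depend on z and can be cached once after training.
    const = sum(ai * aj * rbf(xi, xj, sigma)
                for ai, xi in zip(alphas, support_vectors)
                for aj, xj in zip(alphas, support_vectors))
    return rbf(z, z, sigma) - 2.0 * cross + const

def single_term_sq_distance(z, preimage, gamma, sigma=1.0):
    """Same test when the center is assumed to equal gamma * phi(preimage)."""
    return (rbf(z, z, sigma)
            - 2.0 * gamma * rbf(preimage, z, sigma)
            + gamma ** 2 * rbf(preimage, preimage, sigma))

# Hypothetical model with two support vectors
svs = [(-1.0, 0.0), (1.0, 0.0)]
alphas = [0.5, 0.5]
d2 = svdd_sq_distance((0.0, 0.0), svs, alphas)
```

A query point is accepted as normal when its squared distance does not exceed the squared radius learned in training; the single-term form makes that check independent of the number of support vectors.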
Power line identification of millimeter wave radar based on PCA-GS-SVM

NASA Astrophysics Data System (ADS)

Fang, Fang; Zhang, Guifeng; Cheng, Yansheng

2017-12-01

To address the problem that existing detection methods cannot effectively ensure the safety of UAV ultra-low-altitude flight near power lines, a power line recognition method based on grid search (GS), principal component analysis and the support vector machine (PCA-SVM) is proposed. First, the dimensionality of the candidate lines from the Hough transform is reduced by PCA, and the main features of the candidate lines are extracted. Then, the support vector machine (SVM) is optimized by the grid search method (GS). Finally, the SVM classifier with the optimized parameters is used to classify the candidate lines. MATLAB simulation results show that this method can effectively distinguish power lines from noise, with high recognition accuracy and algorithm efficiency.
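Grid search over SVM parameters, as named in this record, can be sketched in a few lines. Everything specific here is an assumption for illustration: the linear SVM trained by full-batch subgradient descent stands in for whatever SVM the authors used, scoring each candidate by k-fold cross-validation is a common choice that the abstract does not state, and the toy data and function names are hypothetical.

```python
def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, C, lr=0.01, epochs=200):
    """Full-batch subgradient descent on 0.5*||w||^2 + C * sum of hinge losses."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # sample violates the margin
                grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1

def cv_accuracy(X, y, C, k):
    """Mean accuracy over k contiguous folds (len(X) must be divisible by k)."""
    fold = len(X) // k
    scores = []
    for f in range(k):
        test_idx = set(range(f * fold, (f + 1) * fold))
        X_tr = [x for i, x in enumerate(X) if i not in test_idx]
        y_tr = [t for i, t in enumerate(y) if i not in test_idx]
        model = train_linear_svm(X_tr, y_tr, C)
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / fold)
    return sum(scores) / k

def grid_search(X, y, C_grid, k=4):
    """Return (best_C, best_score); ties keep the earliest grid value."""
    best_C, best_score = None, -1.0
    for C in C_grid:
        score = cv_accuracy(X, y, C, k)
        if score > best_score:
            best_C, best_score = C, score
    return best_C, best_score

# Toy, linearly separable data (hypothetical); each fold holds one point per class.
X = [(2, 2), (-2, -2), (3, 3), (-3, -3), (2, 3), (-2, -3), (3, 2), (-3, -2)]
y = [1, -1, 1, -1, 1, -1, 1, -1]
best_C, best_score = grid_search(X, y, C_grid=[0.1, 1.0, 10.0])
```

On data this clean every candidate separates the classes, so the tie rule decides the winner; on real data the scores differ and the grid usually gains a second axis, such as the kernel width of an RBF SVM.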
The optional selection of micro-motion feature based on Support Vector Machine

NASA Astrophysics Data System (ADS)

Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing

2017-11-01

Targets exhibit multiple forms of micro-motion, and different micro-motion forms are apt to be modulated, which makes feature extraction and recognition difficult. Aiming at feature extraction for cone-shaped objects with different micro-motion forms, this paper proposes a method for selecting the best micro-motion features based on the support vector machine (SVM). After the time-frequency distribution of the radar echoes is obtained, the time-frequency spectra of objects with different micro-motion forms are compared, and features are extracted based on the differences between the instantaneous frequency variations of the different micro-motions. Features are extracted according to the SVM-based methods, and then the best features are acquired. Finally, the results show that the proposed method is feasible under test conditions with a certain signal-to-noise ratio (SNR).
Automatic event detection in low SNR microseismic signals based on multi-scale permutation entropy and a support vector machine

NASA Astrophysics Data System (ADS)

Jia, Rui-Sheng; Sun, Hong-Mei; Peng, Yan-Jun; Liang, Yong-Quan; Lu, Xin-Ming

2017-07-01

Microseismic monitoring is an effective means for providing early warning of rock or coal dynamical disasters, and its first step is microseismic event detection, although low SNR microseismic signals often cannot effectively be detected by routine methods. To solve this problem, this paper presents permutation entropy and a support vector machine to detect low SNR microseismic events. First, an extraction method of signal features based on multi-scale permutation entropy is proposed by studying the influence of the scale factor on the signal permutation entropy. Second, the detection model of low SNR microseismic events based on the least squares support vector machine is built by performing a multi-scale permutation entropy calculation for the collected vibration signals, constructing a feature vector set of signals. Finally, a comparative analysis of the microseismic events and noise signals in the experiment proves that the different characteristics of the two can be fully expressed by using multi-scale permutation entropy. The detection model of microseismic events combined with the support vector machine, which has the features of high classification accuracy and fast real-time algorithms, can meet the requirements of online, real-time extractions of microseismic events.
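The feature-extraction idea in this record, multi-scale permutation entropy, can be sketched as follows. This is an illustrative version under stated assumptions: non-overlapping averaging for the coarse-graining step, embedding order 3, normalization by log(order!), and all function names are choices made here. The least squares SVM that the paper trains on these features is not shown.

```python
import math

def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] of a 1-D series."""
    counts = {}
    span = (order - 1) * delay
    for i in range(len(series) - span):
        window = [series[i + k * delay] for k in range(order)]
        # Ordinal pattern: indices of the window sorted by value (ties keep order)
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] = counts.get(pattern, 0) + 1
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(order))

def multiscale_permutation_entropy(series, scales=(1, 2, 3), order=3):
    """Feature vector: one permutation entropy value per scale factor."""
    return [permutation_entropy(coarse_grain(series, s), order) for s in scales]

# Hypothetical short signal
pe = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=3)
```

A regular signal produces few distinct ordinal patterns and an entropy near 0, while noise spreads over all patterns and approaches 1, so the per-scale values give a compact vector that a classifier can separate.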
Object recognition of ladar with support vector machine

NASA Astrophysics Data System (ADS)

Sun, Jian-Feng; Li, Qi; Wang, Qi

2005-01-01

Intensity, range and Doppler images can be obtained by using laser radar. Laser radar can detect much more object information than other sensors, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as a sensor for object recognition. The traditional method of laser radar object recognition is to extract target features, which can be influenced by noise. In this paper, a laser radar recognition method based on the Support Vector Machine (SVM) is introduced. SVM has become a new hotspot of recognition research after neural networks. It performs well on handwritten digit recognition and face recognition. Two series of SVM experiments, one with preprocessed samples and one with non-preprocessed samples, are performed on real laser radar images, and the experimental results are compared.

Community detection in complex networks using proximate support vector clustering

NASA Astrophysics Data System (ADS)

Wang, Feifan; Zhang, Baihai; Chai, Senchun; Xia, Yuanqing

2018-03-01

Community structure, one of the most attention attracting properties in complex networks, has been a cornerstone in advances of various scientific branches. A number of tools have been involved in recent studies concentrating on the community detection algorithms. In this paper, we propose a support vector clustering method based on a proximity graph, owing to which the introduced algorithm surpasses the traditional support vector approach both in accuracy and complexity. Results of extensive experiments undertaken on computer generated networks and real world data sets illustrate competent performances in comparison with the other counterparts.
Bayesian Kernel Methods for Non-Gaussian Distributions: Binary and Multi-class Classification Problems

DTIC Science & Technology

2013-05-28

those of the support vector machine and relevance vector machine, and the model runs more quickly than the other algorithms. When one class occurs...incremental support vector machine algorithm for online learning when fewer than 50 data points are available. (a) Papers published in peer-reviewed journals...learning environments, where data processing occurs one observation at a time and the classification algorithm improves over time with new
Fabric wrinkle characterization and classification using modified wavelet coefficients and optimized support-vector-machine classifier

USDA-ARS's Scientific Manuscript database

This paper presents a novel wrinkle evaluation method that uses modified wavelet coefficients and an optimized support-vector-machine (SVM) classification scheme to characterize and classify wrinkle appearance of fabric. Fabric images were decomposed with the wavelet transform (WT), and five parame...
Product Quality Modelling Based on Incremental Support Vector Machine

NASA Astrophysics Data System (ADS)

Wang, J.; Zhang, W.; Qin, B.; Shi, W.

2012-05-01

Incremental support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, traditional ISVM learning does not consider the quality of the incremental data, which may contain noise and redundant data; this affects the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of the margin vectors, and margin vectors whose distance exceeds the specified value are removed; finally, the original SVs and the remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate unimportant samples such as noise samples, but also preserve the important samples. The MISVM has been tested on two public data sets and one field data set of zinc coating weight in strip hot-dip galvanizing, and the results show that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision support and analysis tools for automatic control of product quality, and can also be extended to other process industries, such as chemical and manufacturing processes.
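The sample-filtering step described in this record can be sketched for a linear SVM. The sketch rests on assumptions that go beyond the abstract: "extracted according to the KKT condition" is read here as keeping new samples with y*f(x) < 1, the distance is the usual |f(x)| / ||w||, and the function names, threshold, and data are hypothetical.

```python
import math

def decision_value(w, b, x):
    """f(x) = w . x + b for a linear SVM."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def select_margin_vectors(w, b, samples, labels, max_distance):
    """Keep new samples that violate y*f(x) >= 1 and lie close to the hyperplane.

    Samples that violate the condition but are far from the hyperplane are
    treated as likely noise and dropped.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    kept = []
    for x, y in zip(samples, labels):
        f = decision_value(w, b, x)
        violates_kkt = y * f < 1.0
        distance = abs(f) / norm
        if violates_kkt and distance <= max_distance:
            kept.append((x, y))
    return kept

# Hypothetical current model and incoming batch
w, b = (1.0, 0.0), 0.0
samples = [(0.5, 0.0), (3.0, 0.0), (-5.0, 0.0), (-0.2, 1.0)]
labels = [1, 1, 1, -1]
kept = select_margin_vectors(w, b, samples, labels, max_distance=2.0)
```

In this toy batch the second sample is already classified with a comfortable margin, and the third is on the wrong side and far from the boundary, so only the first and last are kept to be combined with the existing support vectors for retraining.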
The application of artificial neural networks and support vector regression for simultaneous spectrophotometric determination of commercial eye drop contents

NASA Astrophysics Data System (ADS)

Valizadeh, Maryam; Sohrabi, Mahmoud Reza

2018-03-01

In the present study, artificial neural networks (ANNs) and support vector regression (SVR), as intelligent methods, were coupled with UV spectroscopy for the simultaneous quantitative determination of Dorzolamide (DOR) and Timolol (TIM) in eye drops. Several synthetic mixtures were analyzed to validate the proposed methods. First, a neural network time series, which is one type of artificial neural network, was employed and its efficiency was evaluated. Afterwards, the radial basis network was applied as another neural network. Results showed that the performance of this method is suitable for prediction. Finally, support vector regression was proposed to construct the Zilomole prediction model. Root mean square error (RMSE) and mean recovery (%) were also calculated for the SVR method. Moreover, the proposed methods were compared with high-performance liquid chromatography (HPLC) as a reference method. A one-way analysis of variance (ANOVA) test at the 95% confidence level, applied to compare the results of the suggested and reference methods, showed no significant differences between them. The effect of interferences was also investigated in spiked solutions.
Prediction of hourly PM2.5 using a space-time support vector regression model

NASA Astrophysics Data System (ADS)

Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

2018-05-01

Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
Sparse kernel methods for high-dimensional survival data.

PubMed

Evers, Ludger; Messow, Claudia-Martina

2008-07-15

Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques however are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model however yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where-akin to support vector classification-the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.
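The statement that the partial likelihood depends on the covariates only through inner products can be illustrated directly. In the sketch below the linear predictor for observation i is written as sum_k alpha_k * K(x_k, x_i), so any kernel can replace the inner product. This is an assumption-laden illustration, not the authors' R package: the function names are hypothetical, tied event times are not handled specially, and the result is dense in `alphas`, which is the limitation the two proposed methods address.

```python
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def kernel_cox_loglik(alphas, X, times, events, kernel=linear_kernel):
    """Cox partial log-likelihood with predictor eta_i = sum_k alpha_k K(x_k, x_i).

    events[i] is 1 for an observed failure and 0 for a censored observation.
    """
    eta = [sum(a * kernel(xk, xi) for a, xk in zip(alphas, X)) for xi in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i:
            # Risk set: everyone still under observation at time t_i
            risk = [eta[j] for j, t_j in enumerate(times) if t_j >= t_i]
            loglik += eta[i] - math.log(sum(math.exp(e) for e in risk))
    return loglik

# Hypothetical data: one covariate, three subjects, the second one censored
X = [(1.0,), (2.0,), (0.5,)]
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
ll = kernel_cox_loglik([0.1, -0.2, 0.3], X, times, events)
```

With the linear kernel this reproduces the ordinary Cox model with coefficient vector beta = sum_k alpha_k * x_k; swapping in a nonlinear kernel changes only the `kernel` argument.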
Product demand forecasts using wavelet kernel support vector machine and particle swarm optimization in manufacture system

NASA Astrophysics Data System (ADS)

Wu, Qi

2010-03-01

Demand forecasts play a crucial role in supply chain management. The future demand for a certain product is the basis for the respective replenishment systems. For demand series characterized by small samples, seasonality, nonlinearity, randomness and fuzziness, the existing support vector kernel does not approach the random curve of the sales time series in the space (quadratic continuous integral space). In this paper, we present a hybrid intelligent system combining the wavelet kernel support vector machine and particle swarm optimization for demand forecasting. The results of an application to car sales series forecasting show that the forecasting approach based on the hybrid PSOWv-SVM model is effective and feasible. A comparison between the proposed method and other methods is also given, which shows that, for the discussed example, this method is better than hybrid PSOv-SVM and other traditional methods.
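To make the idea of a wavelet kernel concrete, here is a minimal sketch of one commonly used translation-invariant form, built as a product of Morlet-type mother wavelets over the input dimensions. The abstract does not state which wavelet its kernel uses, so the mother wavelet, the `dilation` parameter, and the sample inputs are all assumptions.

```python
import math

def morlet(u):
    """Morlet-type mother wavelet h(u) = cos(1.75 u) * exp(-u^2 / 2)."""
    return math.cos(1.75 * u) * math.exp(-0.5 * u * u)

def wavelet_kernel(x, y, dilation=1.0):
    """Translation-invariant wavelet kernel: product of h((x_i - y_i) / dilation)."""
    value = 1.0
    for xi, yi in zip(x, y):
        value *= morlet((xi - yi) / dilation)
    return value

# Hypothetical two-feature inputs (e.g. lagged, normalised sales values).
a, b = (0.2, 0.5), (0.4, 0.1)
k_same = wavelet_kernel(a, a)   # identical inputs give the kernel's maximum
k_diff = wavelet_kernel(a, b)
```

In a hybrid scheme like the one described, an optimizer such as particle swarm optimization would typically search over the dilation and the SVM's regularization settings rather than fixing them by hand.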
Vectorization and parallelization of the finite strip method for dynamic Mindlin plate problems

NASA Technical Reports Server (NTRS)

Chen, Hsin-Chu; He, Ai-Fang

1993-01-01

The finite strip method is a semi-analytical finite element process which allows for a discrete analysis of certain types of physical problems by discretizing the domain of the problem into finite strips. This method decomposes a single large problem into m smaller independent subproblems when m harmonic functions are employed, thus yielding natural parallelism at a very high level. In this paper we address vectorization and parallelization strategies for the dynamic analysis of simply-supported Mindlin plate bending problems and show how to prevent potential conflicts in memory access during the assemblage process. The vector and parallel implementations of this method and the performance results of a test problem under scalar, vector, and vector-concurrent execution modes on the Alliant FX/80 are also presented.
Analysis of programming properties and the row-column generation method for 1-norm support vector machines.

PubMed

Zhang, Li; Zhou, WeiDa

2013-12-01

This paper deals with fast methods for training a 1-norm support vector machine (SVM). First, we define a specific class of linear programming with many sparse constraints, i.e., row-column sparse constraint linear programming (RCSC-LP). By nature, the 1-norm SVM is a sort of RCSC-LP. In order to construct subproblems for RCSC-LP and solve them, a family of row-column generation (RCG) methods is introduced. RCG methods belong to a category of decomposition techniques, and perform row and column generations in a parallel fashion. Specifically, for the 1-norm SVM, the maximum size of subproblems of RCG is identical to the number of Support Vectors (SVs). We also introduce a semi-deleting rule for RCG methods and prove the convergence of RCG methods when using the semi-deleting rule. Experimental results on toy data and real-world datasets illustrate that it is efficient to use RCG to train the 1-norm SVM, especially when the number of SVs is small. Copyright © 2013 Elsevier Ltd. All rights reserved.
Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression

NASA Astrophysics Data System (ADS)

Fei, Cheng-Wei; Bai, Guang-Chen

2014-12-01

To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assembly, such as the blade-tip radial running clearance (BTRRC) of a gas turbine, a distributed collaborative probabilistic design method based on support vector machine of regression (SR), called DCSRM, is proposed by integrating the distributed collaborative response surface method and the support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of aeroengine high-pressure turbine (HPT) BTRRC is accomplished to verify the proposed DCSRM. The analysis yields the optimal static blade-tip clearance of the HPT, which is useful for designing BTRRC and improving the performance and reliability of the aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assembly and enriches mechanical reliability theory and method.
Color image segmentation with support vector machines: applications to road signs detection.

PubMed

Cyganek, Bogusław

2008-08-01

In this paper we propose an efficient color segmentation method based on the Support Vector Machine classifier operating in a one-class mode. The method has been developed especially for a road sign recognition system, although it can be used in other applications. The main advantage of the proposed method comes from the fact that the segmentation of characteristic colors is performed not in the original but in the higher dimensional feature space. In this way, a better data encapsulation with a linear hypersphere can usually be achieved. Moreover, the classifier does not try to capture the whole distribution of the input data, which is often difficult to achieve. Instead, characteristic data samples, called support vectors, are selected, which allow construction of the tightest hypersphere that encloses the majority of the input data. Classification of a test sample then simply consists of measuring its distance to the centre of the found hypersphere. The experimental results show high accuracy and speed of the proposed method.
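The final classification step, measuring a sample's distance to the hypersphere centre in feature space, can be written entirely in terms of kernel evaluations. The sketch below is an illustration under assumptions: the colour samples are invented, and uniform coefficients stand in for the values a trained one-class SVM would supply, so it shows the distance computation only, not the training.

```python
import math

def rbf(a, b, gamma=2.0):
    """Gaussian (RBF) kernel between two colour vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def squared_distance_to_centre(x, support, alphas, kernel=rbf):
    """Squared feature-space distance from x to the centre sum_i alpha_i * phi(s_i)."""
    cross = sum(a * kernel(s, x) for a, s in zip(alphas, support))
    centre_norm = sum(
        ai * aj * kernel(si, sj)
        for ai, si in zip(alphas, support)
        for aj, sj in zip(alphas, support)
    )
    return kernel(x, x) - 2.0 * cross + centre_norm

# Hypothetical "sign red" samples in normalised RGB; uniform alphas are an
# assumption standing in for trained one-class SVM coefficients.
reds = [(0.9, 0.1, 0.1), (0.8, 0.2, 0.1), (0.85, 0.1, 0.2)]
alphas = [1.0 / len(reds)] * len(reds)

d_red = squared_distance_to_centre((0.88, 0.12, 0.12), reds, alphas)
d_blue = squared_distance_to_centre((0.1, 0.2, 0.9), reds, alphas)
```

A pixel would be assigned to the colour class when its distance falls below the squared radius learned during training; here `d_red` comes out far smaller than `d_blue`.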
Application of Classification Models to Pharyngeal High-Resolution Manometry

ERIC Educational Resources Information Center

Mielens, Jason D.; Hoffman, Matthew R.; Ciucci, Michelle R.; McCulloch, Timothy M.; Jiang, Jack J.

2012-01-01

Purpose: The authors present 3 methods of performing pattern recognition on spatiotemporal plots produced by pharyngeal high-resolution manometry (HRM). Method: Classification models, including the artificial neural networks (ANNs) multilayer perceptron (MLP) and learning vector quantization (LVQ), as well as support vector machines (SVM), were…
Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences.

PubMed

An, Ji-Yong; Meng, Fan-Rong; You, Zhu-Hong; Fang, Yu-Hong; Zhao, Yu-Jun; Zhang, Ming

2016-01-01

We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM) model and Local Phase Quantization (LPQ) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.
Seminal quality prediction using data mining methods.

PubMed

Sahoo, Anoop J; Kumar, Yugal

2014-01-01

Nowadays, some new classes of diseases known as lifestyle diseases have come into existence. The main reasons behind these diseases are changes in people's lifestyle, such as alcohol drinking, smoking, food habits etc. After going through the various lifestyle diseases, it has been found that the fertility rate (sperm quantity) in men has decreased considerably in the last two decades. Lifestyle factors as well as environmental factors are mainly responsible for the change in semen quality. The objective of this paper is to identify the lifestyle and environmental features that affect seminal quality and the fertility rate in men using data mining methods. Five artificial intelligence techniques, namely Multilayer perceptron (MLP), Decision Tree (DT), Naive Bayes (Kernel), Support vector machine+Particle swarm optimization (SVM+PSO) and Support vector machine (SVM), have been applied to a fertility dataset to evaluate seminal quality and to predict whether a person is normal or has an altered fertility rate. Eight feature selection techniques, namely support vector machine (SVM), neural network (NN), evolutionary logistic regression (LR), support vector machine plus particle swarm optimization (SVM+PSO), principal component analysis (PCA), chi-square test, correlation and T-test methods, have been used to identify the more relevant features that affect seminal quality. These techniques are applied to a fertility dataset which contains 100 instances with nine attributes and two classes. The experimental results show that SVM+PSO provides higher accuracy and area under curve (AUC) (94% & 0.932) than multi-layer perceptron (MLP) (92% & 0.728), Support Vector Machines (91% & 0.758), Naive Bayes (Kernel) (89% & 0.850) and Decision Tree (89% & 0.735) for some of the seminal parameters. This paper also focuses on the feature selection process, i.e. how to select the features which are more important for prediction of fertility rate. In this paper, eight feature selection methods are applied to the fertility dataset to find a set of good features. The results show that the childhood diseases (0.079) and high fever (0.057) features have less impact on fertility rate, while the age (0.8685), season (0.843), surgical intervention (0.7683), alcohol consumption (0.5992), smoking habit (0.575), number of hours spent sitting (0.4366) and accident (0.5973) features have more impact. It is also observed that feature selection methods increase the accuracy of the above-mentioned techniques (multilayer perceptron 92%, support vector machine 91%, SVM+PSO 94%, Naive Bayes (Kernel) 89% and decision tree 89%) compared with results without feature selection (multilayer perceptron 86%, support vector machine 86%, SVM+PSO 85%, Naive Bayes (Kernel) 83% and decision tree 84%), which shows the applicability of feature selection methods in prediction. This paper highlights the application of artificial intelligence techniques in the medical domain. From this paper, it can be concluded that data mining methods can be used to predict whether a person has a disease based on environmental and lifestyle parameters/features rather than undergoing various medical tests. In this paper, five data mining techniques are used to predict the fertility rate, among which SVM+PSO provides more accurate results than support vector machine and decision tree.
Subpixel urban land cover estimation: comparing cubist, random forests, and support vector regression

Treesearch

Jeffrey T. Walton

2008-01-01

Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
[Support vector machine-assisted diagnosis of human malignant gastric tissues based on dielectric properties].

PubMed

Zhang, Sa; Li, Zhou; Xin, Xue-Gang

2017-12-20

To achieve differential diagnosis of normal and malignant gastric tissues based on discrepancies in their dielectric properties using support vector machine. The dielectric properties of normal and malignant gastric tissues at frequencies ranging from 42.58 to 500 MHz were measured by the coaxial probe method, and the Cole-Cole model was used to fit the measured data. Receiver-operating characteristic (ROC) curve analysis was used to evaluate the discrimination capability with respect to permittivity, conductivity, and Cole-Cole fitting parameters. Support vector machine was used for discriminating normal and malignant gastric tissues, and the discrimination accuracy was calculated using k-fold cross-validation. The area under the ROC curve was above 0.8 for permittivity at the 5 frequencies at the lower end of the measured frequency range. The combination of the support vector machine with the permittivity at all these 5 frequencies combined achieved the highest discrimination accuracy of 84.38% with a MATLAB runtime of 3.40 s. The support vector machine-assisted diagnosis is feasible for human malignant gastric tissues based on the dielectric properties.
Predicting complications of percutaneous coronary intervention using a novel support vector method

PubMed Central

Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

2013-01-01

Objective To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Materials and methods Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. Results The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer–Lemeshow χ2 value (seven cases) and the mean cross-entropy error (eight cases). Conclusions The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains. PMID:23599229
Research on Classification of Chinese Text Data Based on SVM

NASA Astrophysics Data System (ADS)

Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

2017-09-01

Data mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) classification method is a good classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, uses the support vector machine method to classify Chinese text, and aims to combine academic research with practical application.
Recursive feature selection with significant variables of support vectors.

PubMed

Tsai, Chen-An; Huang, Chien-Hsun; Chang, Ching-Wei; Chen, Chun-Houh

2012-01-01

The development of DNA microarrays allows researchers to screen thousands of genes simultaneously and also helps determine high- and low-expression level genes in normal and disease tissues. Selecting relevant genes for cancer classification is an important issue. Most gene selection methods use univariate ranking criteria and arbitrarily choose a threshold to select genes. However, the parameter setting may not be compatible with the selected classification algorithms. In this paper, we propose a new gene selection method (SVM-t) based on the use of t-statistics embedded in support vector machine. We compared the performance to two similar SVM-based methods: SVM recursive feature elimination (SVMRFE) and recursive support vector machine (RSVM). The three methods were compared based on extensive simulation experiments and analyses of two published microarray datasets. In the simulation experiments, we found that the proposed method is more robust in selecting informative genes than SVMRFE and RSVM and capable of attaining good classification performance when the variations of informative and noninformative genes are different. In the analysis of two microarray datasets, the proposed method yields better performance in identifying fewer genes with good prediction accuracy, compared to SVMRFE and RSVM.

Fuzzy support vector machines for adaptive Morse code recognition.

PubMed

Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh

2006-11-01

Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.
Extrapolation methods for vector sequences

NASA Technical Reports Server (NTRS)

Smith, David A.; Ford, William F.; Sidi, Avram

1987-01-01

This paper derives, describes, and compares five extrapolation methods for accelerating convergence of vector sequences or transforming divergent vector sequences to convergent ones. These methods are the scalar epsilon algorithm (SEA), vector epsilon algorithm (VEA), topological epsilon algorithm (TEA), minimal polynomial extrapolation (MPE), and reduced rank extrapolation (RRE). MPE and RRE are first derived and proven to give the exact solution for the right 'essential degree' k. Then, Brezinski's (1975) generalization of the Shanks-Schmidt transform is presented; the generalized form leads from systems of equations to TEA. The necessary connections are then made with SEA and VEA. The algorithms are extended to the nonlinear case by cycling, the error analysis for MPE and VEA is sketched, and the theoretical support for quadratic convergence is discussed. Strategies for practical implementation of the methods are considered.
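As a small illustration of the simplest of the five methods, the sketch below applies Wynn's scalar epsilon recursion to each component of a vector sequence, which is the usual meaning of SEA. The function names and the test iteration are assumptions chosen for illustration; the recursion divides by differences of successive entries, so it fails if two consecutive entries coincide exactly.

```python
def epsilon_scalar(seq):
    """Wynn's epsilon algorithm on a scalar sequence with an odd number of terms.

    Uses eps_{k+1}^{(n)} = eps_{k-1}^{(n+1)} + 1 / (eps_k^{(n+1)} - eps_k^{(n)})
    and returns the single entry of the last (even-indexed) column.
    """
    prev = [0.0] * (len(seq) + 1)   # column eps_{-1}, all zeros
    curr = list(seq)                # column eps_0, the sequence itself
    for _ in range(len(seq) - 1):
        nxt = [prev[n + 1] + 1.0 / (curr[n + 1] - curr[n])
               for n in range(len(curr) - 1)]
        prev, curr = curr, nxt
    return curr[0]

def sea(vectors):
    """Scalar epsilon algorithm: epsilon_scalar applied to each component."""
    return [epsilon_scalar(component) for component in zip(*vectors)]

def step(x):
    """Linear fixed-point iteration with a diagonal matrix; its limit is [2, 4]."""
    return [0.5 * x[0] + 1.0, 0.25 * x[1] + 3.0]

iterates = [[0.0, 0.0]]
for _ in range(2):
    iterates.append(step(iterates[-1]))
estimate = sea(iterates)   # three iterates suffice here
```

Because each component of this toy iteration converges geometrically, three iterates recover the limit up to rounding; a coupled iteration would generally need more terms, which is where the vector-aware methods in the abstract become relevant.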
Support vector machine based classification of fast Fourier transform spectroscopy of proteins

NASA Astrophysics Data System (ADS)

Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine

2009-02-01

Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.
1-norm support vector novelty detection and its sparseness.

PubMed

Zhang, Li; Zhou, WeiDa

2013-12-01

This paper proposes a 1-norm support vector novelty detection (SVND) method and discusses its sparseness. 1-norm SVND is formulated as a linear programming problem and uses two techniques for inducing sparseness, namely the 1-norm regularization and the hinge loss function. We also find two upper bounds on the sparseness of 1-norm SVND, namely the exact support vector (ESV) bound and the kernel Gram matrix rank bound. The ESV bound indicates that 1-norm SVND has a sparser representation model than SVND. The kernel Gram matrix rank bound can loosely estimate the sparseness of 1-norm SVND. Experimental results show that 1-norm SVND is feasible and effective. Copyright © 2013 Elsevier Ltd. All rights reserved.
Automatic EEG artifact removal: a weighted support vector machine approach with error correction.

PubMed

Shao, Shi-Yun; Shen, Kai-Quan; Ong, Chong Jin; Wilder-Smith, Einar P V; Li, Xiao-Ping

2009-02-01

An automatic electroencephalogram (EEG) artifact removal method is presented in this paper. Compared to past methods, it has two unique features: 1) a weighted version of support vector machine formulation that handles the inherent unbalanced nature of component classification and 2) the ability to accommodate structural information typically found in component classification. The advantages of the proposed method are demonstrated on real-life EEG recordings with comparisons made to several benchmark methods. Results show that the proposed method is preferable to the other methods in the context of artifact removal by achieving a better tradeoff between removing artifacts and preserving inherent brain activities. Qualitative evaluation of the reconstructed EEG epochs also demonstrates that after artifact removal inherent brain activities are largely preserved.
Study on vibration characteristics and fault diagnosis method of oil-immersed flat wave reactor in Arctic area converter station

NASA Astrophysics Data System (ADS)

Lai, Wenqing; Wang, Yuandong; Li, Wenpeng; Sun, Guang; Qu, Guomin; Cui, Shigang; Li, Mengke; Wang, Yongqiang

2017-10-01

Based on long-term vibration monitoring of the No.2 oil-immersed flat wave reactor in the ±500kV converter station in East Mongolia, the vibration signals in the normal state and in the core loose fault state were saved. Through time-frequency analysis of the signals, the vibration characteristics of the core loose fault were obtained, and a fault diagnosis method based on the dual tree complex wavelet (DT-CWT) and support vector machine (SVM) was proposed. The vibration signals were analyzed by DT-CWT, and the energy entropy of the vibration signals was taken as the feature vector; the support vector machine was used to train and test the feature vector, and accurate identification of the core loose fault of the flat wave reactor was realized. Through the identification of many groups of normal and core loose fault state vibration signals, the diagnostic accuracy reached 97.36%. The effectiveness and accuracy of the method in the fault diagnosis of the flat wave reactor core are verified.
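One plausible reading of the energy entropy feature is the Shannon entropy of the normalised energy distribution across wavelet sub-bands, sketched below. The exact definition used in the paper is not given in the abstract, and the coefficient lists are invented stand-ins for DT-CWT outputs, so treat this as an assumption-laden illustration.

```python
import math

def energy_entropy(subbands):
    """Shannon entropy of the normalised energy distribution across sub-bands."""
    energies = [sum(c * c for c in band) for band in subbands]
    total = sum(energies)
    shares = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in shares)

# Hypothetical coefficient sets standing in for DT-CWT sub-band outputs.
even = [[1.0, -1.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]
peaked = [[3.0, -3.0], [0.1, 0.1], [0.1, -0.1], [0.0, 0.1]]
h_even = energy_entropy(even)      # energy spread evenly over four bands
h_peaked = energy_entropy(peaked)  # energy concentrated in one band
```

An even spread of energy gives the maximum value (the logarithm of the number of bands), while concentration in a single band pushes the entropy toward zero, which is why such a quantity can separate signal states.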
A comparative study of surface EMG classification by fuzzy relevance vector machine and fuzzy support vector machine.

PubMed

Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei

2015-02-01

We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and root mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst the training time of FSVM was much shorter than that of FRVM. The results indicate that an FRVM classifier trained using sufficient samples can achieve generalization capability comparable to FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.
Feature selection using probabilistic prediction of support vector regression.

PubMed

Yang, Jian-Bo; Ong, Chong-Jin

2011-06-01

This paper presents a new wrapper-based feature selection method for support vector regression (SVR) using its probabilistic predictions. The method computes the importance of a feature by aggregating the difference, over the feature space, of the conditional density functions of the SVR prediction with and without the feature. As the exact computation of this importance measure is expensive, two approximations are proposed. The effectiveness of the measure using these approximations, in comparison to several other existing feature selection methods for SVR, is evaluated on both artificial and real-world problems. The results of the experiments show that the proposed method generally performs better than, or at least as well as, the existing methods, with a notable advantage when the dataset is sparse.
Recombinase-Mediated Cassette Exchange Using Adenoviral Vectors.

PubMed

Kolb, Andreas F; Knowles, Christopher; Pultinevicius, Patrikas; Harbottle, Jennifer A; Petrie, Linda; Robinson, Claire; Sorrell, David A

2017-01-01

Site-specific recombinases are important tools for the modification of mammalian genomes. In conjunction with viral vectors, they can be utilized to mediate site-specific gene insertions in animals and in cell lines which are difficult to transfect. Here we describe a method for the generation and analysis of an adenovirus vector supporting a recombinase-mediated cassette exchange reaction and discuss the advantages and limitations of this approach.
The maximum vector-angular margin classifier and its fast training on large datasets using a core vector machine.

PubMed

Hu, Wenjun; Chung, Fu-Lai; Wang, Shitong

2012-03-01

Although pattern classification has been extensively studied in the past decades, how to effectively solve the corresponding training on large datasets is a problem that still requires particular attention. Many kernelized classification methods, such as SVM and SVDD, can be formulated as the corresponding quadratic programming (QP) problems, but computing the associated kernel matrices requires O(n^2) (or even up to O(n^3)) computational complexity, where n is the size of the training patterns, which heavily limits the applicability of these methods for large datasets. In this paper, a new classification method called the maximum vector-angular margin classifier (MAMC) is first proposed based on the vector-angular margin to find an optimal vector c in the pattern feature space, and all the testing patterns can be classified in terms of the maximum vector-angular margin ρ, between the vector c and all the training data points. Accordingly, it is proved that the kernelized MAMC can be equivalently formulated as the kernelized Minimum Enclosing Ball (MEB), which leads to a distinctive merit of MAMC, i.e., it has the flexibility of controlling the sum of support vectors like ν-SVC and may be extended to a maximum vector-angular margin core vector machine (MAMCVM) by connecting the core vector machine (CVM) method with MAMC such that the corresponding fast training on large datasets can be effectively achieved. Experimental results on artificial and real datasets are provided to validate the power of the proposed methods. Copyright © 2011 Elsevier Ltd. All rights reserved.
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

PubMed

Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin

2017-04-01

As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only way to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depends on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. Two essential types of feature selection methods, namely wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, a classifier subset evaluator with the greedy stepwise search engine and a wrapper subset evaluator with the Best First search engine were used. In the filter approach, a correlation feature selection subset evaluator with the greedy stepwise search engine and a filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier using the filtered subset evaluator with the Best First search engine feature selection method has a higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to the other selected methods.
Recognition and Classification of Road Condition on the Basis of Friction Force by Using a Mobile Robot

NASA Astrophysics Data System (ADS)

Watanabe, Tatsuhito; Katsura, Seiichiro

A person operating a mobile robot in a remote environment receives realistic visual feedback about the condition of the road on which the robot is moving. The categorization of the road condition is necessary to evaluate the conditions for safe and comfortable driving. For this purpose, the mobile robot should be capable of recognizing and classifying the condition of the road surfaces. This paper proposes a method for recognizing the type of road surface on the basis of the friction between the mobile robot and the road surface. This friction is estimated by a disturbance observer, and a support vector machine is used to classify the surfaces. The support vector machine identifies the type of the road surface using a feature vector, which is determined using the arithmetic average and variance derived from the torque values. Further, these feature vectors are mapped onto a higher dimensional space by using a kernel function. The validity of the proposed method is confirmed by experimental results.
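The feature construction described here is simple enough to sketch directly: each window of torque values is reduced to its arithmetic average and variance, and a kernel then compares two such feature vectors. The torque readings, the window length, and the choice of a Gaussian kernel with `gamma=5.0` are assumptions; the abstract does not say which kernel function was used.

```python
import math
from statistics import mean, pvariance

def torque_features(torques):
    """Feature vector (arithmetic average, variance) of one torque window."""
    return (mean(torques), pvariance(torques))

def rbf_kernel(u, v, gamma=5.0):
    """Gaussian kernel: an inner product in an implicit higher-dimensional space."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical estimated-torque windows (arbitrary units) for two surfaces.
smooth = torque_features([0.50, 0.52, 0.49, 0.51])
rough = torque_features([0.90, 1.40, 0.70, 1.30])
similarity = rbf_kernel(smooth, rough)
```

A support vector machine trained on labelled windows would use kernel values like `similarity` to separate surface types; a value well below 1 indicates the two windows look dissimilar in feature space.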
The assisted prediction modelling frame with hybridisation and ensemble for business risk forecasting and an implementation

NASA Astrophysics Data System (ADS)

Li, Hui; Hong, Lu-Yao; Zhou, Qing; Yu, Hai-Jie

2015-08-01

The business failure of numerous companies results in financial crises. The high social costs associated with such crises have led people to search for effective tools for business risk prediction, among which the support vector machine is very effective. Several modelling means, including single-technique modelling, hybrid modelling, and ensemble modelling, have been suggested for forecasting business risk with support vector machines. However, the existing literature seldom focuses on the general modelling frame for business risk prediction, and seldom investigates performance differences among different modelling means. We reviewed research on forecasting business risk with support vector machines, proposed the general assisted prediction modelling frame with hybridisation and ensemble (APMF-WHAE), and finally investigated the use of principal components analysis, support vector machine, random sampling, and group decision under the general frame in forecasting business risk. Under the APMF-WHAE frame with the support vector machine as the base predictive model, four specific predictive models were produced, namely, a pure support vector machine, a hybrid support vector machine involving principal components analysis, a support vector machine ensemble involving random sampling and group decision, and an ensemble of hybrid support vector machines using group decision to integrate various hybrid support vector machines on variables produced from principal components analysis and samples from random sampling. The experimental results indicate that the hybrid support vector machine and the ensemble of hybrid support vector machines were able to produce better performance than the pure support vector machine and the support vector machine ensemble.
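The ensemble variant mentioned in this abstract combines random sampling with a group decision. The sketch below shows one common way to realise both steps. The abstract does not specify the sampling scheme or the decision rule, so sampling without replacement and simple majority voting are assumptions, and the function names and labels are hypothetical.

```python
import random
from collections import Counter

def random_subsets(n_samples, n_members, subset_size, seed=0):
    """Draw one subset of sample indices (without replacement) per ensemble member."""
    rng = random.Random(seed)
    return [rng.sample(range(n_samples), subset_size) for _ in range(n_members)]

def majority_vote(member_predictions):
    """Combine one predicted label per member into a single group decision."""
    return Counter(member_predictions).most_common(1)[0][0]

# Each member would be trained on its own subset of the training data.
subsets = random_subsets(n_samples=100, n_members=3, subset_size=60)

# Hypothetical member outputs for one company.
decision = majority_vote(["failure", "healthy", "failure"])
```

In a full pipeline each index subset would feed a separately trained base model, and `majority_vote` would be applied to the models' predictions for each new case.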
A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine

NASA Astrophysics Data System (ADS)

Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing

2017-09-01

Dynamically monitoring the behavior of a program is widely used to discriminate between benign programs and malware. The judgment is usually based on dynamic characteristics of a program, such as its API call sequence or API call frequency. The key innovation of this paper is to consider the full Native API sequence and use the support vector machine to detect the shellcode. We also use the Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.
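One way to turn an API call sequence into numbers with a Markov chain is to estimate first-order transition probabilities and flatten them into a fixed-length vector. The sketch below assumes this construction. The abstract does not give the exact feature definition, and the trace and three-call vocabulary are illustrative only.

```python
from collections import Counter
from itertools import product

def transition_features(calls, vocabulary):
    """Flatten the first-order Markov transition matrix of a call sequence."""
    pair_counts = Counter(zip(calls, calls[1:]))  # counts of (current, next) pairs
    out_counts = Counter(calls[:-1])              # how often each call has a successor
    features = []
    for src, dst in product(vocabulary, repeat=2):
        total = out_counts[src]
        features.append(pair_counts[(src, dst)] / total if total else 0.0)
    return features

# Hypothetical trace over a tiny vocabulary.
vocab = ["NtOpenFile", "NtReadFile", "NtClose"]
trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtClose"]
vector = transition_features(trace, vocab)
```

Every trace maps to a vector of length `len(vocab) ** 2` regardless of the trace length, which is what makes it usable as input to a support vector machine.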
Localization of U(1) gauge vector field on flat branes with five-dimension (asymptotic) AdS5 spacetime

NASA Astrophysics Data System (ADS)

Zhao, Zhen-Hua; Xie, Qun-Ying

2018-05-01

In order to localize the U(1) gauge vector field on a Randall-Sundrum-like braneworld model with an infinite extra dimension, we propose a new kind of non-minimal coupling between the U(1) gauge field and gravity. We propose three kinds of coupling methods, and they all support the localization of the zero mode. In addition, one of them can support the localization of massive modes. Moreover, the massive tachyonic modes can be excluded. Our method can be used not only in thin braneworld models but also in thick ones.
A Multi-Label Learning Based Kernel Automatic Recommendation Method for Support Vector Machine

PubMed Central

Zhang, Xueying; Song, Qinbao

2015-01-01

Choosing an appropriate kernel is critical when classifying a new problem with a Support Vector Machine. So far, more attention has been paid to constructing new kernels and choosing suitable parameter values for a specific kernel function, but less to kernel selection. Furthermore, most current kernel selection methods focus on seeking the single best kernel with the highest classification accuracy via cross-validation; they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance. PMID:25893896
New analysis methods to push the boundaries of diagnostic techniques in the environmental sciences

NASA Astrophysics Data System (ADS)

Lungaroni, M.; Murari, A.; Peluso, E.; Gelfusa, M.; Malizia, A.; Vega, J.; Talebzadeh, S.; Gaudio, P.

2016-04-01

In recent years, new and more sophisticated measurements have been at the basis of major progress in various disciplines related to the environment, such as remote sensing and thermonuclear fusion. To maximize the effectiveness of the measurements, new data analysis techniques are required. Initial data processing tasks, such as filtering and fitting, are of primary importance, since they can have a strong influence on the rest of the analysis. Even though Support Vector Regression (SVR) is a method devised and refined at the end of the 1990s, a systematic comparison with more traditional non-parametric regression methods has never been reported. In this paper, a series of systematic tests is described, which indicates that SVR is a very competitive method of non-parametric regression that can usefully complement and often outperform more consolidated approaches. The performance of Support Vector Regression as a method of filtering is investigated first, comparing it with the most popular alternative techniques. Then Support Vector Regression is applied to the problem of non-parametric regression to analyse Lidar surveys for the environmental measurement of particulate matter due to wildfires. The proposed approach has given very positive results and provides new perspectives on the interpretation of the data.
Evolutionary-driven support vector machines for determining the degree of liver fibrosis in chronic hepatitis C.

PubMed

Stoean, Ruxandra; Stoean, Catalin; Lupsor, Monica; Stefanescu, Horia; Badea, Radu

2011-01-01

Hepatic fibrosis, the principal pointer to the development of a liver disease within chronic hepatitis C, can be measured through several stages. The correct evaluation of its degree, based on recent different non-invasive procedures, is of current major concern. The latest methodology for assessing it is the Fibroscan and the effect of its employment is impressive. However, the complex interaction between its stiffness indicator and the other biochemical and clinical examinations towards a respective degree of liver fibrosis is hard to discover manually. In this respect, the novel, well-performing evolutionary-powered support vector machines are proposed towards an automated learning of the relationship between medical attributes and fibrosis levels. The traditional support vector machines have been a frequent choice for addressing hepatic fibrosis, while the evolutionary option has been validated on many real-world tasks and has shown flexibility and good performance. The evolutionary approach is simple and direct, resulting from the hybridization of the learning component within support vector machines and the optimization engine of evolutionary algorithms. It discovers the optimal coefficients of surfaces that separate instances of distinct classes. Apart from a detached manner of establishing the fibrosis degree for new cases, a resulting formula also offers insight into the correspondence between the medical factors and the respective outcome. What is more, a feature selection genetic algorithm can be further embedded into the method structure, in order to dynamically concentrate the search only on the most relevant attributes. The data set covers 722 patients with chronic hepatitis C infection and 24 indicators. The five possible degrees of fibrosis range from F0 (no fibrosis) to F4 (cirrhosis).
Since the standard support vector machines are among the most frequently used methods in recent artificial intelligence studies for hepatic fibrosis staging, the evolutionary method is viewed in comparison to the traditional one. The multifaceted discrimination into all five degrees of fibrosis and the slightly less difficult common separation into solely three related stages are both investigated. The resulting performance proves the superiority over the standard support vector classification and the attained formula is helpful in providing an immediate calculation of the liver stage for new cases, while establishing the presence/absence and comprehending the weight of each medical factor with respect to a certain fibrosis level. The use of the evolutionary technique for fibrosis degree prediction triggers simplicity and offers a direct expression of the influence of dynamically selected indicators on the corresponding stage. Perhaps most importantly, it significantly surpasses the classical support vector machines, which are both widely used and technically sound. All these therefore confirm the promise of the new methodology towards a dependable support within the medical decision-making. Copyright © 2010 Elsevier B.V. All rights reserved.
Interpreting linear support vector machine models with heat map molecule coloring

PubMed Central

2011-01-01

Background: Model-based virtual screening plays an important role in the early drug discovery stage. The outcomes of high-throughput screenings are a valuable source for machine learning algorithms to infer such models. Besides strong performance, the interpretability of a machine learning model is a desired property to guide the optimization of a compound in later drug discovery stages. Linear support vector machines have been shown to perform convincingly on large-scale data sets. The goal of this study is to present a heat map molecule coloring technique to interpret linear support vector machine models. Based on the weights of a linear model, the visualization approach colors each atom and bond of a compound according to its importance for activity. Results: We evaluated our approach on a toxicity data set, a chromosome aberration data set, and the maximum unbiased validation data sets. The experiments show that our method sensibly visualizes structure-property and structure-activity relationships of a linear support vector machine model. The coloring of ligands in the binding pocket of several crystal structures of a maximum unbiased validation data set target indicates that our approach helps to determine the correct ligand orientation in the binding pocket. Additionally, the heat map coloring enables the identification of substructures important for the binding of an inhibitor. Conclusions: In combination with heat map coloring, linear support vector machine models can help to guide the modification of a compound in later stages of drug discovery. In particular, substructures identified as important by our method might be a starting point for optimization of a lead compound. The heat map coloring should be considered complementary to structure-based modeling approaches. As such, it helps to get a better understanding of the binding mode of an inhibitor. PMID:21439031
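The idea of deriving per-atom importance from the weights of a linear model can be sketched as below. This is an illustration under stated assumptions, not the published algorithm: it assumes each model feature is a substructure that maps to a known set of atom indices, and that an atom's score is the plain sum of the weights of the features containing it. The feature names, weights and atom indices are hypothetical.

```python
from collections import defaultdict

def atom_scores(feature_weights, feature_atoms):
    """Sum, for every atom, the linear-model weights of the features that contain it."""
    scores = defaultdict(float)
    for feature, atoms in feature_atoms.items():
        weight = feature_weights.get(feature, 0.0)  # features unknown to the model add nothing
        for atom in atoms:
            scores[atom] += weight
    return dict(scores)

# Hypothetical linear SVM weights and substructure-to-atom mapping for one compound.
weights = {"nitro_group": 0.75, "aromatic_ring": -0.25}
mapping = {"nitro_group": [0, 1, 2], "aromatic_ring": [2, 3, 4]}

scores = atom_scores(weights, mapping)
```

The resulting scores could then be mapped onto a color scale, with positive and negative contributions drawn in contrasting colors.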

Detection of License Plate using Sliding Window, Histogram of Oriented Gradient, and Support Vector Machines Method

NASA Astrophysics Data System (ADS)

Astawa, INGA; Gusti Ngurah Bagus Caturbawa, I.; Made Sajayasa, I.; Dwi Suta Atmaja, I. Made Ari

2018-01-01

License plate recognition is usually used as part of a larger system, such as a parking system. License plate detection is considered the most important step in a license plate recognition system. We propose methods that can be used to detect the vehicle plate on a mobile phone. In this paper, we use the Sliding Window, Histogram of Oriented Gradient (HOG), and Support Vector Machines (SVM) methods for license plate detection in order to increase the detection level even when the image is not of good quality. The image is processed by the Sliding Window method in order to find the plate position. Feature extraction in every window movement is done by the HOG and SVM methods. The research shows a good result, with an accuracy of 96%.
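The sliding-window scan described above can be sketched as a generator of candidate boxes that are each passed to a scoring function. The window size, stride and threshold are illustrative values. The `score` callable is a hypothetical stand-in for HOG feature extraction followed by an SVM decision value, which is not implemented here.

```python
def sliding_windows(image_w, image_h, win_w, win_h, stride):
    """Yield (x, y, w, h) boxes scanning the image left to right, top to bottom."""
    for y in range(0, image_h - win_h + 1, stride):
        for x in range(0, image_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)

def detect(image_w, image_h, win_w, win_h, stride, score, threshold=0.0):
    """Keep the windows whose score exceeds the threshold."""
    return [box for box in sliding_windows(image_w, image_h, win_w, win_h, stride)
            if score(box) > threshold]

boxes = list(sliding_windows(image_w=8, image_h=6, win_w=4, win_h=2, stride=2))

# Toy scorer that only likes the window at x == 4, y == 2.
hits = detect(8, 6, 4, 2, 2, score=lambda b: 1.0 if (b[0], b[1]) == (4, 2) else -1.0)
```

A smaller stride gives finer localization at the cost of more windows to classify.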
EMMA: An Extensible Mammalian Modular Assembly Toolkit for the Rapid Design and Production of Diverse Expression Vectors.

PubMed

Martella, Andrea; Matjusaitis, Mantas; Auxillos, Jamie; Pollard, Steven M; Cai, Yizhi

2017-07-21

Mammalian plasmid expression vectors are critical reagents underpinning many facets of research across biology, biomedical research, and the biotechnology industry. Traditional cloning methods often require laborious manual design and assembly of plasmids using tailored sequential cloning steps. This process can be protracted, complicated, expensive, and error-prone. New tools and strategies that facilitate the efficient design and production of bespoke vectors would help relieve a current bottleneck for researchers. To address this, we have developed an extensible mammalian modular assembly kit (EMMA). This enables rapid and efficient modular assembly of mammalian expression vectors in a one-tube, one-step golden-gate cloning reaction, using a standardized library of compatible genetic parts. The high modularity, flexibility, and extensibility of EMMA provide a simple method for the production of functionally diverse mammalian expression vectors. We demonstrate the value of this toolkit by constructing and validating a range of representative vectors, such as transient and stable expression vectors (transposon based vectors), targeting vectors, inducible systems, polycistronic expression cassettes, fusion proteins, and fluorescent reporters. The method also supports simple assembly combinatorial libraries and hierarchical assembly for production of larger multigenetic cargos. In summary, EMMA is compatible with automated production, and novel genetic parts can be easily incorporated, providing new opportunities for mammalian synthetic biology.
Intelligent Design of Metal Oxide Gas Sensor Arrays Using Reciprocal Kernel Support Vector Regression

NASA Astrophysics Data System (ADS)

Dougherty, Andrew W.

Metal oxides are a staple of the sensor industry. The combination of their sensitivity to a number of gases and the electrical nature of their sensing mechanism makes them particularly attractive in solid state devices. The high temperature stability of the ceramic material also makes them ideal for detecting combustion byproducts where exhaust temperatures can be high. However, problems do exist with metal oxide sensors. They are not very selective, as they all tend to be sensitive to a number of reduction and oxidation reactions on the oxide's surface. This makes arrays with large numbers of sensors interesting to study as a method for introducing orthogonality to the system. Also, the sensors tend to suffer from long term drift for a number of reasons. In this thesis I will develop a system for intelligently modeling metal oxide sensors and determining their suitability for use in large arrays designed to analyze exhaust gas streams. It will introduce prior knowledge of the metal oxide sensors' response mechanisms in order to produce a response function for each sensor from sparse training data. The system will use the same technique to model and remove any long term drift from the sensor response. It will also provide an efficient means for determining the orthogonality of the sensors to determine whether they are useful in gas sensing arrays. The system is based on least squares support vector regression using the reciprocal kernel. The reciprocal kernel is introduced along with a method of optimizing the free parameters of the reciprocal kernel support vector machine. The reciprocal kernel is shown to be simpler and to perform better than an earlier kernel, the modified reciprocal kernel. Least squares support vector regression is chosen as it uses all of the training points, and an emphasis was placed throughout this research on extracting the maximum information from very sparse data.
The reciprocal kernel is shown to be effective in modeling the sensor responses in the time, gas and temperature domains, and the dual representation of the support vector regression solution is shown to provide insight into the sensor's sensitivity and potential orthogonality. Finally, the dual weights of the support vector regression solution to the sensor's response are suggested as a fitness function for a genetic algorithm, or some other method for efficiently searching large parameter spaces.
Support Vector Data Description Model to Map Specific Land Cover with Optimal Parameters Determined from a Window-Based Validation Set.

PubMed

Zhang, Jinshui; Yuan, Zhoumiqi; Shuai, Guanyuan; Pan, Yaozhong; Zhu, Xiufang

2017-04-26

This paper developed an approach, the window-based validation set for support vector data description (WVS-SVDD), to determine optimal parameters for support vector data description (SVDD) model to map specific land cover by integrating training and window-based validation sets. Compared to the conventional approach where the validation set included target and outlier pixels selected visually and randomly, the validation set derived from WVS-SVDD constructed a tightened hypersphere because of the compact constraint by the outlier pixels which were located neighboring to the target class in the spectral feature space. The overall accuracies for wheat and bare land achieved were as high as 89.25% and 83.65%, respectively. However, target class was underestimated because the validation set covers only a small fraction of the heterogeneous spectra of the target class. The different window sizes were then tested to acquire more wheat pixels for validation set. The results showed that classification accuracy increased with the increasing window size and the overall accuracies were higher than 88% at all window size scales. Moreover, WVS-SVDD showed much less sensitivity to the untrained classes than the multi-class support vector machine (SVM) method. Therefore, the developed method showed its merits using the optimal parameters, tradeoff coefficient ( C ) and kernel width ( s ), in mapping homogeneous specific land cover.
Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression

NASA Astrophysics Data System (ADS)

Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.

2010-01-01

In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function test is carried out using the spirometer and support vector regression analysis. Pulmonary function data are measured with flow volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel function with four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases and the average prediction accuracy for normal subjects was higher than that of abnormal subjects. Accuracy in prediction was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, it appears that this method of assessment is useful in diagnosing the pulmonary abnormalities with incomplete data and data with poor recording.
A Method for Extracting Important Segments from Documents Using Support Vector Machines

NASA Astrophysics Data System (ADS)

Suzuki, Daisuke; Utsumi, Akira

In this paper we propose an extraction-based method for automatic summarization. The proposed method consists of two processes: important segment extraction and sentence compaction. The process of important segment extraction classifies each segment in a document as important or not by Support Vector Machines (SVMs). The process of sentence compaction then determines grammatically appropriate portions of a sentence for a summary according to its dependency structure and the classification result by SVMs. To test the performance of our method, we conducted an evaluation experiment using the Text Summarization Challenge (TSC-1) corpus of human-prepared summaries. The result was that our method achieved better performance than a segment-extraction-only method and the Lead method, especially for sentences only a part of which was included in human summaries. Further analysis of the experimental results suggests that a hybrid method that integrates sentence extraction with segment extraction may generate better summaries.
Vision based nutrient deficiency classification in maize plants using multi class support vector machines

NASA Astrophysics Data System (ADS)

Leena, N.; Saju, K. K.

2018-04-01

Nutritional deficiencies in plants are a major concern for farmers as they affect productivity and thus profit. The work aims to classify nutritional deficiencies in maize plants in a non-destructive manner using image processing and machine learning techniques. The colored images of the leaves are analyzed and classified with a multi-class support vector machine (SVM) method. Several images of maize leaves with known deficiencies of nutrients such as nitrogen, phosphorus and potassium (NPK) are used to train the SVM classifier prior to the classification of test images. The results show that the method was able to classify and identify nutritional deficiencies.
DOA Finding with Support Vector Regression Based Forward-Backward Linear Prediction.

PubMed

Pan, Jingjing; Wang, Yide; Le Bastard, Cédric; Wang, Tianzhen

2017-05-27

Direction-of-arrival (DOA) estimation has drawn considerable attention in array signal processing, particularly with coherent signals and a limited number of snapshots. Forward-backward linear prediction (FBLP) is able to directly deal with coherent signals. Support vector regression (SVR) is robust with small samples. This paper proposes the combination of the advantages of FBLP and SVR in the estimation of DOAs of coherent incoming signals with low snapshots. The performance of the proposed method is validated with numerical simulations in coherent scenarios, in terms of different angle separations, numbers of snapshots, and signal-to-noise ratios (SNRs). Simulation results show the effectiveness of the proposed method.
Human action recognition with group lasso regularized-support vector machine

NASA Astrophysics Data System (ADS)

Luo, Huiwu; Lu, Huanzhang; Wu, Yabei; Zhao, Fei

2016-05-01

The bag-of-visual-words (BOVW) and Fisher kernel are two popular models in human action recognition, and support vector machine (SVM) is the most commonly used classifier for the two models. We show two kinds of group structures in the feature representation constructed by BOVW and Fisher kernel, respectively, since the structural information of feature representation can be seen as a prior for the classifier and can improve the performance of the classifier, which has been verified in several areas. However, the standard SVM employs L2-norm regularization in its learning procedure, which penalizes each variable individually and cannot express the structural information of feature representation. We replace the L2-norm regularization with group lasso regularization in standard SVM, and a group lasso regularized-support vector machine (GLRSVM) is proposed. Then, we embed the group structural information of feature representation into GLRSVM. Finally, we introduce an algorithm to solve the optimization problem of GLRSVM by alternating directions method of multipliers. The experiments evaluated on KTH, YouTube, and Hollywood2 datasets show that our method achieves promising results and improves the state-of-the-art methods on KTH and YouTube datasets.
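The difference between the two regularizers named in this abstract can be made concrete with a small sketch. The squared L2 penalty adds up each weight independently, while the group lasso penalty takes one Euclidean norm per group, so whole groups tend to be switched on or off together. The square-root-of-group-size weighting used here is a common convention and an assumption; the abstract does not state the exact form used in GLRSVM, and the weights and groups are made up.

```python
import math

def l2_squared_penalty(w):
    """Standard SVM regularizer: sum of squared weights, each variable on its own."""
    return sum(v * v for v in w)

def group_lasso_penalty(w, groups):
    """Sum over groups of sqrt(group size) times the Euclidean norm of the group."""
    total = 0.0
    for indices in groups:
        norm = math.sqrt(sum(w[i] ** 2 for i in indices))
        total += math.sqrt(len(indices)) * norm
    return total

# Hypothetical weight vector with two feature groups; the second group is inactive.
w = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [2, 3]]

l2_value = l2_squared_penalty(w)
group_value = group_lasso_penalty(w, groups)
```

Because the group norm is not differentiable at zero, the optimization usually needs a splitting method, which is consistent with the abstract's use of the alternating direction method of multipliers.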
T-wave end detection using neural networks and Support Vector Machines.

PubMed

Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román

2018-05-01

In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set, such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy, were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.
A Temperature Compensation Method for Piezo-Resistive Pressure Sensor Utilizing Chaotic Ions Motion Algorithm Optimized Hybrid Kernel LSSVM.

PubMed

Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir

2016-10-14

A piezo-resistive pressure sensor is made of silicon, the nature of which is considerably influenced by ambient temperature. The effect of temperature should be eliminated during the working period in order to obtain a linear output. To deal with this issue, an approach consisting of a hybrid kernel Least Squares Support Vector Machine (LSSVM) optimized by a chaotic ions motion algorithm is presented. To achieve good learning and generalization performance, a hybrid kernel function, constructed from a local kernel (the Radial Basis Function (RBF) kernel) and a global kernel (the polynomial kernel), is incorporated into the Least Squares Support Vector Machine. The chaotic ions motion algorithm is introduced to find the best hyper-parameters of the Least Squares Support Vector Machine. Temperature data from a calibration experiment are used to validate the proposed method. With attention to algorithm robustness and engineering applications, the compensation result shows that the proposed scheme outperforms the other compared methods on several performance measures, such as maximum absolute relative error, minimum absolute relative error, and the mean and variance of the averaged value over fifty runs. Furthermore, the proposed temperature compensation approach lays a foundation for more extensive research.
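A hybrid of a local and a global kernel is often built as a weighted sum, which remains a valid kernel when the weights are non-negative. The sketch below assumes that form. The abstract does not give the exact combination rule, and the mixing weight, bandwidth, degree and offset are illustrative hyper-parameters of the kind an optimizer would tune.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Local kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def poly_kernel(x, y, degree=2, offset=1.0):
    """Global kernel: (x . y + offset) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + offset) ** degree

def hybrid_kernel(x, y, weight=0.5, sigma=1.0, degree=2, offset=1.0):
    """Convex combination of the RBF and polynomial kernels (assumed form)."""
    return (weight * rbf_kernel(x, y, sigma)
            + (1.0 - weight) * poly_kernel(x, y, degree, offset))

# Hypothetical two-dimensional input, for example (pressure reading, temperature).
point = [1.0, 2.0]
value = hybrid_kernel(point, point)
```

Setting `weight` to 1.0 recovers the pure RBF kernel and 0.0 the pure polynomial kernel, so the mixing weight can be searched together with the other hyper-parameters.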
Online Artifact Removal for Brain-Computer Interfaces Using Support Vector Machines and Blind Source Separation

PubMed Central

Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang

2007-01-01

We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method. PMID:18288259
Scorebox extraction from mobile sports videos using Support Vector Machines

NASA Astrophysics Data System (ADS)

Kim, Wonjun; Park, Jimin; Kim, Changick

2008-08-01

The scorebox plays an important role in understanding the content of sports videos. However, a tiny scorebox can make it uncomfortable for viewers on small displays to grasp the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after a short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed, and the three top-ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various sports videos, and experimental results show the efficiency and robustness of the proposed method.
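The attribute-ranking step described in this abstract can be illustrated with a minimal sketch. It assumes discretized attributes and uses made-up attribute names and toy data (none of them come from the paper); it computes the information gain of each attribute and keeps the three highest-ranked ones.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on one discrete attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def top_k_attributes(table, labels, k=3):
    """Rank attributes (name -> list of values) by information gain, keep the best k."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]

# Hypothetical candidate regions: label 1 = scorebox, 0 = logo or advertisement board.
labels = [1, 1, 1, 0, 0, 0]
table = {
    "position":   ["top", "top", "top", "mid", "mid", "mid"],    # separates the classes fully
    "edge_level": ["hi", "hi", "lo", "lo", "lo", "lo"],
    "aspect":     ["wide", "wide", "wide", "wide", "tall", "tall"],
    "colour":     ["a", "b", "a", "b", "a", "b"],                 # nearly uninformative
}
selected = top_k_attributes(table, labels, k=3)
```

The three names in `selected` would then define the three-dimensional feature vector passed to the classifier.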
New model for prediction binary mixture of antihistamine decongestant using artificial neural networks and least squares support vector machine by spectrophotometry method

NASA Astrophysics Data System (ADS)

Mofavvaz, Shirin; Sohrabi, Mahmoud Reza; Nezamzadeh-Ejhieh, Alireza

2017-07-01

In the present study, artificial neural networks (ANNs) and least squares support vector machines (LS-SVM), as intelligent methods based on absorption spectra in the range of 230-300 nm, have been used for determination of antihistamine decongestant contents. In the first step, one type of network (feed-forward back-propagation) with two different training algorithms, Levenberg-Marquardt (LM) and gradient descent with momentum and adaptive learning rate back-propagation (GDX), was employed and its performance was evaluated. The performance of the LM algorithm was better than that of the GDX algorithm. In the second step, the radial basis network was utilized and the results were compared with the previous network. In the last step, the other intelligent method, least squares support vector machine, was proposed to construct the antihistamine decongestant prediction model, and the results were compared with the two aforementioned networks. The statistical parameters mean square error (MSE), regression coefficient (R2), correlation coefficient (r), mean recovery (%), and relative standard deviation (RSD) were used for selecting the best model among these methods. Moreover, the proposed methods were compared to high-performance liquid chromatography (HPLC) as a reference method. A one-way analysis of variance (ANOVA) test at the 95% confidence level, applied to the results of the suggested and reference methods, showed no significant differences between them.
Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine

NASA Astrophysics Data System (ADS)

Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung

A clinical decision support system (CDSS) provides knowledge and specific information for clinicians to enhance diagnostic efficiency and improve healthcare quality. An appropriate CDSS can greatly elevate patient safety, improve healthcare quality, and increase cost-effectiveness. The support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine a suitable combination of SVM parameters with regard to classification performance. A genetic algorithm (GA) can find an optimal solution within an acceptable time, and is faster than a greedy algorithm with an exhaustive searching strategy. By taking advantage of GA in quickly selecting the salient features and adjusting SVM parameters, a method using integrated GA and SVM (IGS), which is different from the traditional method with GA used for feature selection and SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types from Pap smears. The results show that IGS is better than methods using SVM alone or a linear discriminator.
An implementation of support vector machine on sentiment classification of movie reviews

NASA Astrophysics Data System (ADS)

Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.

2018-03-01

With technological advances, a great deal of information about movies is available on the internet. If this information is processed properly, useful information can be obtained from it. This research proposes to classify sentiments in movie review documents. It uses the Support Vector Machine (SVM) method because SVM can classify high-dimensional data, which matches the text data used in this research. Support Vector Machine is a popular machine learning technique for text classification because it can classify by learning from a collection of previously classified documents and can provide good results. Among the dataset compositions tested, the 90-10 composition gives the best result, 85.6%. Among the SVM kernels tested, the linear kernel with constant 1 gives the best result, 84.9%.
Combined empirical mode decomposition and texture features for skin lesion classification using quadratic support vector machine.

PubMed

Wahba, Maram A; Ashour, Amira S; Napoleon, Sameh A; Abd Elnaby, Mustafa M; Guo, Yanhui

2017-12-01

Basal cell carcinoma is one of the most common malignant skin lesions. Automated lesion identification and classification using image processing techniques is highly required to reduce the diagnosis errors. In this study, a novel technique is applied to classify skin lesion images into two classes, namely the malignant Basal cell carcinoma and the benign nevus. A hybrid combination of bi-dimensional empirical mode decomposition and gray-level difference method features is proposed after hair removal. The combined features are further classified using quadratic support vector machine (Q-SVM). The proposed system has achieved outstanding performance of 100% accuracy, sensitivity and specificity compared to other support vector machine procedures as well as with different extracted features. Basal Cell Carcinoma is effectively classified using Q-SVM with the proposed combined features.
Walsh-Hadamard transform kernel-based feature vector for shot boundary detection.

PubMed

Lakshmi, Priya G G; Domnic, S

2014-12-01

Video shot boundary detection (SBD) is the first step of video analysis, summarization, indexing, and retrieval. In the SBD process, videos are segmented into basic units called shots. In this paper, a new SBD method is proposed using color, edge, texture, and motion strength as a vector of features (feature vector). Features are extracted by projecting the frames on selected basis vectors of the Walsh-Hadamard transform (WHT) kernel and WHT matrix. After extracting the features, weights are calculated based on the significance of the features. The weighted features are combined to form a single continuity signal, used as input for the Procedure Based shot transition Identification process (PBI). Using the procedure, shot transitions are classified into abrupt and gradual transitions. Experimental results are examined using large-scale test sets provided by TRECVID 2007, which evaluated hard cut and gradual transition detection. To evaluate the robustness of the proposed method, a system evaluation is performed. The proposed method yields an F1-Score of 97.4% for cut, 78% for gradual, and 96.1% for overall transitions. We have also evaluated the proposed feature vector with a support vector machine classifier. The results show that WHT-based features perform better than the other existing methods. In addition, a few more video sequences are taken from the Openvideo project and the performance of the proposed method is compared with a recent existing SBD method.
LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.

PubMed

Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu

2005-01-01

Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.
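The log-linear combination mentioned at the end of this abstract amounts to a weighted sum of log-domain scores per word hypothesis. The sketch below is only an illustration: the score names, weights, and probabilities are hypothetical and are not taken from the workshop systems.

```python
import math

def log_linear_score(log_scores, weights):
    """Weighted sum of log-domain scores for one word hypothesis."""
    return sum(weights[name] * value for name, value in log_scores.items())

# Hypothetical combination weights (in practice these would be tuned on held-out data).
weights = {"first_pass": 1.0, "pronunciation": 0.5}

# Hypothetical lattice hypotheses with a first-pass score and a pronunciation-model score.
hypotheses = {
    "cat": {"first_pass": math.log(0.6), "pronunciation": math.log(0.2)},
    "cap": {"first_pass": math.log(0.4), "pronunciation": math.log(0.7)},
}

rescored = {word: log_linear_score(s, weights) for word, s in hypotheses.items()}
best = max(rescored, key=rescored.get)  # second-pass choice after combination
```

In this toy case the pronunciation score is strong enough to overturn the first-pass ranking, which is the kind of effect a second-pass rescoring is meant to allow.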

Optimizing support vector machine learning for semi-arid vegetation mapping by using clustering analysis

NASA Astrophysics Data System (ADS)

Su, Lihong

In remote sensing communities, support vector machine (SVM) learning has recently received increasing attention. SVM learning usually requires large memory and enormous amounts of computation time on large training sets. According to SVM algorithms, the SVM classification decision function is fully determined by support vectors, which compose a subset of the training sets. In this regard, a solution to optimize SVM learning is to efficiently reduce training sets. In this paper, a data reduction method based on agglomerative hierarchical clustering is proposed to obtain smaller training sets for SVM learning. Using a multiple angle remote sensing dataset of a semi-arid region, the effectiveness of the proposed method is evaluated by classification experiments with a series of reduced training sets. The experiments show that there is no loss of SVM accuracy when the original training set is reduced to 34% using the proposed approach. Maximum likelihood classification (MLC) also is applied on the reduced training sets. The results show that MLC can also maintain the classification accuracy. This implies that the most informative data instances can be retained by this approach.
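One way such a clustering-based reduction could look is sketched below. The abstract does not say how representatives are chosen from the clusters, so keeping one centroid per cluster, clustering each class separately, and using centroid distance as the merge criterion are all assumptions made for illustration; the toy samples are also hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def agglomerate(points, n_clusters):
    """Bottom-up clustering: repeatedly merge the two clusters with the closest centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sq_dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

def reduce_training_set(samples, keep_fraction):
    """Cluster each class separately and keep one centroid per cluster (an assumption)."""
    reduced = []
    for label in sorted({y for _, y in samples}):
        points = [x for x, y in samples if y == label]
        k = max(1, round(len(points) * keep_fraction))
        reduced += [(centroid(c), label) for c in agglomerate(points, k)]
    return reduced

# Hypothetical two-class training set with two tight groups per class.
samples = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((0.0, 0.1), "a"),
           ((5.0, 5.0), "a"), ((5.1, 5.0), "a"),
           ((10.0, 0.0), "b"), ((10.1, 0.0), "b"), ((10.0, 0.1), "b"),
           ((15.0, 5.0), "b"), ((15.0, 5.1), "b")]
reduced = reduce_training_set(samples, keep_fraction=0.4)
```

The reduced set would then be passed to the SVM trainer in place of the full set; this naive merge loop is cubic in the number of points and is meant only for small examples.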
LCD denoise and the vector mutual information method in the application of the gear fault diagnosis under different working conditions

NASA Astrophysics Data System (ADS)

Xiangfeng, Zhang; Hong, Jiang

2018-03-01

In this paper, the full vector LCD method is proposed to solve the misjudgment problem caused by changes in working conditions. First, the signals from different working conditions are decomposed by LCD to obtain the intrinsic scale components (ISCs), whose instantaneous frequencies have physical significance. Then, the cross-correlation coefficient between each ISC and the original signal is calculated, and the signal is denoised based on the principle of minimum mutual information. Finally, the sum of the absolute vector mutual information between the samples under different working conditions and the denoised ISCs is calculated and used as the feature for classification with a support vector machine (SVM). An experiment on a wind turbine vibration platform gearbox shows that this method can identify fault characteristics under different working conditions. The advantage of this method is that it reduces dependence on subjective human experience and identifies faults directly from the original vibration signal data. It has high engineering value.
Improvements on ν-Twin Support Vector Machine.

PubMed

Khemchandani, Reshma; Saigal, Pooja; Chandra, Suresh

2016-07-01

In this paper, we propose two novel binary classifiers termed as "Improvements on ν-Twin Support Vector Machine: Iν-TWSVM and Iν-TWSVM (Fast)" that are motivated by ν-Twin Support Vector Machine (ν-TWSVM). Similar to ν-TWSVM, Iν-TWSVM determines two nonparallel hyperplanes such that they are closer to their respective classes and are at least ρ distance away from the other class. The significant advantage of Iν-TWSVM over ν-TWSVM is that Iν-TWSVM solves one smaller-sized Quadratic Programming Problem (QPP) and one Unconstrained Minimization Problem (UMP); as compared to solving two related QPPs in ν-TWSVM. Further, Iν-TWSVM (Fast) avoids solving a smaller sized QPP and transforms it as a unimodal function, which can be solved using line search methods and similar to Iν-TWSVM, the other problem is solved as a UMP. Due to their novel formulation, the proposed classifiers are faster than ν-TWSVM and have comparable generalization ability. Iν-TWSVM also implements structural risk minimization (SRM) principle by introducing a regularization term, along with minimizing the empirical risk. The other properties of Iν-TWSVM, related to support vectors (SVs), are similar to that of ν-TWSVM. To test the efficacy of the proposed method, experiments have been conducted on a wide range of UCI and a skewed variation of NDC datasets. We have also given the application of Iν-TWSVM as a binary classifier for pixel classification of color images.
Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

PubMed

Balabin, Roman M; Lomakina, Ekaterina I

2011-04-21

In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
A real-time neutron-gamma discriminator based on the support vector machine method for the time-of-flight neutron spectrometer

NASA Astrophysics Data System (ADS)

Wei, ZHANG; Tongyu, WU; Bowen, ZHENG; Shiping, LI; Yipo, ZHANG; Zejie, YIN

2018-04-01

A new neutron-gamma discriminator based on the support vector machine (SVM) method is proposed to improve the performance of the time-of-flight neutron spectrometer. The neutron detector is an EJ-299-33 plastic scintillator with pulse-shape discrimination (PSD) property. The SVM algorithm is implemented in field programmable gate array (FPGA) to carry out the real-time sifting of neutrons in neutron-gamma mixed radiation fields. This study compares the ability of the pulse gradient analysis method and the SVM method. The results show that this SVM discriminator can provide a better discrimination accuracy of 99.1%. The accuracy and performance of the SVM discriminator based on FPGA have been evaluated in the experiments. It can get a figure of merit of 1.30.
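For readers unfamiliar with the figure of merit quoted here, a commonly used definition in pulse-shape discrimination is the separation of the neutron and gamma peaks divided by the sum of their full widths at half maximum. The abstract does not state which definition the authors used, so the sketch below assumes this common one, assumes Gaussian peaks, and uses made-up peak parameters.

```python
import math

def gaussian_fwhm(sigma):
    """Full width at half maximum of a Gaussian peak with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma

def figure_of_merit(peak_separation, fwhm_neutron, fwhm_gamma):
    """Common PSD figure of merit: peak separation over the sum of the two FWHMs."""
    return peak_separation / (fwhm_neutron + fwhm_gamma)

# Hypothetical Gaussian fits to the two peaks of a discrimination-parameter histogram.
mu_gamma, sigma_gamma = 0.10, 0.02
mu_neutron, sigma_neutron = 0.25, 0.03

fom = figure_of_merit(abs(mu_neutron - mu_gamma),
                      gaussian_fwhm(sigma_neutron),
                      gaussian_fwhm(sigma_gamma))
```

A larger value means the two particle populations overlap less in the chosen discrimination parameter.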
HYBRID NEURAL NETWORK AND SUPPORT VECTOR MACHINE METHOD FOR OPTIMIZATION

NASA Technical Reports Server (NTRS)

Rai, Man Mohan (Inventor)

2005-01-01

System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.
Hybrid Neural Network and Support Vector Machine Method for Optimization

NASA Technical Reports Server (NTRS)

Rai, Man Mohan (Inventor)

2007-01-01

System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.
An Auto-flag Method of Radio Visibility Data Based on Support Vector Machine

NASA Astrophysics Data System (ADS)

Dai, Hui-mei; Mei, Ying; Wang, Wei; Deng, Hui; Wang, Feng

2017-01-01

The Mingantu Ultrawide Spectral Radioheliograph (MUSER) has entered a test observation stage. After the construction of the data acquisition and storage system, it is urgent to automatically flag and eliminate the abnormal visibility data so as to improve the imaging quality. In this paper, according to the observational records, we create a credible visibility set, and further obtain the corresponding flag model of visibility data by using the support vector machine (SVM) technique. The results show that the SVM is a robust approach to flag the MUSER visibility data, and can attain an accuracy of about 86%. Meanwhile, this method will not be affected by solar activities, such as flare eruptions.
Prediction on sunspot activity based on fuzzy information granulation and support vector machine

NASA Astrophysics Data System (ADS)

Peng, Lingling; Yan, Haisheng; Yang, Zhigang

2018-04-01

In order to analyze the range of sunspots, a combined method for forecasting the fluctuation range of sunspots based on fuzzy information granulation (FIG) and support vector machine (SVM) is put forward. First, FIG is employed to granulate the sample data and extract the valid information of each window, namely its minimum value, general average value, and maximum value. Second, a forecasting model is built for each of these with SVM, and the cross method is then used to optimize the parameters. Finally, the fluctuation range of sunspots is forecasted with the optimized SVM model. A case study demonstrates that the model has high accuracy and can effectively predict the fluctuation of sunspots.
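The windowing step can be sketched as follows. This is a simplified stand-in: it takes the plain minimum, mean, and maximum of non-overlapping windows, whereas fuzzy information granulation typically fits a fuzzy set to each window. The window length and the sample values are hypothetical.

```python
def granulate(series, window):
    """Split a series into non-overlapping windows and return (low, average, up) per window."""
    granules = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        granules.append((min(chunk), sum(chunk) / len(chunk), max(chunk)))
    return granules

# Hypothetical sunspot counts, three observations per window.
counts = [10, 14, 12, 30, 26, 34, 50, 44, 47]
granules = granulate(counts, window=3)

# Three derived series; each would be forecast by its own regression model.
low, avg, up = (list(column) for column in zip(*granules))
```

Forecasting `low` and `up` one step ahead gives a predicted fluctuation range rather than a single value.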
T-ray relevant frequencies for osteosarcoma classification

NASA Astrophysics Data System (ADS)

Withayachumnankul, W.; Ferguson, B.; Rainsford, T.; Findlay, D.; Mickan, S. P.; Abbott, D.

2006-01-01

We investigate the classification of the T-ray response of normal human bone cells and human osteosarcoma cells, grown in culture. Given the magnitude and phase responses within a reliable spectral range as features for input vectors, a trained support vector machine can correctly classify the two cell types to some extent. Performance of the support vector machine is degraded by the curse of dimensionality, resulting from the comparatively large number of features in the input vectors. Feature subset selection methods are used to select only an optimal number of relevant features for inputs. As a result, an improvement in generalization performance is attainable, and the selected frequencies can be used for further describing different mechanisms of the cells responding to T-rays. We demonstrate a consistent classification accuracy of 89.6%, while only one fifth of the original features are retained in the data set.
Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

PubMed

Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

2017-10-03

Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes

PubMed Central

Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

2017-01-01

Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274
repRNA: a web server for generating various feature vectors of RNA sequences.

PubMed

Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

2016-02-01

With the rapid growth of RNA sequences generated in the postgenomic age, it is highly desired to develop a flexible method that can generate various kinds of vectors to represent these sequences by focusing on their different features. This is because nearly all the existing machine-learning methods, such as SVM (support vector machine) and KNN (k-nearest neighbor), can only handle vectors but not sequences. To meet the increasing demands and speed up the genome analyses, we have developed a new web server, called "representations of RNA sequences" (repRNA). Compared with the existing methods, repRNA is much more comprehensive, flexible and powerful, as reflected by the following facts: (1) it can generate 11 different modes of feature vectors for users to choose according to their investigation purposes; (2) it allows users to select the features from 22 built-in physicochemical properties and even those defined by users' own; (3) the resultant feature vectors and the secondary structures of the corresponding RNA sequences can be visualized. The repRNA web server is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repRNA/ .
Predicting complications of percutaneous coronary intervention using a novel support vector method.

PubMed

Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

2013-01-01

To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer-Lemeshow χ² value (seven cases) and the mean cross-entropy error (eight cases). The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains.
Experimental Investigation for Fault Diagnosis Based on a Hybrid Approach Using Wavelet Packet and Support Vector Classification

PubMed Central

Li, Pengfei; Jiang, Yongying; Xiang, Jiawei

2014-01-01

To deal with the difficulty to obtain a large number of fault samples under the practical condition for mechanical fault diagnosis, a hybrid method that combined wavelet packet decomposition and support vector classification (SVC) is proposed. The wavelet packet is employed to decompose the vibration signal to obtain the energy ratio in each frequency band. Taking energy ratios as feature vectors, the pattern recognition results are obtained by the SVC. The rolling bearing and gear fault diagnostic results of the typical experimental platform show that the present approach is robust to noise and has higher classification accuracy and, thus, provides a better way to diagnose mechanical faults under the condition of small fault samples. PMID:24688361
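The feature-extraction step described above (band energy ratios from a wavelet packet decomposition) can be sketched in plain Python. The Haar wavelet, the three-level depth, and the synthetic test signal are assumptions chosen to keep the example short; the abstract does not state which wavelet or depth the authors used.

```python
import math

def haar_step(signal):
    """One orthonormal Haar analysis step: (approximation, detail), each half length."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_packet_bands(signal, levels):
    """Full packet tree: every node is split at every level, giving 2**levels bands."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes  # natural (filter-bank) order, not sorted by frequency

def energy_ratios(signal, levels=3):
    """Share of the total signal energy that falls in each packet band."""
    energies = [sum(c * c for c in band) for band in wavelet_packet_bands(signal, levels)]
    total = sum(energies)
    return [e / total for e in energies]

# Hypothetical vibration signal: two sinusoids sampled at 64 points.
signal = [math.sin(2 * math.pi * 5 * n / 64) + 0.5 * math.sin(2 * math.pi * 20 * n / 64)
          for n in range(64)]
features = energy_ratios(signal, levels=3)  # 8-dimensional feature vector
```

Because the Haar step is orthonormal, the band energies sum to the signal energy, so the ratios sum to one and can be used directly as a normalized feature vector for a classifier.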
An analysis of random projection for changeable and privacy-preserving biometric verification.

PubMed

Wang, Yongjin; Plataniotis, Konstantinos N

2010-10-01

Changeability and privacy protection are important factors for widespread deployment of biometrics-based verification systems. This paper presents a systematic analysis of a random-projection (RP)-based method for addressing these problems. The employed method transforms biometric data using a random matrix with each entry an independent and identically distributed Gaussian random variable. The similarity- and privacy-preserving properties, as well as the changeability of the biometric information in the transformed domain, are analyzed in detail. Specifically, RP on both high-dimensional image vectors and dimensionality-reduced feature vectors is discussed and compared. A vector translation method is proposed to improve the changeability of the generated templates. The feasibility of the introduced solution is well supported by detailed theoretical analyses. Extensive experimentation on a face-based biometric verification problem shows the effectiveness of the proposed method.
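As a small illustration of the transform analyzed in this abstract, the sketch below projects a feature vector with a matrix of independent Gaussian entries generated from a user-specific seed. The dimensions, the seeds, the 1/sqrt(k) scaling, and the random stand-in feature vectors are all assumptions for the example, and the vector translation step proposed in the paper is not included.

```python
import math
import random

def random_projection_matrix(out_dim, in_dim, seed):
    """Matrix of i.i.d. Gaussian entries, scaled so distances are preserved on average."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)] for _ in range(out_dim)]

def project(matrix, vector):
    """Multiply the projection matrix by a feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 200-dimensional biometric feature vectors for two samples.
data_rng = random.Random(0)
sample_a = [data_rng.random() for _ in range(200)]
sample_b = [data_rng.random() for _ in range(200)]

key_1 = random_projection_matrix(50, 200, seed=1)  # user-specific key
key_2 = random_projection_matrix(50, 200, seed=2)  # re-issued key after compromise

template_a = project(key_1, sample_a)
template_b = project(key_1, sample_b)
reissued_a = project(key_2, sample_a)  # different template from the same biometric

ratio = distance(template_a, template_b) / distance(sample_a, sample_b)
```

Here `ratio` stays close to one when both samples use the same key, which reflects the similarity-preserving property, while a new seed yields an unrelated template, which reflects changeability.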
Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine

NASA Astrophysics Data System (ADS)

Zhong-bao, Liu; Wen-ai, Song; Jing, Zhang; Wen-juan, Zhao

2017-06-01

Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracies can not be greatly improved because of two limitations. One is it does not take the distribution of the classes into consideration. The other is it is sensitive to noise. In order to solve the above problems, inspired by the maximization of the Fisher's Discriminant Analysis (FDA) and the SVM separability constraints, fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper. In FMWSVM, the distribution of the classes is reflected by the within-class scatter in FDA and the fuzzy membership function is introduced to decrease the influence of the noise. The comparative experiments with SVM on the SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.
Geographical traceability of Marsdenia tenacissima by Fourier transform infrared spectroscopy and chemometrics

NASA Astrophysics Data System (ADS)

Li, Chao; Yang, Sheng-Chao; Guo, Qiao-Sheng; Zheng, Kai-Yan; Wang, Ping-Li; Meng, Zhen-Gui

2016-01-01

A combination of Fourier transform infrared spectroscopy with chemometrics tools provided an approach for studying Marsdenia tenacissima according to its geographical origin. A total of 128 M. tenacissima samples from four provinces in China were analyzed with FTIR spectroscopy. Six pattern recognition methods were used to construct the discrimination models: support vector machine-genetic algorithms, support vector machine-particle swarm optimization, K-nearest neighbors, radial basis function neural network, random forest and support vector machine-grid search. Experimental results showed that K-nearest neighbors was superior to other mathematical algorithms after data were preprocessed with wavelet de-noising, with a discrimination rate of 100% in both the training and prediction sets. This study demonstrated that FTIR spectroscopy coupled with K-nearest neighbors could be successfully applied to determine the geographical origins of M. tenacissima samples, thereby providing reliable authentication in a rapid, cheap and noninvasive way.
Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

PubMed Central

Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

2013-01-01

Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims to implement a more recently developed machine learning method: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933
Design of 2D time-varying vector fields.

PubMed

Chen, Guoning; Kwatra, Vivek; Wei, Li-Yi; Hansen, Charles D; Zhang, Eugene

2012-10-01

Design of time-varying vector fields, i.e., vector fields that can change over time, has a wide variety of important applications in computer graphics. Existing vector field design techniques do not address time-varying vector fields. In this paper, we present a framework for the design of time-varying vector fields, both for planar domains as well as manifold surfaces. Our system supports the creation and modification of various time-varying vector fields with desired spatial and temporal characteristics through several design metaphors, including streamlines, pathlines, singularity paths, and bifurcations. These design metaphors are integrated into an element-based design to generate the time-varying vector fields via a sequence of basis field summations or spatial constrained optimizations at the sampled times. The key-frame design and field deformation are also introduced to support other user design scenarios. Accordingly, a spatial-temporal constrained optimization and the time-varying transformation are employed to generate the desired fields for these two design scenarios, respectively. We apply the time-varying vector fields generated using our design system to a number of important computer graphics applications that require controllable dynamic effects, such as evolving surface appearance, dynamic scene design, steerable crowd movement, and painterly animation. Many of these are difficult or impossible to achieve via prior simulation-based methods. In these applications, the time-varying vector fields have been applied as either orientation fields or advection fields to control the instantaneous appearance or evolving trajectories of the dynamic effects.

Text mining approach to predict hospital admissions using early medical records from the emergency department.

PubMed

Lucini, Filipe R; S Fogliatto, Flavio; C da Silveira, Giovani J; L Neyeloff, Jeruza; Anzanello, Michel J; de S Kuchenbecker, Ricardo; D Schaan, Beatriz

2017-04-01

Emergency department (ED) overcrowding is a serious issue for hospitals. Early information on short-term inward bed demand from patients receiving care at the ED may reduce the overcrowding problem, and optimize the use of hospital resources. In this study, we use text mining methods to process data from early ED patient records using the SOAP framework, and predict future hospitalizations and discharges. We try different approaches for pre-processing text records and for predicting hospitalization. Sets-of-words are obtained via binary representation, term frequency, and term frequency-inverse document frequency. Unigrams, bigrams and trigrams are tested for feature formation. Feature selection is based on χ² and F-score metrics. In the prediction module, eight text mining methods are tested: Decision Tree, Random Forest, Extremely Randomized Tree, AdaBoost, Logistic Regression, Multinomial Naïve Bayes, Support Vector Machine (linear kernel) and Nu-Support Vector Machine (linear kernel). Prediction performance is evaluated by F1-scores. Precision and Recall values are also reported for all text mining methods tested. Nu-Support Vector Machine was the text mining method with the best overall performance. Its average F1-score in predicting hospitalization was 77.70%, with a standard deviation (SD) of 0.66%. The method could be used to manage daily routines in EDs such as capacity planning and resource allocation. Text mining could provide valuable information and facilitate decision-making by inward bed management teams. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
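The feature-formation step described above (unigrams, bigrams and trigrams weighted by term frequency-inverse document frequency) might look roughly like the sketch below. The abstract does not state the exact weighting, so the `tf * log(N / df)` form, the whitespace tokenizer, the function names, and the two example notes are assumptions made for illustration only.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=3):
    """All n-grams from unigrams up to n_max-grams, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=3):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    term_lists = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: number of documents in which each term appears
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        tf = Counter(terms)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical record snippets (not real patient data)
docs = ["chest pain admitted", "chest pain discharged"]
for doc_weights in tfidf(docs):
    print(doc_weights)
```

Terms shared by every document (here "chest pain") receive weight zero under this variant, which is why smoothed IDF forms are often preferred in practice.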
Using an object-based grid system to evaluate a newly developed EP approach to formulate SVMs as applied to the classification of organophosphate nerve agents

NASA Astrophysics Data System (ADS)

Land, Walker H., Jr.; Lewis, Michael; Sadik, Omowunmi; Wong, Lut; Wanekaya, Adam; Gonzalez, Richard J.; Balan, Arun

2004-04-01

This paper extends the classification approaches described in reference [1] in the following way: (1.) developing and evaluating a new method for evolving organophosphate nerve agent Support Vector Machine (SVM) classifiers using Evolutionary Programming, (2.) conducting research experiments using a larger database of organophosphate nerve agents, and (3.) upgrading the architecture to an object-based grid system for evaluating the classification of EP derived SVMs. Due to the increased threats of chemical and biological weapons of mass destruction (WMD) by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphate nerve agents using a grid computing system called Legion. Grid computing is the use of large collections of heterogeneous, distributed resources (including machines, databases, devices, and users) to support large-scale computations and wide-area data access. Finally, preliminary results using EP derived support vector machines designed to operate on distributed systems have provided accurate classification results. In addition, distributed training time architectures are 50 times faster when compared to standard iterative training time methods.
Discrete wavelength selection for the optical readout of a metamaterial biosensing system for glucose concentration estimation via a support vector regression model.

PubMed

Teutsch, T; Mesch, M; Giessen, H; Tarin, C

2015-01-01

In this contribution, a method to select discrete wavelengths that allow an accurate estimation of the glucose concentration in a biosensing system based on metamaterials is presented. The sensing concept is adapted to the particular application of ophthalmic glucose sensing by covering the metamaterial with a glucose-sensitive hydrogel, and the sensor readout is performed optically. Because a spectrometer is not suitable in a mobile context, a few discrete wavelengths must be selected to estimate the glucose concentration. The developed selection methods are based on nonlinear support vector regression (SVR) models. Two selection methods are compared, and it is shown that wavelengths selected by a sequential forward feature selection algorithm achieve an estimation improvement. The presented method can be easily applied to different metamaterial layouts and hydrogel configurations.
Support Vector Machines to improve physiologic hot flash measures: application to the ambulatory setting.

PubMed

Thurston, Rebecca C; Hernandez, Javier; Del Rio, Jose M; De La Torre, Fernando

2011-07-01

Most midlife women have hot flashes. The conventional criterion (≥2 μmho rise/30 s) for classifying hot flashes physiologically has shown poor performance. We improved this performance in the laboratory with Support Vector Machines (SVMs), a pattern classification method. We aimed to compare conventional to SVM methods to classify hot flashes in the ambulatory setting. Thirty-one women with hot flashes underwent 24 h of ambulatory sternal skin conductance monitoring. Hot flashes were quantified with conventional (≥2 μmho/30 s) and SVM methods. Conventional methods had low sensitivity (sensitivity=.57, specificity=.98, positive predictive value (PPV)=.91, negative predictive value (NPV)=.90, F1=.60), with performance lower with higher body mass index (BMI). SVMs improved this performance (sensitivity=.87, specificity=.97, PPV=.90, NPV=.96, F1=.88) and reduced BMI variation. SVMs can improve ambulatory physiologic hot flash measures. Copyright © 2010 Society for Psychophysiological Research.
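The performance measures quoted above all derive from the four cells of a binary confusion matrix. A minimal sketch of those standard definitions follows; the function name and the counts are invented for illustration and are not the study's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard screening metrics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)   # fraction of true events that were detected
    specificity = tn / (tn + fp)   # fraction of non-events correctly rejected
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "npv": npv, "f1": f1}

# Hypothetical counts, chosen only to exercise the formulas
m = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```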
Differentiation of Glioblastoma and Lymphoma Using Feature Extraction and Support Vector Machine.

PubMed

Yang, Zhangjing; Feng, Piaopiao; Wen, Tian; Wan, Minghua; Hong, Xunning

2017-01-01

Differentiation of glioblastoma multiformes (GBMs) and lymphomas using multi-sequence magnetic resonance imaging (MRI) is an important task that is valuable for treatment planning. However, this task is a challenge because GBMs and lymphomas may have a similar appearance in MRI images. This similarity may lead to misclassification and could affect the treatment results. In this paper, we propose a semi-automatic method based on multi-sequence MRI to differentiate these two types of brain tumors. Our method consists of three steps: 1) the key slice is selected from 3D MRIs and regions of interest (ROIs) are drawn around the tumor region; 2) different features are extracted based on prior clinical knowledge and validated using a t-test; and 3) features that are helpful for classification are used to build an original feature vector and a support vector machine is applied to perform classification. In total, 58 GBM cases and 37 lymphoma cases are used to validate our method. A leave-one-out cross-validation strategy is adopted in our experiments. The global accuracy of our method was determined as 96.84%, which indicates that our method is effective for the differentiation of GBM and lymphoma and can be applied in clinical diagnosis. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
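The leave-one-out strategy mentioned above holds out each case in turn, trains on the rest, and scores the held-out prediction. The sketch below illustrates only that loop. To stay dependency-free it swaps in a nearest-centroid classifier where the paper uses a support vector machine, and the feature values, labels, and function names are hypothetical.

```python
import math

def nearest_centroid_fit(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()}

def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, count correct predictions."""
    correct = 0
    for i in range(len(X)):
        model = nearest_centroid_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += nearest_centroid_predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical two-feature cases (made-up numbers, not MRI features)
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 0.9), (0.9, 1.1), (1.1, 1.0)]
y = ["GBM", "GBM", "GBM", "lymphoma", "lymphoma", "lymphoma"]
print(leave_one_out_accuracy(X, y))
```

If global accuracy is taken to be the fraction of correctly classified held-out cases, then with 58 + 37 = 95 cases a value of 96.84% would correspond to 92 correct predictions (92/95 ≈ 0.9684).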
Analysis of spectrally resolved autofluorescence images by support vector machines

NASA Astrophysics Data System (ADS)

Mateasik, A.; Chorvat, D.; Chorvatova, A.

2013-02-01

Spectral analysis of the autofluorescence images of isolated cardiac cells was performed to evaluate and classify the metabolic state of the cells with respect to their responses to metabolic modulators. The classification was done using a machine learning approach based on a support vector machine, with a set of features automatically calculated from the recorded spectral profiles of the spectral autofluorescence images. This classification method was compared with the classical approach, in which the individual spectral components contributing to cell autofluorescence were estimated by spectral analysis, namely by blind source separation using non-negative matrix factorization. Comparison of both methods showed that machine learning can effectively classify the spectrally resolved autofluorescence images without the need for detailed knowledge about the sources of autofluorescence and their spectral properties.
Prediction of biomechanical parameters of the proximal femur using statistical appearance models and support vector regression.

PubMed

Fritscher, Karl; Schuler, Benedikt; Link, Thomas; Eckstein, Felix; Suhm, Norbert; Hänni, Markus; Hengg, Clemens; Schubert, Rainer

2008-01-01

Fractures of the proximal femur are one of the principal causes of mortality among elderly persons. Traditional methods for the determination of femoral fracture risk use methods for measuring bone mineral density. However, BMD alone is not sufficient to predict bone failure load for an individual patient and additional parameters have to be determined for this purpose. In this work an approach that uses statistical models of appearance to identify relevant regions and parameters for the prediction of biomechanical properties of the proximal femur will be presented. By using Support Vector Regression the proposed model based approach is capable of predicting two different biomechanical parameters accurately and fully automatically in two different testing scenarios.
Rapid authentication of adulteration of olive oil by near-infrared spectroscopy using support vector machines

NASA Astrophysics Data System (ADS)

Wu, Jingzhu; Dong, Jingjing; Dong, Wenfei; Chen, Yan; Liu, Cuiling

2016-10-01

A classification method using support vector machines with a linear kernel was employed to authenticate genuine olive oil based on near-infrared spectroscopy. Three types of olive oil adulteration were examined in the study: the adulterants were soybean oil, rapeseed oil, and a mixture of soybean and rapeseed oil. The average recognition rate of the second experiment was more than 90%, and that of the third experiment reached 100%. The results showed that the method performed well in distinguishing genuine olive oil from adulterated oil within a small range of adulterant concentrations, and that it is a promising and rapid technique for the detection of oil adulteration and fraud in the food industry.
Classification of ECG signal with Support Vector Machine Method for Arrhythmia Detection

NASA Astrophysics Data System (ADS)

Turnip, Arjon; Ilham Rizqywan, M.; Kusumandari, Dwi E.; Turnip, Mardi; Sihombing, Poltak

2018-03-01

An electrocardiogram is a record of the bioelectric potential that occurs as a result of cardiac activity. QRS detection with zero-crossing calculation is one method that can precisely determine the R peak of the QRS wave as part of arrhythmia detection. In this paper, two experimental schemes (2 minutes' duration with different activities: relaxed and typing) were conducted. The two experiments yielded accuracy, sensitivity, and positive predictivity of about 100% each for the first experiment and about 79%, 93%, and 83%, respectively, for the second experiment. Furthermore, the feature set of the MIT-BIH arrhythmia database is evaluated using the support vector machine (SVM) method in the WEKA software. By combining the available attributes in the WEKA algorithm, the result is constant, since all classes of the SVM go to the normal class, with an average accuracy of 88.49%.
A support vector regression-firefly algorithm-based model for limiting velocity prediction in sewer pipes.

PubMed

Ebtehaj, Isa; Bonakdari, Hossein

2016-01-01

Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this particular study, FFA is used to determine these SVM parameters. The actual effective parameters on Fr calculation are generally identified by employing dimensional analysis. The different dimensionless variables along with the models are introduced. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error = 0.116) compared with other methods.
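The two error measures reported above, mean absolute percentage error and root mean square error, have standard definitions that can be sketched as follows. The function names and the example values are illustrative assumptions, not data from the study.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the units of the predicted quantity."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical observed and predicted values (made-up numbers)
observed = [2.0, 4.0]
estimated = [1.0, 5.0]
print(mape(observed, estimated))
print(rmse(observed, estimated))
```

MAPE is scale-free while RMSE carries the units of the target, so the two are usually reported together, as in the abstract.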
A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method.

PubMed

Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin

2016-03-04

Interference can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of any GNSS anti-interference measure, interference monitoring is essential. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified using the GPS L1 C/A code signal and compared with that of a standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is at the millisecond (ms) level and the monitoring time is at the microsecond (μs) level, which makes the proposed approach usable in practical interference monitoring applications.
A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method

PubMed Central

Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin

2016-01-01

Interference can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of any GNSS anti-interference measure, interference monitoring is essential. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified using the GPS L1 C/A code signal and compared with that of a standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is at the millisecond (ms) level and the monitoring time is at the microsecond (μs) level, which makes the proposed approach usable in practical interference monitoring applications. PMID:26959020
Short-Circuit Fault Detection and Classification Using Empirical Wavelet Transform and Local Energy for Electric Transmission Line.

PubMed

Huang, Nantian; Qi, Jiajin; Li, Fuqing; Yang, Dongfeng; Cai, Guowei; Huang, Guilin; Zheng, Jian; Li, Zhenxin

2017-09-16

In order to improve the classification accuracy of recognizing short-circuit faults in electric transmission lines, a novel detection and diagnosis method based on empirical wavelet transform (EWT) and local energy (LE) is proposed. First, EWT is used to process the original short-circuit fault signals from photoelectric voltage transformers, before the amplitude modulated-frequency modulated (AM-FM) mode with a compactly supported Fourier spectrum is extracted. Subsequently, the fault occurrence time is detected according to the modulus maxima of the intrinsic mode function (IMF₂) from three-phase voltage signals processed by EWT. After this process, the feature vectors are constructed by calculating the LE of the fundamental frequency based on the three-phase voltage signals of one period after the fault occurred. Finally, a classifier based on support vector machine (SVM), constructed with the LE feature vectors, is used to classify 10 types of short-circuit fault signals. Compared with complementary ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and improved CEEMDAN methods, the new method using EWT has a better ability to present the frequency in time. The difference in the characteristics of the energy distribution in the time domain between different types of short-circuit faults can be presented by the feature vectors of LE. Together, simulation and real-signal experiments demonstrate the validity and effectiveness of the new approach.
Short-Circuit Fault Detection and Classification Using Empirical Wavelet Transform and Local Energy for Electric Transmission Line

PubMed Central

Huang, Nantian; Qi, Jiajin; Li, Fuqing; Yang, Dongfeng; Cai, Guowei; Huang, Guilin; Zheng, Jian; Li, Zhenxin

2017-01-01

In order to improve the classification accuracy of recognizing short-circuit faults in electric transmission lines, a novel detection and diagnosis method based on empirical wavelet transform (EWT) and local energy (LE) is proposed. First, EWT is used to process the original short-circuit fault signals from photoelectric voltage transformers, before the amplitude modulated-frequency modulated (AM-FM) mode with a compactly supported Fourier spectrum is extracted. Subsequently, the fault occurrence time is detected according to the modulus maxima of the intrinsic mode function (IMF₂) from three-phase voltage signals processed by EWT. After this process, the feature vectors are constructed by calculating the LE of the fundamental frequency based on the three-phase voltage signals of one period after the fault occurred. Finally, a classifier based on support vector machine (SVM), constructed with the LE feature vectors, is used to classify 10 types of short-circuit fault signals. Compared with complementary ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and improved CEEMDAN methods, the new method using EWT has a better ability to present the frequency in time. The difference in the characteristics of the energy distribution in the time domain between different types of short-circuit faults can be presented by the feature vectors of LE. Together, simulation and real-signal experiments demonstrate the validity and effectiveness of the new approach. PMID:28926953
New fuzzy support vector machine for the class imbalance problem in medical datasets classification.

PubMed

Gu, Xiaoqing; Ni, Tongguang; Wang, Hongyuan

2014-01-01

In medical dataset classification, the support vector machine (SVM) is considered to be one of the most successful methods. However, most real-world medical datasets contain some outliers/noise, and the data often have class imbalance problems. In this paper, a fuzzy support vector machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM obtained by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show that FSVM-CIP achieves superior or comparable effectiveness.
Vector control activities: Fiscal Year, 1986

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1987-04-01

The program is divided into two major components - operations and support studies. The support studies are designed to improve the operational effectiveness and efficiency of the control program and to identify other vector control problems requiring TVA attention and study. Nonchemical methods of control are emphasized and are supplemented with chemical measures as needed. TVA also cooperates with various concerned municipalities in identifying blood-sucking arthropod pest problems and demonstrating control techniques useful in establishing abatement programs, and provides technical assistance to other TVA programs and organizations. The program also helps Land Between The Lakes (LBL) plan and conduct vector control operations and tick control research. Specific program control activities and support studies are discussed.
On the sparseness of 1-norm support vector machines.

PubMed

Zhang, Li; Zhou, Weida

2010-04-01

There is some empirical evidence showing that 1-norm Support Vector Machines (1-norm SVMs) have good sparseness; however, it is not clear how sparse 1-norm SVMs can become or whether their representation is sparser than that of standard SVMs. In this paper we study the sparseness of 1-norm SVMs. Two upper bounds on the number of nonzero coefficients in the decision function of 1-norm SVMs are presented. First, the number of nonzero coefficients in 1-norm SVMs is at most equal to the number of only the exact support vectors lying on the +1 and -1 discriminating surfaces, while that in standard SVMs is equal to the number of support vectors, which implies that 1-norm SVMs have better sparseness than standard SVMs. Second, the number of nonzero coefficients is at most equal to the rank of the sample matrix. A brief review of the geometry of linear programming and of the primal steepest edge pricing simplex method is given, which allows us to prove the two upper bounds and evaluate their tightness by experiments. Experimental results on toy data sets and the UCI data sets illustrate our analysis. Copyright 2009 Elsevier Ltd. All rights reserved.
Weighted K-means support vector machine for cancer prediction.

PubMed

Kim, SungHwan

2016-01-01

To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org).
Conditional Entropy-Constrained Residual VQ with Application to Image Coding

NASA Technical Reports Server (NTRS)

Kossentini, Faouzi; Chung, Wilson C.; Smith, Mark J. T.

1996-01-01

This paper introduces an extension of entropy-constrained residual vector quantization (VQ) where intervector dependencies are exploited. The method, which we call conditional entropy-constrained residual VQ, employs a high-order entropy conditioning strategy that captures local information in the neighboring vectors. When applied to coding images, the proposed method is shown to achieve better rate-distortion performance than that of entropy-constrained residual vector quantization with less computational complexity and lower memory requirements. Moreover, it can be designed to support progressive transmission in a natural way. It is also shown to outperform some of the best predictive and finite-state VQ techniques reported in the literature. This is due partly to the joint optimization between the residual vector quantizer and a high-order conditional entropy coder as well as the efficiency of the multistage residual VQ structure and the dynamic nature of the prediction.
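The multistage residual VQ structure referred to above quantizes a vector in stages, with each stage coding the residual left by the previous one. The sketch below shows only that plain residual structure. The entropy constraint and the conditional entropy coder that are the paper's contribution are not modeled, and the codebooks and function names are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))

def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage codes the previous residual."""
    indices, residual = [], list(vec)
    for cb in codebooks:
        i = nearest(cb, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, cb[i])]
    return indices

def rvq_decode(indices, codebooks):
    """Reconstruct by summing the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for i, cb in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, cb[i])]
    return out

# Hypothetical two-stage codebooks: a coarse stage and a refinement stage
codebooks = [
    [(0.0, 0.0), (4.0, 4.0)],
    [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)],
]
codes = rvq_encode((5.0, 4.0), codebooks)
print(codes, rvq_decode(codes, codebooks))
```

Because the decoder can stop after any stage and still produce a coarser reconstruction, this staged structure is what makes progressive transmission natural.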
A Simple Deep Learning Method for Neuronal Spike Sorting

NASA Astrophysics Data System (ADS)

Yang, Kai; Wu, Haifeng; Zeng, Yu

2017-10-01

Spike sorting is one of the key techniques for understanding brain activity. With the development of modern electrophysiology technology, some recent multi-electrode technologies have been able to record the activity of thousands of neuronal spikes simultaneously. Spike sorting in this case increases the computational complexity of conventional sorting algorithms. In this paper, we focus on how to reduce the complexity of spike sorting, and introduce a deep learning algorithm, the principal component analysis network (PCANet), to spike sorting. The introduced method starts from a conventional model and establishes a Toeplitz matrix. Using the column vectors of the matrix, we train a PCANet, from which some eigenvalue vectors of spikes can be extracted. Finally, a support vector machine (SVM) is used to sort spikes. In experiments, we choose two groups of simulated data from publicly available databases and compare the introduced method with conventional methods. The results indicate that the introduced method indeed has lower complexity with the same sorting errors as the conventional methods.

Automated image segmentation using support vector machines

NASA Astrophysics Data System (ADS)

Powell, Stephanie; Magnotta, Vincent A.; Andreasen, Nancy C.

2007-03-01

Neurodegenerative and neurodevelopmental diseases demonstrate problems associated with brain maturation and aging. Automated methods to delineate brain structures of interest are required to analyze large amounts of imaging data like that being collected in several ongoing multi-center studies. We have previously reported on using artificial neural networks (ANN) to define subcortical brain structures including the thalamus (0.88), caudate (0.85) and the putamen (0.81). In this work, a priori probability information was generated using Thirion's demons registration algorithm. The input vector consisted of a priori probability, spherical coordinates, and an iris of surrounding signal intensity values. We have applied the support vector machine (SVM) machine learning algorithm to automatically segment subcortical and cerebellar regions using the same input vector information. SVM architecture was derived from the ANN framework. Training was completed using a radial-basis function kernel with gamma equal to 5.5. Training was performed using 15,000 vectors collected from 15 training images in approximately 10 minutes. The resulting support vectors were applied to delineate 10 images not part of the training set. Relative overlap calculated for the subcortical structures was 0.87 for the thalamus, 0.84 for the caudate, 0.84 for the putamen, and 0.72 for the hippocampus. Relative overlap for the cerebellar lobes ranged from 0.76 to 0.86. The reliability of the SVM-based algorithm was similar to the inter-rater reliability between manual raters and can be achieved without rater intervention.
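Two quantities in the abstract above lend themselves to a short sketch: the radial-basis function kernel and the relative overlap score. Both definitions below are assumptions, since the abstract does not spell them out: the kernel is taken as exp(-gamma * ||x - y||^2), which is one common parameterization, and relative overlap is taken as intersection over union of the two voxel sets. The function names and example voxels are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=5.5):
    """Assumed form: exp(-gamma * squared Euclidean distance between x and y)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def relative_overlap(voxels_a, voxels_b):
    """Assumed definition: shared voxels divided by voxels in either mask."""
    a, b = set(voxels_a), set(voxels_b)
    return len(a & b) / len(a | b)

# Identical inputs give kernel value 1; distant inputs decay toward 0
print(rbf_kernel((0.2, 0.4), (0.2, 0.4)))
print(rbf_kernel((0.0,), (1.0,)))

# Hypothetical automated and manual masks given as voxel index sets
auto_mask = {1, 2, 3, 4}
manual_mask = {3, 4, 5, 6}
print(relative_overlap(auto_mask, manual_mask))
```

Other libraries write the same kernel with 1/(2*sigma^2) in place of gamma, so a reported gamma value is only comparable across tools once the parameterization is known.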
Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.

PubMed

Chen, S; Samingan, A K; Hanzo, L

2001-01-01

The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.
Implementation of support vector machine for classification of speech marked hijaiyah letters based on Mel frequency cepstrum coefficient feature extraction

NASA Astrophysics Data System (ADS)

Adhi Pradana, Wisnu; Adiwijaya; Novia Wisesty, Untari

2018-03-01

Support Vector Machine, commonly called SVM, is a method that can be used to classify data. An SVM separates data from two different classes with a hyperplane. In this study, a system was built using SVM to develop Arabic speech recognition. Two kinds of speakers were tested during development: dependent speakers and independent speakers. The system achieved an accuracy of 85.32% for dependent speakers and 61.16% for independent speakers.
Protein Kinase Classification with 2866 Hidden Markov Models and One Support Vector Machine

NASA Technical Reports Server (NTRS)

Weber, Ryan; New, Michael H.; Fonda, Mark (Technical Monitor)

2002-01-01

The main application considered in this paper is predicting true kinases from randomly permuted kinases that share the same length and amino acid distributions as the true kinases. Numerous methods already exist for this classification task, such as HMMs, motif-matchers, and sequence comparison algorithms. We build on some of these efforts by creating a vector from the output of thousands of structurally based HMMs, created offline with Pfam-A seed alignments using SAM-T99, which then must be combined into an overall classification for the protein. Then we use a Support Vector Machine for classifying this large ensemble Pfam-Vector, with polynomial and chi-squared kernels. In particular, the chi-squared kernel SVM performs better in some respects than the HMMs and the BLAST pairwise comparisons when predicting true from false kinases, but no one algorithm is best for all purposes or in all instances, so we consider the particular strengths and weaknesses of each.
Highly accurate prediction of protein self-interactions by incorporating the average block and PSSM information into the general PseAAC.

PubMed

Zhai, Jing-Xuan; Cao, Tian-Jie; An, Ji-Yong; Bian, Yong-Tao

2017-11-07

Determining whether proteins can interact with their partners is a challenging task in fundamental research. Protein self-interaction (SIP) is a special case of protein-protein interactions (PPIs), which plays a key role in the regulation of cellular functions. Due to the limitations of experimental self-interaction identification, it is very important to develop an effective biological tool for predicting SIPs based on protein sequences. In this study, we developed a novel computational method called RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) for detecting SIPs from protein sequences. First, the Average Blocks (AB) feature extraction method is employed to represent protein sequences on a Position Specific Scoring Matrix (PSSM). Second, the Principal Component Analysis (PCA) method is used to reduce the dimension of the AB vector and thereby reduce the influence of noise. Then, by employing the Relevance Vector Machine (RVM) algorithm, the performance of RVM-AB is assessed and compared with the state-of-the-art support vector machine (SVM) classifier and other existing methods on yeast and human datasets respectively. Using the fivefold test experiment, the RVM-AB model achieved very high accuracies of 93.01% and 97.72% on yeast and human datasets respectively, which are significantly better than the method based on the SVM classifier and other previous methods. The experimental results proved that the RVM-AB prediction model is efficient and robust. It can be an automatic decision support tool for detecting SIPs. For facilitating extensive studies for future proteomics research, the RVM-AB server is freely available for academic use at http://219.219.62.123:8888/SIP_AB. Copyright © 2017 Elsevier Ltd. All rights reserved.
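The fivefold test mentioned in the abstract above follows the usual k-fold cross-validation pattern: the samples are partitioned into five folds, and each fold serves once as the test set while the remaining folds are used for training. The sketch below only illustrates that index bookkeeping; the function name `k_fold_indices`, the shuffling, and the seed are assumptions for illustration, not details from the paper.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once so that folds are not tied to the original sample order.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one contributes to the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits
```

A classifier would be trained on each `train_idx` subset and scored on the matching `test_idx` subset, and the five scores would then be averaged.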
Improving the Accuracy and Training Speed of Motor Imagery Brain-Computer Interfaces Using Wavelet-Based Combined Feature Vectors and Gaussian Mixture Model-Supervectors.

PubMed

Lee, David; Park, Sang-Hoon; Lee, Sang-Goog

2017-10-07

In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain-computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation-maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
Parameters selection in gene selection using Gaussian kernel support vector machines by genetic algorithm.

PubMed

Mao, Yong; Zhou, Xiao-Bo; Pi, Dao-Ying; Sun, You-Xian; Wong, Stephen T C

2005-10-01

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables and small number of samples as well as its non-linearity. It is difficult to get satisfying results by using conventional linear statistical methods. Recursive feature elimination based on support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which are integrated into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for an optimal pair of parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
Application of biomonitoring and support vector machine in water quality assessment

PubMed Central

Liao, Yue; Xu, Jian-yu; Wang, Zhu-wei

2012-01-01

The behavior of schools of zebrafish (Danio rerio) was studied in acute toxicity environments. Behavioral features were extracted and a method for water quality assessment using support vector machine (SVM) was developed. The behavioral parameters of fish were recorded and analyzed during one hour in an environment of a 24-h half-lethal concentration (LC50) of a pollutant. The data were used to develop a method to evaluate water quality, so as to give an early indication of toxicity. Four kinds of metal ions (Cu2+, Hg2+, Cr6+, and Cd2+) were used for toxicity testing. To enhance the efficiency and accuracy of assessment, a method combining SVM and a genetic algorithm (GA) was used. The results showed that the average prediction accuracy of the method was over 80% and the time cost was acceptable. The method gave satisfactory results for a variety of metal pollutants, demonstrating that this is an effective approach to the classification of water quality. PMID:22467374
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging

PubMed Central

Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos

2015-01-01

Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is much less conservative than weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
Real-data comparison of data mining methods in prediction of diabetes in Iran.

PubMed

Tapak, Lily; Mahjub, Hossein; Hamidi, Omid; Poorolajal, Jalal

2013-09-01

Diabetes is one of the most common non-communicable diseases in developing countries. Early screening and diagnosis play an important role in effective prevention strategies. This study compared two traditional classification methods (logistic regression and Fisher linear discriminant analysis) and four machine-learning classifiers (neural networks, support vector machines, fuzzy c-mean, and random forests) to classify persons with and without diabetes. The data set used in this study included 6,500 subjects from the Iranian national non-communicable diseases risk factors surveillance obtained through a cross-sectional survey. The obtained sample was based on cluster sampling of the Iran population which was conducted in 2005-2009 to assess the prevalence of major non-communicable disease risk factors. Ten risk factors that are commonly associated with diabetes were selected to compare the performance of six classifiers in terms of sensitivity, specificity, total accuracy, and area under the receiver operating characteristic (ROC) curve criteria. Support vector machines showed the highest total accuracy (0.986) as well as area under the ROC (0.979). Also, this method showed high specificity (1.000) and sensitivity (0.820). All other methods produced total accuracy of more than 85%, but for all methods, the sensitivity values were very low (less than 0.350). The results of this study indicate that, in terms of sensitivity, specificity, and overall classification accuracy, the support vector machine model ranks first among all the classifiers tested in the prediction of diabetes. Therefore, this approach is a promising classifier for predicting diabetes, and it should be further investigated for the prediction of other diseases.
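The comparison above rests on sensitivity, specificity, and total accuracy. As a minimal sketch of how these three criteria follow from the confusion-matrix counts of a binary classifier, the function below uses the standard definitions; the name `classification_metrics` and the label encoding (1 for a person with diabetes) are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity and accuracy for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive case correctly detected
            else:
                fn += 1  # positive case missed
        else:
            if pred == positive:
                fp += 1  # negative case wrongly flagged
            else:
                tn += 1  # negative case correctly rejected

    def ratio(num, den):
        # Undefined ratios (no cases of that class) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "accuracy": (tp + tn) / len(y_true),
    }
```

When positives are rare, a classifier can reach high accuracy and specificity while missing most positive cases, which is consistent with the low sensitivity values reported for most methods in the abstract.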
Improving protein–protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model

PubMed Central

An, Ji‐Yong; Meng, Fan‐Rong; Chen, Xing; Yan, Gui‐Ying; Hu, Ji‐Pu

2016-01-01

Predicting protein–protein interactions (PPIs) is a challenging task and essential to construct the protein interaction networks, which is important for facilitating our understanding of the mechanisms of biological systems. Although a number of high‐throughput technologies have been proposed to predict PPIs, there are unavoidable shortcomings, including high cost, time intensity, and inherently high false positive rates. For these reasons, many computational methods have been proposed for predicting PPIs. However, the problem is still far from being solved. In this article, we propose a novel computational method called RVM‐BiGP that combines the relevance vector machine (RVM) model and Bi‐gram Probabilities (BiGP) for PPIs detection from protein sequences. The major improvements include: (1) protein sequences are represented using the Bi‐gram probabilities (BiGP) feature representation on a Position Specific Scoring Matrix (PSSM), in which the protein evolutionary information is contained; (2) to reduce the influence of noise, the Principal Component Analysis (PCA) method is used to reduce the dimension of the BiGP vector; (3) the powerful and robust Relevance Vector Machine (RVM) algorithm is used for classification. Five‐fold cross‐validation experiments on yeast and Helicobacter pylori datasets achieved very high accuracies of 94.57% and 90.57%, respectively. Experimental results are significantly better than previous methods. To further evaluate the proposed method, we compare it with the state‐of‐the‐art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM‐BiGP method is significantly better than the SVM‐based method. In addition, we achieved 97.15% accuracy on the imbalanced yeast dataset, which is higher than that on the balanced yeast dataset. The promising experimental results show the efficiency and robustness of the proposed method, which can be an automatic decision support tool for future proteomics research. For facilitating extensive studies for future proteomics research, we developed a freely available web server called RVM‐BiGP‐PPIs in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including source code and the datasets, is available at http://219.219.62.123:8888/BiGP/. PMID:27452983
Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

PubMed

Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

2016-01-01

With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.
Support vector machine firefly algorithm based optimization of lens system.

PubMed

Shamshirband, Shahaboddin; Petković, Dalibor; Pavlović, Nenad T; Ch, Sudheer; Altameem, Torki A; Gani, Abdullah

2015-01-01

Lens system design is an important factor in image quality. The main aspect of the lens system design methodology is the optimization procedure. Since optimization is a complex, nonlinear task, soft computing optimization algorithms can be used. There are many tools that can be employed to measure optical performance, but the spot diagram is the most useful. The spot diagram gives an indication of the image of a point object. In this paper, the spot size radius is considered an optimization criterion. Intelligent soft computing scheme support vector machines (SVMs) coupled with the firefly algorithm (FFA) are implemented. The performance of the proposed estimators is confirmed with the simulation results. The result of the proposed SVM-FFA model has been compared with support vector regression (SVR), artificial neural networks, and generic programming methods. The results show that the SVM-FFA model performs more accurately than the other methodologies. Therefore, SVM-FFA can be used as an efficient soft computing technique in the optimization of lens system designs.
Cervical cancer survival prediction using hybrid of SMOTE, CART and smooth support vector machine

NASA Astrophysics Data System (ADS)

Purnami, S. W.; Khasanah, P. M.; Sumartini, S. H.; Chosuvivatwong, V.; Sriplung, H.

2016-04-01

According to the WHO, one patient dies from cervical cancer every two minutes. The high mortality rate is due to the lack of awareness among women of early detection. Several factors are thought to influence the survival of cervical cancer patients, including age, anemia status, stage, type of treatment, complications and secondary disease. This study aims to classify/predict cervical cancer survival based on those factors. Several classification methods were used: classification and regression tree (CART), smooth support vector machine (SSVM), and three order spline SSVM (TSSVM). Since the cervical cancer data are imbalanced, the synthetic minority oversampling technique (SMOTE) is used to handle the imbalanced dataset. The performance of these methods is evaluated using accuracy, sensitivity and specificity. The results of this study show that balancing the data using SMOTE as preprocessing can improve classification performance. The SMOTE-SSVM method provided better results than SMOTE-TSSVM and SMOTE-CART.
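The abstract above does not describe how SMOTE was configured. As a rough sketch of the general idea, SMOTE creates each synthetic minority sample by interpolating between a minority sample and one of its nearest minority-class neighbours. The function name `smote_sketch`, the neighbour count `k`, and the seed below are hypothetical choices for illustration, not the authors' settings.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

The synthetic samples would be appended to the minority class before training, so that the classifier sees a more balanced class distribution.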
Deep learning of support vector machines with class probability output networks.

PubMed

Kim, Sangwook; Yu, Zhibin; Kil, Rhee Man; Lee, Minho

2015-04-01

Deep learning methods endeavor to learn features automatically at multiple levels and allow systems to learn complex functions mapping from the input space to the output space for the given data. The ability to learn powerful features automatically is increasingly important as the volume of data and range of applications of machine learning methods continues to grow. This paper proposes a new deep architecture that uses support vector machines (SVMs) with class probability output networks (CPONs) to provide better generalization power for pattern classification problems. As a result, deep features are extracted without additional feature engineering steps, using multiple layers of the SVM classifiers with CPONs. The proposed structure closely approaches the ideal Bayes classifier as the number of layers increases. Using a simulation of classification problems, the effectiveness of the proposed method is demonstrated. Copyright © 2014 Elsevier Ltd. All rights reserved.
Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

PubMed Central

Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

2016-01-01

Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505
Breast Cancer Recognition Using a Novel Hybrid Intelligent Method

PubMed Central

Addeh, Jalil; Ebrahimzadeh, Ata

2012-01-01

Breast cancer is the second largest cause of cancer deaths among women. At the same time, it is also among the most curable cancer types if it can be diagnosed early. This paper presents a novel hybrid intelligent method for recognition of breast cancer tumors. The proposed method includes three main modules: the feature extraction module, the classifier module, and the optimization module. In the feature extraction module, fuzzy features are proposed as the efficient characteristic of the patterns. In the classifier module, because of the promising generalization capability of support vector machines (SVM), an SVM-based classifier is proposed. In support vector machine training, the hyperparameters play a very important role in recognition accuracy. Therefore, in the optimization module, the bees algorithm (BA) is proposed for selecting appropriate parameters of the classifier. The proposed system is tested on the Wisconsin Breast Cancer database, and simulation results show that the recommended system has high accuracy. PMID:23626945
Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

PubMed

Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

2016-01-01

Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
Effective 2D-3D medical image registration using Support Vector Machine.

PubMed

Qi, Wenyuan; Gu, Lixu; Zhao, Qiang

2008-01-01

Registration of pre-operative 3D volume datasets with intra-operative 2D images is gradually becoming an important technique to assist radiologists in diagnosing complicated diseases easily and quickly. In this paper, we propose a novel 2D/3D registration framework based on Support Vector Machine (SVM) to compensate for the disadvantage of generating a large number of DRR images during the intra-operative stage. An estimated similarity metric distribution is built from the relationship between the transform parameters and prior sparse target metric values by means of support vector regression (SVR). Based on this distribution, globally optimal transform parameters are then found by an optimizer in order to guide the 3D volume dataset to match the intra-operative 2D image. Experiments reveal that the proposed registration method improves performance compared with the conventional registration method and also provides a precise registration result efficiently.
A Novel Degradation Identification Method for Wind Turbine Pitch System

NASA Astrophysics Data System (ADS)

Guo, Hui-Dong

2018-04-01

It is difficult for the traditional threshold value method to identify degradation of operating equipment accurately. A novel degradation evaluation method suitable for implementing a wind turbine condition maintenance strategy is proposed in this paper. Based on an analysis of the typical variable-speed pitch-to-feather control principle and the monitoring parameters of the pitch system, a multi-input multi-output (MIMO) regression model was applied to the pitch system, with wind speed and power generation as input parameters and wheel rotation speed, pitch angle, and motor driving current for the three blades as output parameters. Then, the difference between the on-line measurement and the value calculated by the MIMO regression model, built with the least squares support vector machine (LSSVM) method, was defined as the Observed Vector of the system. A Gaussian mixture model (GMM) was fitted to the distribution of the multi-dimensional Observed Vectors. Using the established model, the Degradation Index was calculated from the SCADA data of a wind turbine with a damaged pitch bearing retainer and rolling body, which illustrated the feasibility of the proposed method.

Ischemic stroke lesion segmentation in multi-spectral MR images with support vector machine classifiers

NASA Astrophysics Data System (ADS)

Maier, Oskar; Wilms, Matthias; von der Gablentz, Janina; Krämer, Ulrike; Handels, Heinz

2014-03-01

Automatic segmentation of ischemic stroke lesions in magnetic resonance (MR) images is important in clinical practice and for neuroscientific trials. The key problem is to detect largely inhomogeneous regions of varying sizes, shapes and locations. We present a stroke lesion segmentation method based on local features extracted from multi-spectral MR data that are selected to model a human observer's discrimination criteria. A support vector machine classifier is trained on expert-segmented examples and then used to classify formerly unseen images. Leave-one-out cross validation on eight datasets with lesions of varying appearances is performed, showing our method to compare favourably with other published approaches in terms of accuracy and robustness. Furthermore, we compare a number of feature selectors and closely examine each feature's and MR sequence's contribution.
Comparison of different wind data interpolation methods for a region with complex terrain in Central Asia

NASA Astrophysics Data System (ADS)

Reinhardt, Katja; Samimi, Cyrus

2018-01-01

While climatological data of high spatial resolution are largely available in most developed countries, the network of climatological stations in many other regions of the world still has large gaps. Especially for those regions, interpolation methods are important tools to fill these gaps and to improve the data base indispensable for climatological research. Over the last years, new hybrid methods of machine learning and geostatistics have been developed which provide innovative prospects in spatial predictive modelling. This study focuses on evaluating the performance of 12 different interpolation methods for the wind components u and v in a mountainous region of Central Asia, with a special focus on applying new hybrid methods to the spatial interpolation of wind data. This study is the first to evaluate and compare the performance of several of these hybrid methods. The overall aim of this study is to determine whether an optimal interpolation method exists, which can equally be applied for all pressure levels, or whether different interpolation methods have to be used for the different pressure levels. Deterministic (inverse distance weighting) and geostatistical interpolation methods (ordinary kriging) were explored, which take into account only the initial values of u and v. In addition, more complex methods (generalized additive model, support vector machine and neural networks as single methods and as hybrid methods as well as regression-kriging) that consider additional variables were applied. The analysis of the error indices revealed that regression-kriging provided the most accurate interpolation results for both wind components and all pressure heights. At 200 and 500 hPa, regression-kriging is followed by the different kinds of neural networks and support vector machines and for 850 hPa it is followed by the different types of support vector machine and ordinary kriging. Overall, explanatory variables improve the interpolation results.
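Inverse distance weighting, the deterministic baseline named in the abstract above, estimates a value at an unsampled location as a weighted mean of the station values, with weights that decrease with distance. The sketch below assumes planar coordinates and a distance exponent of 2; the function name `idw_interpolate` and both of those choices are illustrative assumptions, since the abstract does not state the settings used.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    stations: list of ((x, y), value) pairs, e.g. one wind component per station.
    target:   (x, y) location at which to interpolate.
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: return its value directly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("at least one station is required")
    return numerator / denominator
```

Because the estimate depends only on the station values and their distances, it cannot use explanatory variables such as elevation, which is the limitation the more complex methods in the study are meant to address.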
Support Vector Hazards Machine: A Counting Process Framework for Learning Risk Scores for Censored Outcomes.

PubMed

Wang, Yuanjia; Chen, Tianle; Zeng, Donglin

2016-01-01

Learning risk scores to predict dichotomous or continuous outcomes using machine learning approaches has been studied extensively. However, how to learn risk scores for time-to-event outcomes subject to right censoring has received little attention until recently. Existing approaches rely on inverse probability weighting or rank-based regression, which may be inefficient. In this paper, we develop a new support vector hazards machine (SVHM) approach to predict censored outcomes. Our method is based on predicting the counting process associated with the time-to-event outcomes among subjects at risk via a series of support vector machines. Introducing counting processes to represent time-to-event data leads to a connection between support vector machines in supervised learning and hazards regression in standard survival analysis. To account for different at risk populations at observed event times, a time-varying offset is used in estimating risk scores. The resulting optimization is a convex quadratic programming problem that can easily incorporate non-linearity using kernel trick. We demonstrate an interesting link from the profiled empirical risk function of SVHM to the Cox partial likelihood. We then formally show that SVHM is optimal in discriminating covariate-specific hazard function from population average hazard function, and establish the consistency and learning rate of the predicted risk using the estimated risk scores. Simulation studies show improved prediction accuracy of the event times using SVHM compared to existing machine learning methods and standard conventional approaches. Finally, we analyze two real world biomedical study data where we use clinical markers and neuroimaging biomarkers to predict age-at-onset of a disease, and demonstrate superiority of SVHM in distinguishing high risk versus low risk subjects.
Hybrid PSO-ASVR-based method for data fitting in the calibration of infrared radiometer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Sen; Li, Chengwei, E-mail: heikuanghit@163.com

2016-06-15

The present paper describes a hybrid particle swarm optimization-adaptive support vector regression (PSO-ASVR)-based method for data fitting in the calibration of infrared radiometer. The proposed hybrid PSO-ASVR-based method is based on PSO in combination with Adaptive Processing and Support Vector Regression (SVR). The optimization technique involves setting parameters in the ASVR fitting procedure, which significantly improves the fitting accuracy. However, its use in the calibration of infrared radiometer has not yet been widely explored. Bearing this in mind, the PSO-ASVR-based method, which is based on the statistical learning theory, is successfully used here to get the relationship between the radiation of a standard source and the response of an infrared radiometer. Main advantages of this method are the flexible adjustment mechanism in data processing and the optimization mechanism in a kernel parameter setting of SVR. Numerical examples and applications to the calibration of infrared radiometer are performed to verify the performance of PSO-ASVR-based method compared to conventional data fitting methods.
Detection of Alzheimer's Disease by Three-Dimensional Displacement Field Estimation in Structural Magnetic Resonance Imaging.

PubMed

Wang, Shuihua; Zhang, Yudong; Liu, Ge; Phillips, Preetha; Yuan, Ti-Fei

2016-01-01

Within the past decade, computer scientists have developed many methods using computer vision and machine learning techniques to detect Alzheimer's disease (AD) in its early stages. However, some of these methods are unable to achieve excellent detection accuracy, and several other methods are unable to locate AD-related regions. Hence, our goal was to develop a novel AD brain detection method. In this study, our method was based on the three-dimensional (3D) displacement-field (DF) estimation between subjects in the healthy elder control group and AD group. The 3D-DF was treated as AD-related features. Three feature selection measures were used: the Bhattacharyya distance, Student's t-test, and Welch's t-test (WTT). Two non-parallel support vector machines, i.e., generalized eigenvalue proximal support vector machine and twin support vector machine (TSVM), were then used for classification. A 50 × 10-fold cross validation was implemented for statistical analysis. The results showed that "3D-DF+WTT+TSVM" achieved the best performance, with an accuracy of 93.05 ± 2.18, a sensitivity of 92.57 ± 3.80, a specificity of 93.18 ± 3.35, and a precision of 79.51 ± 2.86. This method also outperformed 13 state-of-the-art approaches. Additionally, we were able to detect 17 regions related to AD by using the pure computer-vision technique. These regions include sub-gyral, inferior parietal lobule, precuneus, angular gyrus, lingual gyrus, supramarginal gyrus, postcentral gyrus, third ventricle, superior parietal lobule, thalamus, middle temporal gyrus, precentral gyrus, superior temporal gyrus, superior occipital gyrus, cingulate gyrus, culmen, and insula. These regions were reported in recent publications. The 3D-DF is effective in AD subject and related region detection.
Monthly evaporation forecasting using artificial neural networks and support vector machines

NASA Astrophysics Data System (ADS)

Tezel, Gulay; Buyukyildiz, Meral

2016-04-01

Evaporation is one of the most important components of the hydrological cycle, but is relatively difficult to estimate, due to its complexity, as it can be influenced by numerous factors. Estimation of evaporation is important for the design of reservoirs, especially in arid and semi-arid areas. Artificial neural network methods and support vector machines (SVM) are frequently utilized to estimate evaporation and other hydrological variables. In this study, usability of artificial neural networks (ANNs) (multilayer perceptron (MLP) and radial basis function network (RBFN)) and ɛ-support vector regression (SVR) artificial intelligence methods was investigated to estimate monthly pan evaporation. For this aim, temperature, relative humidity, wind speed, and precipitation data for the period 1972 to 2005 from Beysehir meteorology station were used as input variables while pan evaporation values were used as output. The Romanenko and Meyer method was also considered for the comparison. The results were compared with observed class A pan evaporation data. In MLP method, four different training algorithms, gradient descent with momentum and adaptive learning rule backpropagation (GDX), Levenberg-Marquardt (LVM), scaled conjugate gradient (SCG), and resilient backpropagation (RBP), were used. Also, ɛ-SVR model was used as SVR model. The models were designed via 10-fold cross-validation (CV); algorithm performance was assessed via mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). According to the performance criteria, the ANN algorithms and ɛ-SVR had similar results. The ANNs and ɛ-SVR methods were found to perform better than the Romanenko and Meyer methods. Consequently, the best performance using the test data was obtained using SCG(4,2,2,1) with R² = 0.905.
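The three performance criteria named in the abstract above can be sketched in a few lines. This is an illustration only: it assumes the coefficient of determination is computed as 1 − SS_res/SS_tot, whereas the study may instead have used the squared correlation coefficient, and the function name is hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R^2) for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                     # assumed definition of R^2
    return mae, rmse, r2


# Predicting the mean of the observations everywhere gives R^2 = 0.
mae, rmse, r2 = regression_metrics([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```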
A review of the vector management methods to prevent and control outbreaks of West Nile virus infection and the challenge for Europe

PubMed Central

2014-01-01

West Nile virus infection is a growing concern in Europe. Vector management is often the primary option to prevent and control outbreaks of the disease. Its implementation is, however, complex and needs to be supported by integrated multidisciplinary surveillance systems and to be organized within the framework of predefined response plans. The impact of the vector control measures depends on multiple factors and the identification of the best combination of vector control methods is therefore not always straightforward. Therefore, this contribution aims at critically reviewing the existing vector control methods to prevent and control outbreaks of West Nile virus infection and to present the challenges for Europe. Most West Nile virus vector control experiences have been recently developed in the US, where ecological conditions are different from the EU and vector control is organized under a different regulatory frame. The extrapolation of information produced in North America to Europe might be limited because of the seemingly different epidemiology in the European region. Therefore, there is an urgent need to analyse the European experiences of the prevention and control of outbreaks of West Nile virus infection and to perform robust cost-benefit analysis that can guide the implementation of the appropriate control measures. Furthermore, to be effective, vector control programs require a strong organisational backbone relying on a previously defined plan, skilled technicians and operators, appropriate equipment, and sufficient financial resources. A decision making guide scheme is proposed which may assist in the process of implementation of vector control measures tailored on specific areas and considering the available information and possible scenarios. PMID:25015004
Stroke localization and classification using microwave tomography with k-means clustering and support vector machine.

PubMed

Guo, Lei; Abbosh, Amin

2018-05-01

For any chance for stroke patients to survive, the stroke type should be classified to enable giving medication within a few hours of the onset of symptoms. In this paper, a microwave-based stroke localization and classification framework is proposed. It is based on microwave tomography, k-means clustering, and a support vector machine (SVM) method. The dielectric profile of the brain is first calculated using the Born iterative method, whereas the amplitude of the dielectric profile is then taken as the input to k-means clustering. The cluster is selected as the feature vector for constructing and testing the SVM. A database of MRI-derived realistic head phantoms at different signal-to-noise ratios is used in the classification procedure. The performance of the proposed framework is evaluated using the receiver operating characteristic (ROC) curve. The results based on a two-dimensional framework show that 88% classification accuracy, with a sensitivity of 91% and a specificity of 87%, can be achieved. Bioelectromagnetics. 39:312-324, 2018. © 2018 Wiley Periodicals, Inc.
Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

PubMed Central

2011-01-01

Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
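The abstract above reports a classifier with the highest accuracy but very low sensitivity, which is easy to overlook when only accuracy is quoted. The sketch below computes the three rates from predicted and true labels. The counts are invented for illustration so that the pattern resembles the one described; they are not the study's data, and the function name is hypothetical.

```python
def classification_rates(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive:
            if guess == positive:
                tp += 1          # positive case correctly detected
            else:
                fn += 1          # positive case missed
        else:
            if guess == positive:
                fp += 1          # negative case wrongly flagged
            else:
                tn += 1          # negative case correctly rejected
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Invented example: 10 positive cases (3 detected) and 20 negative cases
# (all correctly rejected). Accuracy looks good; sensitivity does not.
y_true = [1] * 10 + [0] * 20
y_pred = [1] * 3 + [0] * 7 + [0] * 20
rates = classification_rates(y_true, y_pred)
```

With unbalanced classes, a model can reach high accuracy mainly by rejecting almost everything, which is why the abstract weighs all three measures together.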
Spectrophotometric determination of ternary mixtures of thiamin, riboflavin and pyridoxal in pharmaceutical and human plasma by least-squares support vector machines.

PubMed

Niazi, Ali; Zolgharnein, Javad; Afiuni-Zadeh, Somaie

2007-11-01

Ternary mixtures of thiamin, riboflavin and pyridoxal have been simultaneously determined in synthetic and real samples by applications of spectrophotometric and least-squares support vector machines. The calibration graphs were linear in the ranges of 1.0 - 20.0, 1.0 - 10.0 and 1.0 - 20.0 microg ml(-1) with detection limits of 0.6, 0.5 and 0.7 microg ml(-1) for thiamin, riboflavin and pyridoxal, respectively. The experimental calibration matrix was designed with 21 mixtures of these chemicals. The concentrations were varied between calibration graph concentrations of vitamins. The simultaneous determination of these vitamin mixtures by using spectrophotometric methods is a difficult problem, due to spectral interferences. The partial least squares (PLS) modeling and least-squares support vector machines were used for the multivariate calibration of the spectrophotometric data. An excellent model was built using LS-SVM, with low prediction errors and superior performance in relation to PLS. The root mean square errors of prediction (RMSEP) for thiamin, riboflavin and pyridoxal with PLS and LS-SVM were 0.6926, 0.3755, 0.4322 and 0.0421, 0.0318, 0.0457, respectively. The proposed method was satisfactorily applied to the rapid simultaneous determination of thiamin, riboflavin and pyridoxal in commercial pharmaceutical preparations and human plasma samples.
Prediction of Human Intestinal Absorption of Compounds Using Artificial Intelligence Techniques.

PubMed

Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar

2017-01-01

Information about the pharmacokinetics of compounds is an essential component of drug design and development. Modeling the pharmacokinetic properties requires identification of the factors affecting absorption, distribution, metabolism and excretion of compounds. There have been continuous attempts in the prediction of intestinal absorption of compounds using various Artificial intelligence methods in the effort to reduce the attrition rate of drug candidates entering preclinical and clinical trials. Currently, there are large numbers of individual predictive models available for absorption using machine learning approaches. Six Artificial intelligence methods, namely Support vector machine, k-nearest neighbor, Probabilistic neural network, Artificial neural network, Partial least square and Linear discriminant analysis, were used for prediction of absorption of compounds. Prediction accuracy of Support vector machine, k-nearest neighbor, Probabilistic neural network, Artificial neural network, Partial least square and Linear discriminant analysis for prediction of intestinal absorption of compounds was found to be 91.54%, 88.33%, 84.30%, 86.51%, 79.07% and 80.08% respectively. Comparative analysis of all the six prediction models suggested that Support vector machine with Radial basis function based kernel is comparatively better for binary classification of compounds using human intestinal absorption and may be useful at preliminary stages of drug design and development. Copyright © Bentham Science Publishers.
[Discrimination of types of polyacrylamide based on near infrared spectroscopy coupled with least square support vector machine].

PubMed

Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang

2014-04-01

In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis technique and least square support vector machine was proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Then principal component analysis method was applied to reduce the dimension of the spectral data and extract the principal components. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Then those principal components were also used as inputs of least square support vector machine model. The optimization of the parameters and the number of principal components used as inputs of least square support vector machine model was performed through cross validation based on grid search. 60 samples of each type of Polyacrylamide were collected. Thus a total of 180 samples were obtained. 135 samples, 45 samples for each type of Polyacrylamide, were randomly split into a training set to build calibration model and the remaining 45 samples were used as test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportions of Non-ionic Polyacrylamide were also prepared to show the feasibility of the proposed method to discriminate the adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by F statistical significance test method based on the prediction error of the training set of corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixing samples was also presented, and all mixing samples were accurately discriminated as adulterated samples. The overall results demonstrate that the discrimination method proposed in the present paper can rapidly and nondestructively discriminate the different types of Polyacrylamide and the adulterated Polyacrylamide samples, and offers a new approach to discriminate the types of Polyacrylamide.
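The sampling scheme in the abstract above (180 samples, 45 per type drawn at random for training, the remaining 15 per type held out) amounts to a stratified split. A minimal sketch follows, assuming samples are identified by their index and class label; the function name and the fixed seed are illustrative choices, not details from the paper.

```python
import random


def stratified_split(labels, n_train_per_class, seed=0):
    """Randomly pick n_train_per_class indices of each class for training.

    Returns (train_indices, test_indices), both sorted.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        train.extend(shuffled[:n_train_per_class])
        test.extend(shuffled[n_train_per_class:])
    return sorted(train), sorted(test)


# 60 samples of each of the three polyacrylamide types, 45 per type for training.
labels = ["non-ionic"] * 60 + ["anionic"] * 60 + ["cationic"] * 60
train_idx, test_idx = stratified_split(labels, n_train_per_class=45)
```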
Boosting specificity of MEG artifact removal by weighted support vector machine.

PubMed

Duan, Fang; Phothisonothai, Montri; Kikuchi, Mitsuru; Yoshimura, Yuko; Minabe, Yoshio; Watanabe, Kastumi; Aihara, Kazuyuki

2013-01-01

An automatic artifact removal method of magnetoencephalogram (MEG) was presented in this paper. The method proposed is based on independent components analysis (ICA) and support vector machine (SVM). However, different from the previous studies, in this paper we consider two factors which would influence the performance. First, the imbalance factor of independent components (ICs) of MEG is handled by weighted SVM. Second, instead of simply setting a fixed weight to each class, a re-weighting scheme is used for the preservation of useful MEG ICs. Experimental results on manually marked MEG dataset showed that the method proposed could correctly distinguish the artifacts from the MEG ICs. Meanwhile, 99.72% ± 0.67 of MEG ICs were preserved. The classification accuracy was 97.91% ± 1.39. In addition, it was found that this method was not sensitive to individual differences. The cross validation (leave-one-subject-out) results showed an averaged accuracy of 97.41% ± 2.14.
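The leave-one-subject-out validation mentioned in the abstract above keeps all components from one subject out of training in each round, so the score reflects performance on an unseen person. A minimal sketch, assuming each sample carries a subject identifier (the identifiers below are hypothetical):

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical example: six independent components recorded from three subjects.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))
```

Averaging the per-fold accuracies gives a figure comparable in kind to the averaged accuracy the abstract reports.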
Feature Selection in Order to Extract Multiple Sclerosis Lesions Automatically in 3D Brain Magnetic Resonance Images Using Combination of Support Vector Machine and Genetic Algorithm.

PubMed

Khotanlou, Hassan; Afrasiabi, Mahlagha

2012-10-01

This paper presents a new feature selection approach for automatically extracting multiple sclerosis (MS) lesions in three-dimensional (3D) magnetic resonance (MR) images. The presented method is applicable to different types of MS lesions. In this method, T1, T2, and fluid attenuated inversion recovery (FLAIR) images are first preprocessed. In the next phase, effective features to extract MS lesions are selected by using a genetic algorithm (GA). The fitness function of the GA is the Similarity Index (SI) of a support vector machine (SVM) classifier. The results obtained on different types of lesions have been evaluated by comparison with manual segmentations. This algorithm is evaluated on 15 real 3D MR images using several measures. As a result, the SI between MS regions determined by the proposed method and radiologists was 87% on average. Experiments and comparisons with other methods show the effectiveness and the efficiency of the proposed approach.
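The abstract above does not define the Similarity Index. The sketch below assumes it is the Dice-style overlap commonly used to compare segmentations, SI = 2|A ∩ M| / (|A| + |M|), where A is the set of voxels labelled as lesion automatically and M the set labelled manually. Both the definition and the function name are assumptions rather than details confirmed by the source.

```python
def similarity_index(auto_voxels, manual_voxels):
    """Dice-style overlap between two sets of voxel coordinates (0.0 to 1.0)."""
    auto_set, manual_set = set(auto_voxels), set(manual_voxels)
    if not auto_set and not manual_set:
        return 1.0  # two empty segmentations agree completely
    overlap = len(auto_set & manual_set)
    return 2.0 * overlap / (len(auto_set) + len(manual_set))


# Toy example: three voxels each, two of them shared.
auto = [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
manual = [(0, 0, 1), (0, 1, 0), (1, 1, 1)]
si = similarity_index(auto, manual)
```

Used as a GA fitness function, a score of this kind would be computed for the segmentation produced with each candidate feature subset, with higher values favoured.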
Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables.

PubMed

Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco

2018-03-01

This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms (k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene (XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.
Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)

NASA Astrophysics Data System (ADS)

Zhang, Qigui; Deng, Kai

2016-12-01

As portable digital cameras and camera phones become more and more popular, there is an equally pressing need to let people shoot at any time and to identify and store handwritten characters. In this paper, a genetic algorithm (GA) and a support vector machine (SVM) are used for identification of handwriting. Compared with the parameter-optimized method, this technique overcomes two defects: first, the tendency to become trapped in a local optimum; second, the loss of classification and prediction efficiency when the best parameters are searched for over a larger range. As the experimental results suggest, GA-SVM has a higher recognition rate.
Optimization of Support Vector Machine (SVM) for Object Classification

NASA Technical Reports Server (NTRS)

Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

2012-01-01

The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.
Applications of Support Vector Machines In Chemo And Bioinformatics

NASA Astrophysics Data System (ADS)

Jayaraman, V. K.; Sundararajan, V.

2010-10-01

Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.
The Application of Auto-Disturbance Rejection Control Optimized by Least Squares Support Vector Machines Method and Time-Frequency Representation in Voltage Source Converter-High Voltage Direct Current System.

PubMed

Liu, Ying-Pei; Liang, Hai-Ping; Gao, Zhong-Ke

2015-01-01

In order to improve the performance of voltage source converter-high voltage direct current (VSC-HVDC) system, we propose an improved auto-disturbance rejection control (ADRC) method based on least squares support vector machines (LSSVM) in the rectifier side. Firstly, we deduce the high frequency transient mathematical model of VSC-HVDC system. Then we investigate the ADRC and LSSVM principles. We ignore the tracking differentiator in the ADRC controller aiming to improve the system dynamic response speed. On this basis, we derive the mathematical model of ADRC controller optimized by LSSVM for direct current voltage loop. Finally we carry out simulations to verify the feasibility and effectiveness of our proposed control method. In addition, we employ the time-frequency representation methods, i.e., Wigner-Ville distribution (WVD) and adaptive optimal kernel (AOK) time-frequency representation, to demonstrate our proposed method performs better than the traditional method from the perspective of energy distribution in time and frequency plane.
The Application of Auto-Disturbance Rejection Control Optimized by Least Squares Support Vector Machines Method and Time-Frequency Representation in Voltage Source Converter-High Voltage Direct Current System

PubMed Central

Gao, Zhong-Ke

2015-01-01

In order to improve the performance of voltage source converter-high voltage direct current (VSC-HVDC) system, we propose an improved auto-disturbance rejection control (ADRC) method based on least squares support vector machines (LSSVM) in the rectifier side. Firstly, we deduce the high frequency transient mathematical model of VSC-HVDC system. Then we investigate the ADRC and LSSVM principles. We ignore the tracking differentiator in the ADRC controller aiming to improve the system dynamic response speed. On this basis, we derive the mathematical model of ADRC controller optimized by LSSVM for direct current voltage loop. Finally we carry out simulations to verify the feasibility and effectiveness of our proposed control method. In addition, we employ the time-frequency representation methods, i.e., Wigner-Ville distribution (WVD) and adaptive optimal kernel (AOK) time-frequency representation, to demonstrate our proposed method performs better than the traditional method from the perspective of energy distribution in time and frequency plane. PMID:26098556

Improving Non-Destructive Concrete Strength Tests Using Support Vector Machines

PubMed Central

Shih, Yi-Fan; Wang, Yu-Ren; Lin, Kuo-Liang; Chen, Chin-Wen

2015-01-01

Non-destructive testing (NDT) methods are important alternatives when destructive tests are not feasible to examine the in situ concrete properties without damaging the structure. The rebound hammer test and the ultrasonic pulse velocity test are two popular NDT methods to examine the properties of concrete. The rebound of the hammer depends on the hardness of the test specimen and ultrasonic pulse travelling speed is related to density, uniformity, and homogeneity of the specimen. Both of these two methods have been adopted to estimate the concrete compressive strength. Statistical analysis has been implemented to establish the relationship between hammer rebound values/ultrasonic pulse velocities and concrete compressive strength. However, the estimated results can be unreliable. As a result, this research proposes an Artificial Intelligence model using support vector machines (SVMs) for the estimation. Data from 95 cylinder concrete samples are collected to develop and validate the model. The results show that combined NDT methods (also known as SonReb method) yield better estimations than single NDT methods. The results also show that the SVMs model is more accurate than the statistical regression model. PMID:28793627
User's guide to PANCOR: A panel method program for interference assessment in slotted-wall wind tunnels

NASA Technical Reports Server (NTRS)

Kemp, William B., Jr.

1990-01-01

Guidelines are presented for use of the computer program PANCOR to assess the interference due to tunnel walls and model support in a slotted wind tunnel test section at subsonic speeds. Input data requirements are described in detail and program output and general program usage are described. The program is written for effective automatic vectorization on a CDC CYBER 200 class vector processing system.
Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

PubMed

Shamim, Mohammad Tabrez Anwar; Anwaruddin, Mohammad; Nagarajaram, H A

2007-12-15

Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy by any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and Crammer and Singer method, yield similar predictions. Dataset and stand-alone program are available upon request.
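The abstract above compares three ways of applying SVMs to a multi-class problem. The sketch below shows only the bookkeeping for the two decomposition schemes: one versus all trains one binary classifier per class, and one versus one trains one per pair of classes and predicts by majority vote. The Crammer and Singer method instead solves a single joint optimisation and is not shown. The class labels and function names are hypothetical, and no classifier is actually trained here.

```python
from collections import Counter
from itertools import combinations


def one_vs_all_tasks(classes):
    """One binary task per class: that class against all the others."""
    return [(c, [other for other in classes if other != c]) for c in classes]


def one_vs_one_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


def one_vs_one_predict(pairwise_winners):
    """Majority vote over the winners reported by the pairwise classifiers."""
    return Counter(pairwise_winners).most_common(1)[0][0]


folds = ["A", "B", "C", "D"]          # hypothetical fold labels
ova = one_vs_all_tasks(folds)         # k tasks
ovo = one_vs_one_tasks(folds)         # k * (k - 1) / 2 tasks
# Hypothetical winners of the six pairwise classifiers for one protein.
predicted = one_vs_one_predict(["A", "A", "C", "A", "B", "C"])
```

For k classes this gives k binary problems under one versus all and k(k − 1)/2 under one versus one, which is the main practical difference in training cost between the two.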
SNPs selection using support vector regression and genetic algorithms in GWAS

PubMed Central

2014-01-01

Introduction: This paper proposes a new methodology to simultaneously select the most relevant SNP markers for the characterization of any measurable phenotype described by a continuous variable using Support Vector Regression with Pearson Universal kernel as fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute towards considering several markers simultaneously to explain the phenotype and is based jointly on statistical tools, machine learning and computational intelligence. Results: The suggested method has shown potential in the simulated database 1, with additive effects only, and real database. In this simulated database, with a total of 1,000 markers, and 7 with major effect on the phenotype and the other 993 SNPs representing the noise, the method identified 21 markers. Of this total, 5 are relevant SNPs among the 7, but 16 are false positives. In the real database, initially with 50,752 SNPs, we have reduced to 3,073 markers, increasing the accuracy of the model. In the simulated database 2, with additive effects and interactions (epistasis), the proposed method matched the methodology most commonly used in GWAS. Conclusions: The method suggested in this paper demonstrates the effectiveness in explaining the real phenotype (PTA for milk), because with the application of the wrapper based on genetic algorithm and Support Vector Regression with Pearson Universal, many redundant markers were eliminated, increasing the prediction and accuracy of the model on the real database without quality control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels. PMID:25573332
Using support vector machines to improve elemental ion identification in macromolecular crystal structures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morshed, Nader; Lawrence Berkeley National Laboratory, Berkeley, CA 94720; Echols, Nathaniel, E-mail: nechols@lbl.gov

2015-05-01

A method to automatically identify possible elemental ions in X-ray crystal structures has been extended to use support vector machine (SVM) classifiers trained on selected structures in the PDB, with significantly improved sensitivity over manually encoded heuristics. In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.
A Fast Vector Radiative Transfer Model for Atmospheric and Oceanic Remote Sensing

NASA Astrophysics Data System (ADS)

Ding, J.; Yang, P.; King, M. D.; Platnick, S. E.; Meyer, K.

2017-12-01

A fast vector radiative transfer model is developed in support of atmospheric and oceanic remote sensing. This model is capable of simulating the Stokes vector observed at the top of the atmosphere (TOA) and the terrestrial surface by considering absorption, scattering, and emission. The gas absorption is parameterized in terms of atmospheric gas concentrations, temperature, and pressure. The parameterization scheme combines a regression method and the correlated-K distribution method, and can easily integrate with multiple scattering computations. The approach is more than four orders of magnitude faster than a line-by-line radiative transfer model with errors less than 0.5% in terms of transmissivity. A two-component approach is utilized to solve the vector radiative transfer equation (VRTE). The VRTE solver separates the phase matrices of aerosol and cloud into forward and diffuse parts and thus the solution is also separated. The forward solution can be expressed by a semi-analytical equation based on the small-angle approximation, and serves as the source of the diffuse part. The diffuse part is solved by the adding-doubling method. The adding-doubling implementation is computationally efficient because the diffuse component needs far fewer spherical function expansion terms. The simulated Stokes vectors at both the TOA and the surface have accuracy comparable to that of their counterparts based on numerically rigorous methods.
Supervised machine learning algorithms to diagnose stress for vehicle drivers based on physiological sensor signals.

PubMed

Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin

2015-01-01

Machine learning algorithms play an important role in computer science research. Recent advances in sensor data collection in the clinical sciences have led to complex, heterogeneous data processing and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, development of knowledge-based systems to support clinicians in decision-making is important. However, experimental work is needed to compare the performance of different machine learning methods and so help select an appropriate method for the specific characteristics of a data set. This paper compares the classification performance of three popular machine learning methods, i.e., case-based reasoning, neural networks and support vector machines, in diagnosing stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms the other two methods in terms of classification accuracy. Case-based reasoning achieved 80% and 86% accuracy in classifying stress using finger temperature and heart rate variability, respectively. In contrast, both the neural network and the support vector machine achieved less than 80% accuracy using both physiological signals.
Progressive Classification Using Support Vector Machines

NASA Technical Reports Server (NTRS)

Wagstaff, Kiri; Kocurek, Michael

2009-01-01

An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. 
The user can halt this reclassification process at any point, thereby obtaining the best possible result for a given amount of computation time. Alternatively, the results can be displayed as they are generated, providing the user with real-time feedback about the current accuracy of classification.
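The two-pass procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy models, the function names, and the use of the absolute decision value as the confidence index are all assumptions made for the example.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length tuples."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def decision_value(model, x):
    """Kernel SVM decision value; cost grows with the number of support vectors."""
    support_vectors, bias = model
    return sum(w * rbf(sv, x) for w, sv in support_vectors) + bias

def progressive_classify(points, fast_model, slow_model, budget):
    """Classify with the fast model, then refine the least confident points.

    budget is the number of points the slow model is allowed to reclassify.
    Assumption: |decision value| of the fast model serves as the confidence index.
    """
    scores = [decision_value(fast_model, p) for p in points]
    labels = [1 if s >= 0 else -1 for s in scores]
    # Least confident points (smallest |score|) are refined first.
    order = sorted(range(len(points)), key=lambda i: abs(scores[i]))
    for i in order[:budget]:
        labels[i] = 1 if decision_value(slow_model, points[i]) >= 0 else -1
    return labels

# Hypothetical toy models: (list of (weight, support vector), bias).
FAST_MODEL = ([(1.0, (1.0,)), (-1.0, (-1.0,))], 0.0)
SLOW_MODEL = ([(1.0, (1.0,)), (-1.0, (-1.0,)), (0.3, (-0.1,))], 0.0)
POINTS = [(-2.0,), (-0.1,), (0.05,), (2.0,)]
```

Stopping early corresponds to a small `budget`; a budget equal to the number of points reproduces the slow model's classification.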
Analyzing big data with the hybrid interval regression methods.

PubMed

Huang, Chia-Hui; Yang, Keng-Chieh; Kao, Han-Ying

2014-01-01

Big data is a current trend with a significant impact on information technologies. In big data applications, one of the main concerns is dealing with large-scale data sets that often require computation resources provided by public cloud services. Analyzing big data efficiently is therefore a major challenge. In this paper, we combine interval regression with the smooth support vector machine (SSVM) to analyze big data. The SSVM was recently proposed as an alternative to the standard SVM and has been shown to be more efficient than the traditional SVM in processing large-scale data. In addition, the soft margin method is proposed to modify the excursion of the separation margin and to be effective in the gray zone, where the distribution of data and the separation margin between classes become hard to describe. PMID:25143968
A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples.

PubMed

Li, Yankun; Shao, Xueguang; Cai, Wensheng

2007-04-15

Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on the principle of consensus modeling, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter out the spectral background and noise; then, the consensus LS-SVR technique was used to build the calibration model. After optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
A Support Vector Machine-Based Gender Identification Using Speech Signal

NASA Astrophysics Data System (ADS)

Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk

We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that separates two groups by finding an arbitrary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel feature fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed feature fusion technique is applied.
Application of a support vector machine algorithm to the safety precaution technique of medium-low pressure gas regulators

NASA Astrophysics Data System (ADS)

Hao, Xuejun; An, Xaioran; Wu, Bo; He, Shaoping

2018-02-01

In a gas pipeline system, safe operation of the gas regulators determines the stability of the fuel gas supply. The safety precaution system for medium-low pressure gas regulators at the Beijing Gas Group is not yet mature; therefore, optimizing the safety precaution technique has important social and economic significance. In this paper, based on the running status of the medium-low pressure gas regulators in the SCADA system, a new method for gas regulator safety precaution based on the support vector machine (SVM) is presented. This method takes the gas regulator outlet pressure data as the input variables of the SVM model and the fault categories and degree as the output variables, which effectively enhances the precaution accuracy and saves significant manpower and material resources.
On the use of feature selection to improve the detection of sea oil spills in SAR images

NASA Astrophysics Data System (ADS)

Mera, David; Bolon-Canedo, Veronica; Cotos, J. M.; Alonso-Betanzos, Amparo

2017-03-01

Fast and effective oil spill detection systems are crucial to ensure a proper response to environmental emergencies caused by hydrocarbon pollution on the ocean's surface. Typically, these systems uncover not only oil spills, but also a high number of look-alikes. Feature extraction is a critical and computationally intensive phase in which each detected dark spot is independently examined. Traditionally, detection systems use an arbitrary set of features to discriminate between oil spills and look-alike phenomena. However, Feature Selection (FS) methods based on Machine Learning (ML) have proved to be very useful in real domains for enhancing the generalization capabilities of classifiers while discarding irrelevant features. In this work, we present a generic and systematic approach, based on FS methods, for choosing a concise and relevant set of features to improve oil spill detection systems. We compared five FS methods: Correlation-based feature selection (CFS), Consistency-based filter, Information Gain, ReliefF and Recursive Feature Elimination for Support Vector Machine (SVM-RFE). They were applied to a 141-input vector composed of features from a collection of outstanding studies. Selected features were validated via a Support Vector Machine (SVM) classifier and the results were compared with previous works. Test experiments revealed that the classifier trained with the 6-input feature vector proposed by SVM-RFE achieved the best accuracy and Cohen's kappa coefficient (87.1% and 74.06%, respectively). This is a smaller feature combination with similar or even better classification accuracy than previous works. This finding makes it possible to speed up the feature extraction phase without reducing the classifier accuracy. 
Experiments also confirmed the significance of the geometrical features since 75.0% of the different features selected by the applied FS methods as well as 66.67% of the proposed 6-input feature vector belong to this category.
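Since the abstract above reports Cohen's kappa alongside accuracy, a short sketch of how both are computed from a confusion matrix may help. The matrix values below are made up for illustration and are not the study's data; the function names are likewise hypothetical.

```python
def accuracy(confusion):
    """Observed agreement: correctly classified samples over all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

def cohen_kappa(confusion):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    total = sum(sum(row) for row in confusion)
    p_observed = accuracy(confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    p_chance = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (p_observed - p_chance) / (1.0 - p_chance)

# Illustrative 2x2 matrix (rows: true class, columns: predicted class).
EXAMPLE = [[45, 5],
           [10, 40]]
```

For the illustrative matrix, the accuracy is 0.85 and kappa is 0.70, which shows how kappa discounts the 0.50 agreement that would be expected by chance.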
CompareSVM: supervised, Support Vector Machine (SVM) inference of gene regularity networks.

PubMed

Gillani, Zeeshan; Akash, Muhammad Sajid Hamid; Rahaman, M D Matiur; Chen, Ming

2014-11-30

Prediction of gene regulatory networks (GRNs) from expression data is a challenging task. Many methods have been developed to address this challenge, ranging from supervised to unsupervised methods. The most promising methods are based on the support vector machine (SVM). There is a need for a comprehensive analysis of the prediction accuracy of the supervised SVM method using different kernels under different biological experimental conditions and network sizes. We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRNs. Using CompareSVM, we investigated and evaluated different SVM kernel methods in detail on simulated microarray datasets of different sizes. The results obtained from CompareSVM showed that the accuracy of an inference method depends upon the nature of the experimental condition and the size of the network. For networks with fewer than 200 nodes, and on average over all network sizes, the SVM Gaussian kernel outperforms all the other inference methods on knockout, knockdown, and multifactorial datasets. For networks with a large number of nodes (~500), the choice of inference method depends upon the nature of the experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/ .
Robust support vector regression networks for function approximation with outliers.

PubMed

Chuang, Chen-Chia; Su, Shun-Feng; Jeng, Jin-Tsong; Hsiao, Chih-Ching

2002-01-01

Support vector regression (SVR) employs the support vector machine (SVM) to tackle problems of function approximation and regression estimation. SVR has been shown to have good robustness against noise. When the parameters used in SVR are improperly selected, however, overfitting may still occur, and the selection of the various parameters is not straightforward. Besides, in SVR, outliers may also be taken as support vectors. Such an inclusion of outliers among the support vectors may lead to serious overfitting. In this paper, a novel regression approach, termed the robust support vector regression (RSVR) network, is proposed to enhance the robustness of SVR. In the approach, traditional robust learning approaches are employed to improve the learning performance for any selected parameters. In the simulation results, our RSVR improved the performance of the learned systems in all cases. Besides, even when training lasted for a long period, the testing errors did not go up. In other words, the overfitting phenomenon is indeed suppressed.
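To illustrate why outliers tend to become support vectors in standard SVR, the sketch below evaluates the usual epsilon-insensitive loss and lists the points that fall outside the tube. It shows only standard SVR behavior, not the RSVR network proposed in the paper; the function names and residual values are hypothetical.

```python
def eps_insensitive_loss(residual, eps):
    """Standard SVR loss: zero inside the tube, linear outside it."""
    return max(0.0, abs(residual) - eps)

def outside_tube(residuals, eps):
    """Indices of points outside the eps-tube; these end up as support vectors."""
    return [i for i, r in enumerate(residuals) if abs(r) > eps]

# Illustrative residuals (target minus prediction); the last one is an outlier.
RESIDUALS = [0.05, -0.2, 3.0]
```

With `eps = 0.1`, the outlier at index 2 lies far outside the tube, so it is retained as a support vector and contributes the largest loss term, which is the effect the abstract warns about.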
Kennard-Stone combined with least square support vector machine method for noncontact discriminating human blood species

NASA Astrophysics Data System (ADS)

Zhang, Linna; Li, Gang; Sun, Meixiu; Li, Hongxiao; Wang, Zhennan; Li, Yingxin; Lin, Ling

2017-11-01

Identifying whole blood as either human or nonhuman is an important responsibility for import-export ports and inspection and quarantine departments. Analytical methods and DNA testing methods are usually destructive. Previous studies demonstrated that visible diffuse reflectance spectroscopy can achieve noncontact discrimination of human and nonhuman blood. An appropriate method for calibration set selection is very important for a robust quantitative model. In this paper, the Random Selection (RS) method and the Kennard-Stone (KS) method were applied to select samples for the calibration set. Moreover, a proper chemometric method can greatly improve the performance of a classification or quantification model. Partial Least Square Discrimination Analysis (PLSDA) is commonly used in identifying blood species with spectroscopic methods, and the Least Square Support Vector Machine (LSSVM) has proved well suited to discrimination analysis. In this research, both the PLSDA and LSSVM methods were used for human blood discrimination. Compared with PLSDA, LSSVM enhanced the performance of the identification models. The overall results showed that the LSSVM method is more feasible for identifying human and animal blood species and is a reliable, robust, effective, and accurate method for human blood identification.
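The abstract does not spell out the Kennard-Stone selection rule, so the following is a minimal sketch of the commonly described version: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbor is farthest away. Euclidean distance and the toy one-dimensional "spectra" are assumptions made for the example.

```python
import math

def kennard_stone(samples, k):
    """Return indices of k samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Seed with the pair of samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < k and remaining:
        # Pick the candidate whose closest selected sample is farthest away.
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected[:k]

# Toy data: each sample is a one-feature tuple.
TOY = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
```

Unlike random selection, this rule spreads the calibration set across the measured feature space, which is the property the study relies on.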
Ranking Support Vector Machine with Kernel Approximation.

PubMed

Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi

2017-01-01

Learning-to-rank algorithms have become important in recent years due to their successful application in information retrieval, recommender systems, computational biology, and so forth. The ranking support vector machine (RankSVM) is one of the state-of-the-art ranking models and has been widely used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problems. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of the kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. The primal truncated Newton method is used to optimize the pairwise L2-loss (squared hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method trains much faster than kernel RankSVM and achieves comparable or better performance than state-of-the-art ranking algorithms. PMID:28293256
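One of the two approximations named above, random Fourier features, can be sketched in a few lines. This is a generic illustration for the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2), not the paper's code; the function names, feature count, and seed are assumptions.

```python
import math
import random

def make_rff(dim, n_features, gamma, seed=0):
    """Build a random Fourier feature map approximating the Gaussian kernel."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2*gamma*I)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return transform

def approx_kernel(transform, x, y):
    """Inner product of the feature maps, which approximates k(x, y)."""
    return sum(a * b for a, b in zip(transform(x), transform(y)))

def exact_kernel(x, y, gamma):
    """Exact Gaussian kernel, for comparison."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
```

After the transform, a linear ranking model can be trained on the explicit features, so the full kernel matrix over all training pairs never has to be formed.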
A single EBV-based vector for stable episomal maintenance and expression of GFP in human embryonic stem cells.

PubMed

Thyagarajan, Bhaskar; Scheyhing, Kelly; Xue, Haipeng; Fontes, Andrew; Chesnut, Jon; Rao, Mahendra; Lakshmipathy, Uma

2009-03-01

Stable expression of transgenes in stem cells has been a challenge due to the nonavailability of efficient transfection methods and the inability of transgenes to support sustained gene expression. Several methods have been reported to stably modify both embryonic and adult stem cells. These methods rely on integration of the transgene into the genome of the host cell, which could result in an expression pattern dependent on the number of integrations and the genomic locus of integration. To overcome this issue, site-specific integration methods mediated by integrase, adeno-associated virus or via homologous recombination have been used to generate stable human embryonic stem cell (hESC) lines. In this study, we describe a vector that is maintained episomally in hESCs. The vector used in this study is based on components derived from the Epstein-Barr virus, containing the Epstein-Barr virus nuclear antigen 1 expression cassette and the OriP origin of replication. The vector also expresses the drug-resistance marker gene hygromycin, which allows for selection and long-term maintenance of cells harboring the plasmid. Using this vector system, we show sustained expression of green fluorescent protein in undifferentiated hESCs and their differentiating embryoid bodies. In addition, the stable hESC clones show comparable expression with and without drug selection. Consistent with this observation, bulk-transfected adipose tissue-derived mesenchymal stem cells showed persistent marker gene expression as they differentiate into adipocytes, osteoblasts and chondroblasts. Episomal vectors offer a fast and efficient method to create hESC reporter lines, which in turn allows one to test the effect of overexpression of various genes on stem cell growth, proliferation and differentiation.

Identification of DNA-binding proteins by combining auto-cross covariance transformation and ensemble learning.

PubMed

Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan

2016-04-20

DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next-generation sequencing techniques, the number of protein sequences is increasing at an unprecedented rate. Thus, it is necessary to develop computational methods that identify DNA-binding proteins based only on protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into a profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into an SVM to discriminate DNA-binding proteins from non-DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and a Matthews correlation coefficient of 0.5 in a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset show that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA-binding protein identification.
A method for generating double-ring-shaped vector beams

NASA Astrophysics Data System (ADS)

Huan, Chen; Xiao-Hui, Ling; Zhi-Hong, Chen; Qian-Guang, Li; Hao, Lv; Hua-Qing, Yu; Xu-Nong, Yi

2016-07-01

We propose a method for generating double-ring-shaped vector beams. A step phase introduced by a spatial light modulator (SLM) first makes the incident laser beam have a nodal cycle. This phase is dynamic in nature because it depends on the optical length. Then a Pancharatnam-Berry phase (PBP) optical element is used to manipulate the local polarization of the optical field by modulating the geometric phase. The experimental results show that this scheme can effectively create double-ring-shaped vector beams. It provides much greater flexibility to manipulate the phase and polarization by simultaneously modulating the dynamic and the geometric phases. Project supported by the National Natural Science Foundation of China (Grant No. 11547017), the Hubei Engineering University Research Foundation, China (Grant No. z2014001), and the Natural Science Foundation of Hubei Province, China (Grant No. 2014CFB578).
Predicting the host of influenza viruses based on the word vector.

PubMed

Xu, Beibei; Tan, Zhiying; Li, Kenli; Jiang, Taijiao; Peng, Yousong

2017-01-01

Newly emerging influenza viruses continue to threaten public health. A rapid determination of the host range of newly discovered influenza viruses would assist in early assessment of their risk. Here, we attempted to predict the host of influenza viruses using the Support Vector Machine (SVM) classifier based on the word vector, a new representation and feature extraction method for biological sequences. The results show that the length of the word within the word vector, the sequence type (DNA or protein) and the species from which the sequences were derived for generating the word vector all influence the performance of models in predicting the host of influenza viruses. In nearly all cases, the models built on the surface proteins hemagglutinin (HA) and neuraminidase (NA) (or their genes) produced better results than those built on internal influenza proteins (or their genes). The best performance was achieved when the model was built on the HA gene based on word vectors (three-letter words) generated from DNA sequences of the influenza virus. This resulted in accuracies of 99.7% for avian, 96.9% for human and 90.6% for swine influenza viruses. Compared to the method of sequence homology best-hit searches using the Basic Local Alignment Search Tool (BLAST), the word vector-based models still need further improvement in predicting the host of influenza A viruses.
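As a rough illustration of the first step in a word-vector pipeline, the sketch below splits a DNA sequence into three-letter words. Whether the authors used overlapping or non-overlapping words is not stated in the abstract, so the overlapping window here is an assumption, as are the function names.

```python
from collections import Counter

def kmer_words(sequence, k=3):
    """Split a sequence into overlapping words of length k (assumed scheme)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def word_counts(sequence, k=3):
    """Count how often each word occurs in the sequence."""
    return Counter(kmer_words(sequence, k))
```

The resulting word lists play the role of sentences for an embedding model, and the learned word vectors are then combined into per-sequence features for the SVM.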
Analysis of Particle Content of Recombinant Adeno-Associated Virus Serotype 8 Vectors by Ion-Exchange Chromatography

PubMed Central

Lock, Martin; Alvira, Mauricio R.

2012-01-01

Advances in adeno-associated virus (AAV)-mediated gene therapy have brought the possibility of commercial manufacturing of AAV vectors one step closer. To realize this prospect, a parallel effort with the goal of ever-increasing sophistication for AAV vector production technology and supporting assays will be required. Among the important release assays for a clinical gene therapy product, those monitoring potentially hazardous contaminants are most critical for patient safety. A prominent contaminant in many AAV vector preparations is vector particles lacking a genome, which can substantially increase the dose of AAV capsid proteins and lead to possible unwanted immunological consequences. Current methods to determine empty particle content suffer from inconsistency, are adversely affected by contaminants, or are not applicable to all serotypes. Here we describe the development of an ion-exchange chromatography-based assay that permits the rapid separation and relative quantification of AAV8 empty and full vector particles through the application of shallow gradients and a strong anion-exchange monolith chromatography medium. PMID:22428980
Comparison of SVM, RBF-NN and DT for crop and weed identification based on spectral measurement over corn fields

USDA-ARS's Scientific Manuscript database

It is important to find an appropriate pattern-recognition method for in-field plant identification based on spectral measurement in order to classify the crop and weeds accurately. In this study, the method of Support Vector Machine (SVM) was evaluated and compared with two other methods, Decision ...
TWSVR: Regression via Twin Support Vector Machine.

PubMed

Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

2016-02-01

Taking motivation from the Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR), where the regressor is obtained by solving a pair of quadratic programming problems (QPPs). In this paper we argue that the TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression (SVR) on various regression datasets.
Multiscale asymmetric orthogonal wavelet kernel for linear programming support vector learning and nonlinear dynamic systems identification.

PubMed

Lu, Zhao; Sun, Jing; Butts, Kenneth

2014-05-01

Support vector regression for approximating nonlinear dynamic systems is more delicate than the approximation of indicator functions in support vector classification, particularly for systems that involve multitudes of time scales in their sampled data. The kernel used for support vector learning determines the class of functions from which a support vector machine can draw its solution, and the choice of kernel significantly influences the performance of a support vector machine. In this paper, to bridge the gap between wavelet multiresolution analysis and kernel learning, the closed-form orthogonal wavelet is exploited to construct new multiscale asymmetric orthogonal wavelet kernels for linear programming support vector learning. The closed-form multiscale orthogonal wavelet kernel provides a systematic framework to implement multiscale kernel learning via dyadic dilations and also enables us to represent complex nonlinear dynamics effectively. To demonstrate the superiority of the proposed multiscale wavelet kernel in identifying complex nonlinear dynamic systems, two case studies are presented that aim at building parallel models on benchmark datasets. The development of parallel models that address the long-term/mid-term prediction issue is more intricate and challenging than the identification of series-parallel models where only one-step ahead prediction is required. Simulation results illustrate the effectiveness of the proposed multiscale kernel learning.
Anticipatory Monitoring and Control of Complex Systems using a Fuzzy based Fusion of Support Vector Regressors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miltiadis Alamaniotis; Vivek Agarwal

This paper places itself in the realm of anticipatory systems and envisions monitoring and control methods being capable of making predictions over system critical parameters. Anticipatory systems allow intelligent control of complex systems by predicting their future state. In the current work, an intelligent model aimed at implementing anticipatory monitoring and control in energy industry is presented and tested. More particularly, a set of support vector regressors (SVRs) are trained using both historical and observed data. The trained SVRs are used to predict the future value of the system based on current operational system parameter. The predicted values are then inputted to a fuzzy logic based module where the values are fused to obtain a single value, i.e., final system output prediction. The methodology is tested on real turbine degradation datasets. The outcome of the approach presented in this paper highlights the superiority over single support vector regressors. In addition, it is shown that appropriate selection of fuzzy sets and fuzzy rules plays an important role in improving system performance.
Application of support vector machines for copper potential mapping in Kerman region, Iran

NASA Astrophysics Data System (ADS)

Shabankareh, Mahdi; Hezarkhani, Ardeshir

2017-04-01

The first step in systematic exploration studies is mineral potential mapping, which involves classifying the study area into favorable and unfavorable parts. Support vector machines (SVM) are designed for supervised classification based on statistical learning theory. This method is named support vector classification (SVC). This paper describes an SVC model that combines regional-scale exploration data for copper potential mapping in the Kerman copper-bearing belt in the south of Iran. The data layers, or evidential maps, comprised six datasets, namely lithology, tectonic, airborne geophysics, ferric alteration, hydroxide alteration and geochemistry. The SVC modeling result selected 2220 pixels as favorable zones, approximately 25 percent of the study area. Besides, 66 out of 86 copper indices, approximately 78.6% of all, were located in favorable zones. Another main goal of this study was to determine how each input affects the favorable output. For this purpose, the histogram of each normalized input dataset against its favorable output was drawn. The histograms of each input dataset for the favorable output showed that each information layer had a certain pattern. These patterns in the SVC results could be considered regional copper exploration characteristics.
Improving protein-protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model.

PubMed

An, Ji-Yong; Meng, Fan-Rong; You, Zhu-Hong; Chen, Xing; Yan, Gui-Ying; Hu, Ji-Pu

2016-10-01

Predicting protein-protein interactions (PPIs) is a challenging task and essential for constructing protein interaction networks, which are important for facilitating our understanding of the mechanisms of biological systems. Although a number of high-throughput technologies have been proposed to predict PPIs, they have unavoidable shortcomings, including high cost, time intensity, and inherently high false positive rates. For these reasons, many computational methods have been proposed for predicting PPIs. However, the problem is still far from being solved. In this article, we propose a novel computational method called RVM-BiGP that combines the relevance vector machine (RVM) model and Bi-gram Probabilities (BiGP) for PPI detection from protein sequences. The major improvements include: (1) protein sequences are represented using the Bi-gram Probabilities (BiGP) feature representation on a Position Specific Scoring Matrix (PSSM), in which the protein evolutionary information is contained; (2) to reduce the influence of noise, the Principal Component Analysis (PCA) method is used to reduce the dimension of the BiGP vector; (3) the powerful and robust Relevance Vector Machine (RVM) algorithm is used for classification. Five-fold cross-validation experiments executed on yeast and Helicobacter pylori datasets achieved very high accuracies of 94.57 and 90.57%, respectively. The experimental results are significantly better than those of previous methods. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-BiGP method is significantly better than the SVM-based method. In addition, we achieved 97.15% accuracy on the imbalanced yeast dataset, which is higher than that on the balanced yeast dataset.
The promising experimental results show the efficiency and robustness of the proposed method, which can be an automatic decision support tool for future proteomics research. To facilitate extensive studies in future proteomics research, we developed a freely available web server called RVM-BiGP-PPIs in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including source code and the datasets, is available at http://219.219.62.123:8888/BiGP/. © 2016 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
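As an illustration of the feature step in item (1), the sketch below computes bi-gram features from a PSSM using one common definition, B[m][n] = sum over i of P[i][m] * P[i+1][n]. This definition, the function name `pssm_bigram`, and the toy two-column matrix are assumptions for illustration; the abstract does not give the authors' exact formula, and a real PSSM has 20 columns (giving 400 features).

```python
def pssm_bigram(pssm):
    """Flattened bi-gram features from a PSSM (list of rows, one per residue).

    Assumed definition: B[m][n] = sum_i pssm[i][m] * pssm[i + 1][n].
    """
    n_cols = len(pssm[0])
    feats = [[0.0] * n_cols for _ in range(n_cols)]
    # Accumulate products of scores at consecutive sequence positions.
    for i in range(len(pssm) - 1):
        for m in range(n_cols):
            for n in range(n_cols):
                feats[m][n] += pssm[i][m] * pssm[i + 1][n]
    # Flatten row by row into a fixed-length vector of n_cols * n_cols values.
    return [value for row in feats for value in row]


# Toy PSSM with 3 residues and 2 columns (hypothetical data).
toy_pssm = [[1, 0], [0, 1], [1, 0]]
print(pssm_bigram(toy_pssm))
```

The resulting vector has a fixed length regardless of sequence length, which is what allows a dimension-reduction step such as PCA and a classifier to be applied afterwards.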
A Code Generation Approach for Auto-Vectorization in the Spade Compiler

NASA Astrophysics Data System (ADS)

Wang, Huayong; Andrade, Henrique; Gedik, Buğra; Wu, Kun-Lung

We describe an auto-vectorization approach for the Spade stream processing programming language, comprising two ideas. First, we provide support for vectors as a primitive data type. Second, we provide a C++ library with architecture-specific implementations of a large number of pre-vectorized operations as the means to support language extensions. We evaluate our approach with several stream processing operators, contrasting Spade's auto-vectorization with the native auto-vectorization provided by the GNU gcc and Intel icc compilers.
Aeromagnetic gradient compensation method for helicopter based on ɛ-support vector regression algorithm

NASA Astrophysics Data System (ADS)

Wu, Peilin; Zhang, Qunying; Fei, Chunjiao; Fang, Guangyou

2017-04-01

Aeromagnetic gradients are typically measured by optically pumped magnetometers mounted on an aircraft. Any aircraft, particularly helicopters, produces significant levels of magnetic interference. Therefore, aeromagnetic compensation is essential, and least squares (LS) is the conventional method used for reducing interference levels. However, the LS approach to solving the aeromagnetic interference model has a few difficulties, one of which is in handling multicollinearity. Therefore, we propose an aeromagnetic gradient compensation method, specifically targeted for helicopter use but applicable on any airborne platform, which is based on the ɛ-support vector regression algorithm. The structural risk minimization criterion intrinsic to the method avoids multicollinearity altogether. Local aeromagnetic anomalies can be retained, and platform-generated fields are suppressed simultaneously by constructing an appropriate loss function and kernel function. The method was tested using an unmanned helicopter and obtained improvement ratios of 12.7 and 3.5 in the vertical and horizontal gradient data, respectively. Both of these values are probably better than those that would have been obtained from the conventional method applied to the same data, had it been possible to do so in a suitable comparative context. The validity of the proposed method is demonstrated by the experimental result.
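For readers unfamiliar with ɛ-support vector regression, the sketch below shows the standard ɛ-insensitive loss, under which residuals smaller than ɛ incur no penalty. It is a generic textbook form and not the specific loss or kernel constructed by the authors; the function name and the sample values are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss: max(0, |y - f(x)| - epsilon)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = 0.0
    for target, prediction in zip(y_true, y_pred):
        # Residuals inside the epsilon tube contribute zero loss.
        total += max(0.0, abs(target - prediction) - epsilon)
    return total / len(y_true)


# Hypothetical values: the first residual (0.05) falls inside the tube.
print(epsilon_insensitive_loss([1.0, 2.0], [1.05, 2.5], epsilon=0.1))
```

Because small residuals are ignored, the fitted function is shaped by samples on or outside the tube, which is one reason the approach behaves differently from least squares.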
Distance Metric Learning via Iterated Support Vector Machines.

PubMed

Zuo, Wangmeng; Wang, Faqiang; Zhang, David; Lin, Liang; Huang, Yuchi; Meng, Deyu; Zhang, Lei

2017-07-11

Distance metric learning aims to learn from the given training data a valid distance metric, with which the similarity between data samples can be more effectively evaluated for classification. Metric learning is often formulated as a convex or nonconvex optimization problem, while most existing methods are based on customized optimizers and become inefficient for large scale problems. In this paper, we formulate metric learning as a kernel classification problem with the positive semi-definite constraint, and solve it by iterated training of support vector machines (SVMs). The new formulation is easy to implement and efficient in training with the off-the-shelf SVM solvers. Two novel metric learning models, namely Positive-semidefinite Constrained Metric Learning (PCML) and Nonnegative-coefficient Constrained Metric Learning (NCML), are developed. Both PCML and NCML can guarantee the global optimality of their solutions. Experiments are conducted on general classification, face verification and person re-identification to evaluate our methods. Compared with the state-of-the-art approaches, our methods can achieve comparable classification accuracy and are efficient in training.
Stable Isotope Ratio and Elemental Profile Combined with Support Vector Machine for Provenance Discrimination of Oolong Tea (Wuyi-Rock Tea)

PubMed Central

Lou, Yun-xiao; Fu, Xian-shu; Yu, Xiao-ping; Zhang, Ya-fen

2017-01-01

This paper focused on an effective method to discriminate the geographical origin of Wuyi-Rock tea by the stable isotope ratio (SIR) and metallic element profiling (MEP) combined with support vector machine (SVM) analysis. Wuyi-Rock tea (n = 99) collected from nine producing areas and non-Wuyi-Rock tea (n = 33) from eleven nonproducing areas were analysed for SIR and MEP by established methods. The SVM model based on coupled data produced the best prediction accuracy (0.9773). This prediction shows that instrumental methods combined with a classification model can provide an effective and stable tool for provenance discrimination. Moreover, every feature variable in stable isotope and metallic element data was ranked by its contribution to the model. The results show that δ2H, δ18O, Cs, Cu, Ca, and Rb contents are significant indications for provenance discrimination and not all of the metallic elements improve the prediction accuracy of the SVM model. PMID:28473941
Rapid prediction of chemical metabolism by human UDP-glucuronosyltransferase isoforms using quantum chemical descriptors derived with the electronegativity equalization method.

PubMed

Sorich, Michael J; McKinnon, Ross A; Miners, John O; Winkler, David A; Smith, Paul A

2004-10-07

This study aimed to evaluate in silico models based on quantum chemical (QC) descriptors derived using the electronegativity equalization method (EEM) and to assess the use of QC properties to predict chemical metabolism by human UDP-glucuronosyltransferase (UGT) isoforms. Various EEM-derived QC molecular descriptors were calculated for known UGT substrates and nonsubstrates. Classification models were developed using support vector machine and partial least squares discriminant analysis. In general, the most predictive models were generated with the support vector machine. Combining QC and 2D descriptors (from previous work) using a consensus approach resulted in a statistically significant improvement in predictivity (to 84%) over both the QC and 2D models and the other methods of combining the descriptors. EEM-derived QC descriptors were shown to be both highly predictive and computationally efficient. It is likely that EEM-derived QC properties will be generally useful for predicting ADMET and physicochemical properties during drug discovery.
Speech sound classification and detection of articulation disorders with support vector machines and wavelets.

PubMed

Georgoulas, George; Georgopoulos, Voula C; Stylios, Chrysostomos D

2006-01-01

This paper proposes a novel integrated methodology to extract features and classify speech sounds with intent to detect the possible existence of a speech articulation disorder in a speaker. Articulation, in effect, is the specific and characteristic way that an individual produces the speech sounds. A methodology to process the speech signal, extract features and finally classify the signal and detect articulation problems in a speaker is presented. The use of support vector machines (SVMs), for the classification of speech sounds and detection of articulation disorders is introduced. The proposed method is implemented on a data set where different sets of features and different schemes of SVMs are tested leading to satisfactory performance.
Prediction of mutagenic toxicity by combination of Recursive Partitioning and Support Vector Machines.

PubMed

Liao, Quan; Yao, Jianhua; Yuan, Shengang

2007-05-01

The study of prediction of toxicity is very important and necessary because measurement of toxicity is typically time-consuming and expensive. In this paper, Recursive Partitioning (RP) method was used to select descriptors. RP and Support Vector Machines (SVM) were used to construct structure-toxicity relationship models, RP model and SVM model, respectively. The performances of the two models are different. The prediction accuracies of the RP model are 80.2% for mutagenic compounds in MDL's toxicity database, 83.4% for compounds in CMC and 84.9% for agrochemicals in in-house database respectively. Those of SVM model are 81.4%, 87.0% and 87.3% respectively.
StruLocPred: structure-based protein subcellular localisation prediction using multi-class support vector machine.

PubMed

Zhou, Wengang; Dickerson, Julie A

2012-01-01

Knowledge of protein subcellular locations can help decipher a protein's biological function. This work proposes three new features: one sequence-based, Hybrid Amino Acid Pair (HAAP), and two structure-based, Secondary Structural Element Composition (SSEC) and solvent accessibility state frequency. A multi-class Support Vector Machine is developed to predict the locations. Testing on two established data sets yields better prediction accuracies than the best available systems. Comparisons with existing methods show comparable results to ESLPred2. When StruLocPred is applied to the entire Arabidopsis proteome, over 77% of proteins with known locations match the prediction results. An implementation of this system is at http://wgzhou.ece.iastate.edu/StruLocPred/.
Neighboring block based disparity vector derivation for multiview compatible 3D-AVC

NASA Astrophysics Data System (ADS)

Kang, Jewon; Chen, Ying; Zhang, Li; Zhao, Xin; Karczewicz, Marta

2013-09-01

3D-AVC, being developed under the Joint Collaborative Team on 3D Video Coding (JCT-3V), significantly outperforms Multiview Video Coding plus Depth (MVC+D), which simultaneously encodes texture views and depth views with the multiview extension of H.264/AVC (MVC). However, when 3D-AVC is configured to support multiview compatibility, in which texture views are decoded without depth information, the coding performance becomes significantly degraded. The reason is that advanced coding tools incorporated into 3D-AVC do not perform well due to the lack of a disparity vector converted from the depth information. In this paper, we propose a disparity vector derivation method utilizing only the information of texture views. Motion information of neighboring blocks is used to determine a disparity vector for a macroblock, so that the derived disparity vector is efficiently used for the coding tools in 3D-AVC. The proposed method significantly improves the coding gain of 3D-AVC in the multiview compatible mode, with about 20% BD-rate saving in the coded views and 26% BD-rate saving in the synthesized views on average.
A new local-global approach for classification.

PubMed

Peres, R T; Pedreira, C E

2010-09-01

In this paper, we propose a new local-global pattern classification scheme that combines supervised and unsupervised approaches, taking advantage of both local and global environments. By global methods we mean those that aim to construct a model for the whole problem space using the totality of the available observations. Local methods focus on subregions of the space, possibly using an appropriately selected subset of the sample. In the proposed method, the sample is first divided into local cells by an unsupervised Vector Quantization algorithm, the LBG (Linde-Buzo-Gray). In a second stage, the generated assemblage of much easier problems is locally solved with a scheme inspired by Bayes' rule. Four classification methods were implemented for comparison with the proposed scheme: Learning Vector Quantization (LVQ), Feedforward Neural Networks, Support Vector Machine (SVM) and k-Nearest Neighbors. These four methods and the proposed scheme were applied to eleven datasets: two controlled experiments plus nine publicly available datasets from the UCI repository. The proposed method has shown quite competitive performance when compared to these classical and widely used classifiers. Our method is simple to understand and implement and is based on very intuitive concepts. Copyright 2010 Elsevier Ltd. All rights reserved.

Working set selection using functional gain for LS-SVM.

PubMed

Bo, Liefeng; Jiao, Licheng; Wang, Ling

2007-09-01

The efficiency of sequential minimal optimization (SMO) depends strongly on the working set selection. This letter shows how the improvement of SMO in each iteration, named the functional gain (FG), is used to select the working set for least squares support vector machine (LS-SVM). We prove the convergence of the proposed method and give some theoretical support for its performance. Empirical comparisons demonstrate that our method is superior to the maximum violating pair (MVP) working set selection.
Process service quality evaluation based on Dempster-Shafer theory and support vector machine.

PubMed

Pei, Feng-Que; Li, Dong-Bo; Tong, Yi-Fei; He, Fei

2017-01-01

Human involvement influences traditional service quality evaluations, which triggers an evaluation's low accuracy, poor reliability and less impressive predictability. This paper proposes a method by employing a support vector machine (SVM) and Dempster-Shafer evidence theory to evaluate the service quality of a production process by handling a high number of input features with a low sampling data set, which is called SVMs-DS. Features that can affect production quality are extracted by a large number of sensors. Preprocessing steps such as feature simplification and normalization are reduced. Based on three individual SVM models, the basic probability assignments (BPAs) are constructed, which can help the evaluation in a qualitative and quantitative way. The process service quality evaluation results are validated by the Dempster rules; the decision threshold to resolve conflicting results is generated from three SVM models. A case study is presented to demonstrate the effectiveness of the SVMs-DS method.
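The fusion step can be illustrated with Dempster's rule of combination, which merges two basic probability assignments (BPAs) and renormalizes by the non-conflicting mass. The sketch below is a generic implementation of that rule, not the authors' SVMs-DS pipeline; the two-element frame {"good", "bad"} and the mass values are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs (dict mapping frozenset -> mass) with Dempster's rule.

    Returns the fused BPA and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {key: value / norm for key, value in combined.items()}, conflict


# Hypothetical BPAs, e.g. derived from two classifiers' outputs.
GOOD, BAD = frozenset({"good"}), frozenset({"bad"})
EITHER = GOOD | BAD
bpa_1 = {GOOD: 0.6, BAD: 0.1, EITHER: 0.3}
bpa_2 = {GOOD: 0.7, BAD: 0.2, EITHER: 0.1}
fused, k = dempster_combine(bpa_1, bpa_2)
print(fused, k)
```

A third BPA can be folded in by calling `dempster_combine(fused, bpa_3)`, since the rule is associative. A large conflict value indicates that the sources disagree strongly, which is the situation a decision threshold is meant to resolve.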
Real Time Monitoring System of Pollution Waste on Musi River Using Support Vector Machine (SVM) Method

NASA Astrophysics Data System (ADS)

Fachrurrozi, Muhammad; Saparudin; Erwin

2017-04-01

The real-time monitoring and early detection system, which measures the quality standard of waste in the Musi River, Palembang, Indonesia, is a system for determining air and water pollution levels. It was designed to create an integrated monitoring system and provide real-time information that can be read. It is designed to measure the acidity and turbidity of water polluted by industrial waste, as well as to show and provide conditional data integrated in one system. The system consists of inputting and processing the data, and giving output based on the processed data. Turbidity, substance, and pH sensors are used as detectors that produce an analog direct current (DC) voltage. The early detection system works by determining the ammonia threshold value, acidity, and turbidity level of water in the Musi River. The results are then presented by pollution level group using the Support Vector Machine classification method.
Privacy preserving RBF kernel support vector machine.

PubMed

Li, Haoran; Xiong, Li; Ohno-Machado, Lucila; Jiang, Xiaoqian

2014-01-01

Data sharing is challenging but important for healthcare research. Methods for privacy-preserving data dissemination based on the rigorous differential privacy standard have been developed but they did not consider the characteristics of biomedical data and make full use of the available information. This often results in too much noise in the final outputs. We hypothesized that this situation can be alleviated by leveraging a small portion of open-consented data to improve utility without sacrificing privacy. We developed a hybrid privacy-preserving differentially private support vector machine (SVM) model that uses public data and private data together. Our model leverages the RBF kernel and can handle nonlinearly separable cases. Experiments showed that this approach outperforms two baselines: (1) SVMs that only use public data, and (2) differentially private SVMs that are built from private data. Our method demonstrated very close performance metrics compared to nonprivate SVMs trained on the private data.
A support vector machine based control application to the experimental three-tank system.

PubMed

Iplikci, Serdar

2010-07-01

This paper presents a support vector machine (SVM) approach to generalized predictive control (GPC) of multiple-input multiple-output (MIMO) nonlinear systems. Their higher generalization potential, together with their avoidance of getting stuck in local minima, motivated us to employ SVM algorithms for modeling MIMO systems. Based on the SVM model, detailed and compact formulations for calculating predictions and gradient information, which are used in the computation of the optimal control action, are given in the paper. The proposed MIMO SVM-based GPC method has been verified on an experimental three-tank liquid level control system. Experimental results have shown that the proposed method can handle the control task successfully for different reference trajectories. Moreover, a detailed discussion of data gathering, model selection and the effects of the control parameters is given in this paper. 2010 ISA. Published by Elsevier Ltd. All rights reserved.
Privacy Preserving RBF Kernel Support Vector Machine

PubMed Central

Xiong, Li; Ohno-Machado, Lucila

2014-01-01

Data sharing is challenging but important for healthcare research. Methods for privacy-preserving data dissemination based on the rigorous differential privacy standard have been developed but they did not consider the characteristics of biomedical data and make full use of the available information. This often results in too much noise in the final outputs. We hypothesized that this situation can be alleviated by leveraging a small portion of open-consented data to improve utility without sacrificing privacy. We developed a hybrid privacy-preserving differentially private support vector machine (SVM) model that uses public data and private data together. Our model leverages the RBF kernel and can handle nonlinearly separable cases. Experiments showed that this approach outperforms two baselines: (1) SVMs that only use public data, and (2) differentially private SVMs that are built from private data. Our method demonstrated very close performance metrics compared to nonprivate SVMs trained on the private data. PMID:25013805
Support vector machines-based fault diagnosis for turbo-pump rotor

NASA Astrophysics Data System (ADS)

Yuan, Sheng-Fa; Chu, Fu-Lei

2006-05-01

Most artificial intelligence methods used in fault diagnosis are based on the empirical risk minimisation principle and generalise poorly when fault samples are few. The support vector machine (SVM) is a new general machine-learning tool based on the structural risk minimisation principle that exhibits good generalisation even when fault samples are few. Fault diagnosis based on SVM is discussed. Since the basic SVM was originally designed for two-class classification, while most fault diagnosis problems are multi-class, a new multi-class SVM classification algorithm named 'one to others' is presented to solve multi-class recognition problems. It is a binary tree classifier composed of several two-class classifiers organised by fault priority; it is simple, requires little repeated training, and speeds up both training and recognition. The effectiveness of the method is verified by application to fault diagnosis of a turbo-pump rotor.
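The binary tree structure described here can be sketched as a chain of two-class decisions: each node separates one fault, in priority order, from all remaining classes, so K classes need K - 1 classifiers. In the sketch below the fault labels and the threshold functions are hypothetical stand-ins for trained two-class SVMs; it illustrates the control flow only, not the authors' implementation.

```python
def one_to_others_predict(x, nodes, last_label):
    """Walk the priority-ordered chain of two-class decisions.

    nodes: list of (label, decision_function) in descending fault priority;
           decision_function(x) returns True if x belongs to that label.
    last_label: class assigned when every node answers "others".
    """
    for label, is_this_fault in nodes:
        if is_this_fault(x):
            return label
    return last_label


# Hypothetical stand-ins for trained two-class SVM decision functions.
nodes = [
    ("fault_A", lambda x: x[0] > 0.8),  # highest-priority fault vs. the rest
    ("fault_B", lambda x: x[1] > 0.5),  # next fault vs. the remaining classes
]

for sample in ([0.9, 0.9], [0.1, 0.7], [0.1, 0.1]):
    print(sample, one_to_others_predict(sample, nodes, "fault_C"))
```

Each lower node only has to be trained on the classes not already split off above it, which is consistent with the abstract's remark about little repeated training.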
Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation.

PubMed

Mikhchi, Abbas; Honarvar, Mahmood; Kashan, Nasser Emam Jomeh; Aminafshar, Mehdi

2016-06-21

Genotype imputation is an important tool for prediction of unknown genotypes for both unrelated individuals and parent-offspring trios. Several imputation methods are available and can either employ universal machine learning methods or deploy algorithms dedicated to inferring missing genotypes. In this research, the performance of eight machine learning methods (Support Vector Machine, K-Nearest Neighbors, Extreme Learning Machine, Radial Basis Function, Random Forest, AdaBoost, LogitBoost, and TotalBoost) was compared in terms of imputation accuracy, computation time and the factors affecting imputation accuracy. The methods were employed on real and simulated datasets to impute the un-typed SNPs in parent-offspring trios. The tested methods show that imputation of parent-offspring trios can be accurate. The Random Forest and Support Vector Machine were more accurate than the other machine learning methods. The TotalBoost performed slightly worse than the other methods. The running times differed between methods. The ELM was always the fastest algorithm. As the sample size increases, the RBF requires a long imputation time. The methods tested in this research can be an alternative for imputation of un-typed SNPs when the missing rate of the data is low. However, it is recommended that other machine learning methods be used for imputation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Support vector inductive logic programming outperforms the naive Bayes classifier and inductive logic programming for the classification of bioactive chemical compounds.

PubMed

Cannon, Edward O; Amini, Ata; Bender, Andreas; Sternberg, Michael J E; Muggleton, Stephen H; Glen, Robert C; Mitchell, John B O

2007-05-01

We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures. The performance of the methods is evaluated via a number of statistical measures, namely recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is for seven out of the 11 classes the superior method. The results show that the Bayes Classifier gives the best recall performance for eight of the 11 targets, but has a much lower precision, specificity and F-measure. The SVILP model on the other hand has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of the SVILP superiority, we employ McNemar's test which shows that SVILP performs significantly (p < 5%) better than both other methods for six out of 11 activity classes, while being superior with less significance for three of the remaining classes. While previously the Bayes Classifier was shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.
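Several of the statistical measures named above follow directly from a binary confusion matrix. The sketch below computes recall, specificity, precision, F-measure (the balanced F1 form) and the Matthews Correlation Coefficient using their standard definitions; the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (0.0 when undefined)."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # Matthews Correlation Coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical counts for one activity class.
print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

A classifier with high recall but low precision, as reported for the Bayes classifier, would show a comparatively low F-measure in this calculation, because the harmonic mean is pulled toward the smaller of the two values.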
Automatic Recognition of Fetal Facial Standard Plane in Ultrasound Image via Fisher Vector.

PubMed

Lei, Baiying; Tan, Ee-Leng; Chen, Siping; Zhuo, Liu; Li, Shengli; Ni, Dong; Wang, Tianfu

2015-01-01

Acquisition of the standard plane is the prerequisite of biometric measurement and diagnosis during the ultrasound (US) examination. In this paper, a new algorithm is developed for the automatic recognition of the fetal facial standard planes (FFSPs) such as the axial, coronal, and sagittal planes. Specifically, densely sampled root scale invariant feature transform (RootSIFT) features are extracted and then encoded by Fisher vector (FV). The Fisher network with multi-layer design is also developed to extract spatial information to boost the classification performance. Finally, automatic recognition of the FFSPs is implemented by support vector machine (SVM) classifier based on the stochastic dual coordinate ascent (SDCA) algorithm. Experimental results using our dataset demonstrate that the proposed method achieves an accuracy of 93.27% and a mean average precision (mAP) of 99.19% in recognizing different FFSPs. Furthermore, the comparative analyses reveal the superiority of the proposed method based on FV over the traditional methods.
Thrust vector control algorithm design for the Cassini spacecraft

NASA Technical Reports Server (NTRS)

Enright, Paul J.

1993-01-01

This paper describes a preliminary design of the thrust vector control algorithm for the interplanetary spacecraft, Cassini. Topics of discussion include flight software architecture, modeling of sensors, actuators, and vehicle dynamics, and controller design and analysis via classical methods. Special attention is paid to potential interactions with structural flexibilities and propellant dynamics. Controller performance is evaluated in a simulation environment built around a multi-body dynamics model, which contains nonlinear models of the relevant hardware and preliminary versions of supporting attitude determination and control functions.
Signal detection using support vector machines in the presence of ultrasonic speckle

NASA Astrophysics Data System (ADS)

Kotropoulos, Constantine L.; Pitas, Ioannis

2002-04-01

Support Vector Machines are a general algorithm based on guaranteed risk bounds of statistical learning theory. They have found numerous applications, such as in classification of brain PET images, optical character recognition, object detection, face verification, text categorization and so on. In this paper we propose the use of support vector machines to segment lesions in ultrasound images and we assess thoroughly their lesion detection ability. We demonstrate that trained support vector machines with a Radial Basis Function kernel segment satisfactorily (unseen) ultrasound B-mode images as well as clinical ultrasonic images.
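The Radial Basis Function kernel mentioned above has the standard form K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates it for two feature vectors; the function name and the `gamma` value are illustrative assumptions, and the abstract does not state which kernel width the authors used.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors give similarity 1; similarity decays with distance.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5))
```

Larger values of `gamma` make the kernel more local, so each training sample influences a smaller neighborhood of the feature space.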
An image segmentation method for apple sorting and grading using support vector machine and Otsu's method

USDA-ARS's Scientific Manuscript database

Segmentation is the first step in image analysis to subdivide an image into meaningful regions. The segmentation result directly affects the subsequent image analysis. The objective of the research was to develop an automatic adjustable algorithm for segmentation of color images, using linear suppor...
Aeroelastic analysis of a troposkien-type wind turbine blade

NASA Technical Reports Server (NTRS)

Nitzsche, F.

1981-01-01

The linear aeroelastic equations for one curved blade of a vertical axis wind turbine in state vector form are presented. The method is based on a simple integrating matrix scheme together with the transfer matrix idea. The method is proposed as a convenient way of solving the associated eigenvalue problem for general support conditions.
Gender classification from face images by using local binary pattern and gray-level co-occurrence matrix

NASA Astrophysics Data System (ADS)

Uzbaş, Betül; Arslan, Ahmet

2018-04-01

Gender classification is an important step in human-computer interaction and identification. The human face image is one of the important sources for determining gender. In the present study, gender classification is performed automatically from facial images. To classify gender, we propose a combination of features extracted from the face, eye and lip regions by using a hybrid method of Local Binary Pattern and Gray-Level Co-Occurrence Matrix. The features have been extracted from automatically obtained face, eye and lip regions. All of the extracted features have been combined and given as input parameters to classification methods (Support Vector Machine, Artificial Neural Networks, Naive Bayes and k-Nearest Neighbor methods) for gender classification. The Nottingham Scan face database, which consists of the frontal face images of 100 people (50 male and 50 female), is used for this purpose. In the experimental studies, the highest success rate, 98%, was achieved by using Support Vector Machine. The experimental results illustrate the efficacy of our proposed method.
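To make the Local Binary Pattern feature concrete, the sketch below computes the basic 3x3 LBP code for each interior pixel and collects the codes into a 256-bin histogram. The clockwise neighbor ordering starting at the top-left pixel and the "neighbor >= center" comparison are common conventions assumed here; the abstract does not specify the authors' LBP variant, and the toy image is hypothetical.

```python
# Neighbor positions in a 3x3 patch, clockwise from the top-left corner.
NEIGHBOR_OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch (list of three rows of three values)."""
    center = patch[1][1]
    code = 0
    for bit, (row, col) in enumerate(NEIGHBOR_OFFSETS):
        # Set the bit when the neighbor is at least as bright as the center.
        if patch[row][col] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a 2D list."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Toy grayscale image (hypothetical intensities).
toy_image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
hist = lbp_histogram(toy_image)
print([(code, count) for code, count in enumerate(hist) if count])
```

Histograms computed separately for regions such as the face, eyes and lips can then be concatenated into one feature vector for a classifier.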
Identification of DNA-Binding Proteins Using Mixed Feature Representation Methods.

PubMed

Qu, Kaiyang; Han, Ke; Wu, Song; Wang, Guohua; Wei, Leyi

2017-09-22

DNA-binding proteins play vital roles in cellular processes, such as DNA packaging, replication, transcription, regulation, and other DNA-associated activities. The current main prediction method is based on machine learning, and its accuracy mainly depends on the features extraction method. Therefore, using an efficient feature representation method is important to enhance the classification accuracy. However, existing feature representation methods cannot efficiently distinguish DNA-binding proteins from non-DNA-binding proteins. In this paper, a multi-feature representation method, which combines three feature representation methods, namely, K-Skip-N-Grams, Information theory, and Sequential and structural features (SSF), is used to represent the protein sequences and improve feature representation ability. In addition, the classifier is a support vector machine. The mixed-feature representation method is evaluated using 10-fold cross-validation and a test set. Feature vectors, which are obtained from a combination of three feature extractions, show the best performance in 10-fold cross-validation both under non-dimensional reduction and dimensional reduction by max-relevance-max-distance. Moreover, the reduced mixed feature method performs better than the non-reduced mixed feature technique. The feature vectors, which are a combination of SSF and K-Skip-N-Grams, show the best performance in the test set. Among these methods, mixed features exhibit superiority over the single features.
Support vector machines for prediction and analysis of beta and gamma-turns in proteins.

PubMed

Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

2005-04-01

Tight turns have long been recognized as one of the three important features of proteins, together with the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns and most of the rest are gamma-turns. Analysis and prediction of beta-turns and gamma-turns is very useful for the design of new molecules such as drugs, pesticides, and antigens. In this paper we investigated two aspects of applying support vector machine (SVM), a promising machine learning method for bioinformatics, to the prediction and analysis of beta-turns and gamma-turns. First, we developed two SVM-based methods, called BTSVM and GTSVM, which predict beta-turns and gamma-turns in a protein from its sequence. When compared with other methods, BTSVM has superior performance and GTSVM is competitive. Second, we used SVMs with a linear kernel to estimate the support of amino acids for the formation of beta-turns and gamma-turns depending on their position in a protein. Our analysis results are more comprehensive and easier to use than the previous results in designing turns in proteins.
Evaluation Method for Service Branding Using Word-of-Mouth Data

NASA Astrophysics Data System (ADS)

Shirahada, Kunio; Kosaka, Michitaka

The development and spread of internet technology gives service firms a high capability for transmitting brand information and for collecting related customer feedback data. In this paper, we propose a new evaluation method for service branding using firm and consumer data on the internet. Based on the service marketing 7Ps (Product, Price, Place, Promotion, People, Physical evidence, Process), which are the key viewpoints for branding, we develop a brand evaluation system including coding methods for Word-of-Mouth (WoM) and corporate introductory information on the internet to identify both the customer's service value recognition vector and the firm's service value proposition vector. Our system quantitatively clarifies both the customer's service value recognition of the firm and the firm's strength in service value proposition, thereby analyzing service brand communication gaps between firm and consumers. We applied this system to the Japanese ryokan hotel industry. Using data for six ryokan hotels on Jyaran-net and Rakuten travel, we made a total of 983 codes from WoM information and analyzed their service brand value according to three price-based categories. As a result, we found that the characteristics of the customers' service value recognition vector differ according to the price categories. In addition, the system clarified that there is a firm whose service value proposition vector differs from the customers' recognition vector. This helps to analyze corporate service brand strategy and is significant as a system technology supporting service management.
Building a computer program to support children, parents, and distraction during healthcare procedures.

PubMed

Hanrahan, Kirsten; McCarthy, Ann Marie; Kleiber, Charmaine; Ataman, Kaan; Street, W Nick; Zimmerman, M Bridget; Ersig, Anne L

2012-10-01

This secondary data analysis used data mining methods to develop predictive models of child risk for distress during a healthcare procedure. Data used came from a study that predicted factors associated with children's responses to an intravenous catheter insertion while parents provided distraction coaching. From the 255 items used in the primary study, 44 predictive items were identified through automatic feature selection and used to build support vector machine regression models. Models were validated using multiple cross-validation tests and by comparing variables identified as explanatory in the traditional versus support vector machine regression. Rule-based approaches were applied to the model outputs to identify overall risk for distress. A decision tree was then applied to evidence-based instructions for tailoring distraction to characteristics and preferences of the parent and child. The resulting decision support computer application, titled Children, Parents and Distraction, is being used in research. Future use will support practitioners in deciding the level and type of distraction intervention needed by a child undergoing a healthcare procedure.
Wavelet Entropy and Directed Acyclic Graph Support Vector Machine for Detection of Patients with Unilateral Hearing Loss in MRI Scanning

PubMed Central

Wang, Shuihua; Yang, Ming; Du, Sidan; Yang, Jiquan; Liu, Bin; Gorriz, Juan M.; Ramírez, Javier; Yuan, Ti-Fei; Zhang, Yudong

2016-01-01

Highlights: We develop a computer-aided diagnosis system for unilateral hearing loss detection in structural magnetic resonance imaging. Wavelet entropy is introduced to extract global image features from brain images. A directed acyclic graph is employed to give the support vector machine the ability to handle multi-class problems. The developed computer-aided diagnosis system achieves an overall accuracy of 95.1% for this three-class problem of differentiating left-sided and right-sided hearing loss from healthy controls. Aim: Sensorineural hearing loss (SNHL) is correlated with many neurodegenerative diseases. More and more computer-vision-based methods are now being used to detect it automatically. Materials: We have 49 subjects in total, scanned by 3.0T MRI (Siemens Medical Solutions, Erlangen, Germany). The subjects comprise 14 patients with right-sided hearing loss (RHL), 15 patients with left-sided hearing loss (LHL), and 20 healthy controls (HC). Method: We treat this as a three-class classification problem: RHL, LHL, and HC. Wavelet entropy (WE) was extracted from the magnetic resonance images of each subject, and then submitted to a directed acyclic graph support vector machine (DAG-SVM). Results: Ten repetitions of 10-fold cross validation show that a 3-level decomposition yields an overall accuracy of 95.10% for this three-class classification problem, higher than a feedforward neural network, decision tree, and naive Bayesian classifier. Conclusions: This computer-aided diagnosis system is promising. We hope this study can attract more computer vision methods for detecting hearing loss. PMID:27807415
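The directed-acyclic-graph decision described above can be sketched as an elimination procedure over pairwise binary classifiers. In this minimal illustration the trained binary SVMs are replaced by toy nearest-centre rules on a single number; `centers`, `make_pair` and `dag_predict` are hypothetical names, and the real system operates on wavelet-entropy features rather than on a scalar.

```python
def dag_predict(x, classes, pairwise):
    """Walk the decision DAG: each node compares the first and last
    remaining classes and eliminates the loser until one class is left.
    pairwise[(a, b)](x) must return either a or b."""
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise[(a, b)](x)
        if winner == a:
            remaining.pop()       # b lost this comparison
        else:
            remaining.pop(0)      # a lost this comparison
    return remaining[0]


# Toy stand-ins for trained binary SVMs (assumed, for illustration only).
centers = {"RHL": -1.0, "LHL": 1.0, "HC": 0.0}


def make_pair(a, b):
    """Return a rule that picks whichever of a, b has the nearer centre."""
    return lambda x: a if abs(x - centers[a]) <= abs(x - centers[b]) else b


classes = ["RHL", "LHL", "HC"]
pairwise = {(a, b): make_pair(a, b)
            for i, a in enumerate(classes) for b in classes[i + 1:]}
```

With three classes, only two of the three pairwise classifiers are evaluated per sample, which is the usual motivation for the DAG arrangement.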

[Identification of varieties of cashmere by Vis/NIR spectroscopy technology based on PCA-SVM].

PubMed

Wu, Gui-Fang; He, Yong

2009-06-01

A mixed algorithm combining principal component analysis (PCA) and support vector machine (SVM) was presented to discriminate cashmere varieties. Cashmere fiber is threadlike, soft and glossy, with high tensile strength. The quality characteristics and economic value of each breed of cashmere are very different. To safeguard consumers' rights and guarantee the quality of cashmere products, identifying cashmere quickly, efficiently and correctly is of great importance to the production and trade of cashmere material. The present research adopts Vis/NIR diffuse spectroscopy techniques to collect the spectral data of cashmere. The near-infrared fingerprint of cashmere was acquired by PCA, and SVM methods were used to further identify the cashmere material. For the PCA, a score map of PC1, PC2 and PC3 was used, and 10 principal components (PCs) were selected as the input of the SVM based on the reliabilities of the PCs of 99.99%. One hundred cashmere samples were used for calibration and the remaining 75 cashmere samples were used for validation. A one-against-all multi-class SVM model was built, the capabilities of SVMs with different kernel functions were compared, and the result showed that the SVM with the Gaussian kernel function has the best identification capability, with an accuracy of 100%. This research indicated that the data mining method of PCA-SVM has a good identification effect, and can work as a new method for rapid identification of cashmere material varieties.
Wavelet Entropy and Directed Acyclic Graph Support Vector Machine for Detection of Patients with Unilateral Hearing Loss in MRI Scanning.

PubMed

Wang, Shuihua; Yang, Ming; Du, Sidan; Yang, Jiquan; Liu, Bin; Gorriz, Juan M; Ramírez, Javier; Yuan, Ti-Fei; Zhang, Yudong

2016-01-01

Highlights: We develop a computer-aided diagnosis system for unilateral hearing loss detection in structural magnetic resonance imaging. Wavelet entropy is introduced to extract global image features from brain images. A directed acyclic graph is employed to give the support vector machine the ability to handle multi-class problems. The developed computer-aided diagnosis system achieves an overall accuracy of 95.1% for this three-class problem of differentiating left-sided and right-sided hearing loss from healthy controls. Aim: Sensorineural hearing loss (SNHL) is correlated with many neurodegenerative diseases. More and more computer-vision-based methods are now being used to detect it automatically. Materials: We have 49 subjects in total, scanned by 3.0T MRI (Siemens Medical Solutions, Erlangen, Germany). The subjects comprise 14 patients with right-sided hearing loss (RHL), 15 patients with left-sided hearing loss (LHL), and 20 healthy controls (HC). Method: We treat this as a three-class classification problem: RHL, LHL, and HC. Wavelet entropy (WE) was extracted from the magnetic resonance images of each subject, and then submitted to a directed acyclic graph support vector machine (DAG-SVM). Results: Ten repetitions of 10-fold cross validation show that a 3-level decomposition yields an overall accuracy of 95.10% for this three-class classification problem, higher than a feedforward neural network, decision tree, and naive Bayesian classifier. Conclusions: This computer-aided diagnosis system is promising. We hope this study can attract more computer vision methods for detecting hearing loss.
Identification and Severity Determination of Wheat Stripe Rust and Wheat Leaf Rust Based on Hyperspectral Data Acquired Using a Black-Paper-Based Measuring Method.

PubMed

Wang, Hui; Qin, Feng; Ruan, Liu; Wang, Rui; Liu, Qi; Ma, Zhanhong; Li, Xiaolong; Cheng, Pei; Wang, Haiguang

2016-01-01

It is important to implement detection and assessment of plant diseases based on remotely sensed data for disease monitoring and control. Hyperspectral data of healthy leaves, leaves in incubation period and leaves in diseased period of wheat stripe rust and wheat leaf rust were collected under in-field conditions using a black-paper-based measuring method developed in this study. After data preprocessing, the models to identify the diseases were built using distinguished partial least squares (DPLS) and support vector machine (SVM), and the disease severity inversion models of stripe rust and the disease severity inversion models of leaf rust were built using quantitative partial least squares (QPLS) and support vector regression (SVR). All the models were validated by using leave-one-out cross validation and external validation. The diseases could be discriminated using both distinguished partial least squares and support vector machine with the accuracies of more than 99%. For each wheat rust, disease severity levels were accurately retrieved using both the optimal QPLS models and the optimal SVR models with the coefficients of determination (R2) of more than 0.90 and the root mean square errors (RMSE) of less than 0.15. The results demonstrated that identification and severity evaluation of stripe rust and leaf rust at the leaf level could be implemented based on the hyperspectral data acquired using the developed method. A scientific basis was provided for implementing disease monitoring by using aerial and space remote sensing technologies.
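The two regression metrics quoted above, the coefficient of determination (R2) and the root mean square error (RMSE), can be written out as a short sketch. The function names and the toy numbers in the usage note are assumptions for illustration; they are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.
    Assumes y_true is not constant, so SS_tot is non-zero."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, with made-up observed severities `[0, 1, 2]` and predictions `[0, 1, 3]`, the residual sum of squares is 1 and the total sum of squares is 2, so `r_squared` gives 0.5.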
Identification and Severity Determination of Wheat Stripe Rust and Wheat Leaf Rust Based on Hyperspectral Data Acquired Using a Black-Paper-Based Measuring Method

PubMed Central

Ruan, Liu; Wang, Rui; Liu, Qi; Ma, Zhanhong; Li, Xiaolong; Cheng, Pei; Wang, Haiguang

2016-01-01

It is important to implement detection and assessment of plant diseases based on remotely sensed data for disease monitoring and control. Hyperspectral data of healthy leaves, leaves in incubation period and leaves in diseased period of wheat stripe rust and wheat leaf rust were collected under in-field conditions using a black-paper-based measuring method developed in this study. After data preprocessing, the models to identify the diseases were built using distinguished partial least squares (DPLS) and support vector machine (SVM), and the disease severity inversion models of stripe rust and the disease severity inversion models of leaf rust were built using quantitative partial least squares (QPLS) and support vector regression (SVR). All the models were validated by using leave-one-out cross validation and external validation. The diseases could be discriminated using both distinguished partial least squares and support vector machine with the accuracies of more than 99%. For each wheat rust, disease severity levels were accurately retrieved using both the optimal QPLS models and the optimal SVR models with the coefficients of determination (R2) of more than 0.90 and the root mean square errors (RMSE) of less than 0.15. The results demonstrated that identification and severity evaluation of stripe rust and leaf rust at the leaf level could be implemented based on the hyperspectral data acquired using the developed method. A scientific basis was provided for implementing disease monitoring by using aerial and space remote sensing technologies. PMID:27128464
Predicting protein amidation sites by orchestrating amino acid sequence features

NASA Astrophysics Data System (ADS)

Zhao, Shuqiu; Yu, Hua; Gong, Xiujun

2017-08-01

Amidation is the fourth major category of post-translational modifications and plays an important role in physiological and pathological processes. Identifying amidation sites can help us understand amidation and recognize the underlying causes of many kinds of diseases. However, the traditional experimental methods for identifying amidation sites are often time-consuming and expensive. In this study, we propose a computational method for predicting amidation sites by orchestrating amino acid sequence features. Three kinds of feature extraction methods are used to build a feature vector that captures not only the physicochemical properties but also position-related information of the amino acids. An extremely randomized trees algorithm is applied in a supervised fashion to choose the optimal features and remove redundancy and dependence among components of the feature vector. Finally, the support vector machine classifier is used to label the amidation sites. When tested on an independent data set, the proposed method performs better than all the previous ones, with a prediction accuracy of 0.962, a Matthews correlation coefficient of 0.89 and an area under the curve of 0.964.
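For readers unfamiliar with the Matthews correlation coefficient reported above, here is a minimal sketch of how it and plain accuracy are computed from the four confusion-matrix counts. The function names are illustrative, and returning 0.0 for an empty row or column is a common convention adopted here as an assumption.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention (assumed): undefined MCC is reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, the MCC stays near zero for a classifier that merely guesses the majority class, which is why it is often reported alongside accuracy on imbalanced site-prediction data.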
Adaptive h-refinement for reduced-order models

DOE PAGES

Carlberg, Kevin T.

2014-11-05

Our work presents a method to adaptively refine reduced-order models a posteriori without requiring additional full-order-model solves. The technique is analogous to mesh-adaptive h-refinement: it enriches the reduced-basis space online by ‘splitting’ a given basis vector into several vectors with disjoint support. The splitting scheme is defined by a tree structure constructed offline via recursive k-means clustering of the state variables using snapshot data. This method identifies the vectors to split online using a dual-weighted-residual approach that aims to reduce error in an output quantity of interest. The resulting method generates a hierarchy of subspaces online without requiring large-scale operations or full-order-model solves. Furthermore, it enables the reduced-order model to satisfy any prescribed error tolerance regardless of its original fidelity, as a completely refined reduced-order model is mathematically equivalent to the original full-order model. Experiments on a parameterized inviscid Burgers equation highlight the ability of the method to capture phenomena (e.g., moving shocks) not contained in the span of the original reduced basis.
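The 'splitting' operation described above can be sketched in a few lines: one basis vector is replaced by several child vectors, each keeping the parent's entries on one group of state indices and zero elsewhere. The index groups below are hypothetical; in the paper they come from a tree built by recursive k-means clustering, which this sketch does not reproduce.

```python
def split_basis_vector(v, groups):
    """Split v into one child per index group. Children have disjoint
    support and sum back to v when the groups partition the indices."""
    children = []
    for group in groups:
        child = [0.0] * len(v)
        for i in group:
            child[i] = v[i]   # keep the parent's entry on this group only
        children.append(child)
    return children


# Hypothetical example: four state variables split into two groups.
parent = [1.0, 2.0, 3.0, 4.0]
children = split_basis_vector(parent, [[0, 1], [2, 3]])
```

Because the children sum to the parent, the refined space still contains the original basis vector; splitting every vector down to single indices recovers the full-order space, consistent with the equivalence noted in the abstract.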
Prediction and analysis of beta-turns in proteins by support vector machine.

PubMed

Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

2003-01-01

The tight turn has long been recognized as one of the three important features of proteins, after the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns. Analysis and prediction of beta-turns in particular, and tight turns in general, are very useful for the design of new molecules such as drugs, pesticides, and antigens. In this paper, we introduce a support vector machine (SVM) approach to the prediction and analysis of beta-turns. We have investigated two aspects of applying SVM to the prediction and analysis of beta-turns. First, we developed a new SVM method, called BTSVM, which predicts beta-turns of a protein from its sequence. The prediction results on a dataset of 426 non-homologous protein chains, obtained by sevenfold cross-validation, showed that our method is superior to previous methods. Second, we analyzed how amino acid positions support (or prevent) the formation of beta-turns based on the "multivariable" classification model of a linear SVM. This model is more general than those of previous statistical methods. Our analysis results are more comprehensive and easier to use than previously published analysis results.
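The sevenfold cross-validation protocol mentioned above can be sketched as a generator of train/test index splits. The interleaved assignment of samples to folds and the name `k_fold_indices` are assumptions; the abstract does not say how the 426 chains were distributed across folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k-fold cross-validation.
    Sample i is assigned to fold i % k (an assumed, deterministic scheme)."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set is every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Usage: seven splits over 426 items, each item held out exactly once.
splits = list(k_fold_indices(426, 7))
```

Each model is trained on the `train` indices and scored on the `test` indices, and the seven scores are then averaged.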
Support Vector Machines Model of Computed Tomography for Assessing Lymph Node Metastasis in Esophageal Cancer with Neoadjuvant Chemotherapy.

PubMed

Wang, Zhi-Long; Zhou, Zhi-Guo; Chen, Ying; Li, Xiao-Ting; Sun, Ying-Shi

The aim of this study was to diagnose lymph node metastasis of esophageal cancer with a support vector machine model based on computed tomography. A total of 131 esophageal cancer patients with preoperative chemotherapy and radical surgery were included. Various indicators (tumor thickness, tumor length, tumor CT value, total number of lymph nodes, and long axis and short axis sizes of the largest lymph node) on CT images before and after neoadjuvant chemotherapy were recorded. A support vector machine model based on these CT indicators was built to predict lymph node metastasis. The support vector machine model diagnosed lymph node metastasis better than the preoperative short axis size of the largest lymph node on CT. The areas under the receiver operating characteristic curves were 0.887 and 0.705, respectively. The support vector machine model of CT images can help diagnose lymph node metastasis in esophageal cancer with preoperative chemotherapy.
On A Nonlinear Generalization of Sparse Coding and Dictionary Learning.

PubMed

Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba

2013-01-01

Existing dictionary learning algorithms are based on the assumption that the data are vectors in a Euclidean vector space ℝ^d, and the dictionary is learned from the training data using the vector space structure of ℝ^d and its Euclidean L2-metric. However, in many applications, features and data often originate from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis.
On A Nonlinear Generalization of Sparse Coding and Dictionary Learning

PubMed Central

Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba

2013-01-01

Existing dictionary learning algorithms are based on the assumption that the data are vectors in a Euclidean vector space ℝ^d, and the dictionary is learned from the training data using the vector space structure of ℝ^d and its Euclidean L2-metric. However, in many applications, features and data often originate from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis. PMID:24129583
MO-F-CAMPUS-J-02: Automatic Recognition of Patient Treatment Site in Portal Images Using Machine Learning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, X; Yang, D

Purpose: To investigate a method to automatically recognize the treatment site in X-ray portal images. It could be useful to detect potential treatment errors, and to provide guidance to sequential tasks, e.g. automatically verifying the patient daily setup. Methods: The portal images were exported from MOSAIQ as DICOM files, and were 1) processed with a threshold-based intensity transformation algorithm to enhance contrast, and 2) then down-sampled (from 1024×768 to 128×96) by using a bi-cubic interpolation algorithm. An appearance-based vector space model (VSM) was used to rearrange the images into vectors. A principal component analysis (PCA) method was used to reduce the vector dimensions. A multi-class support vector machine (SVM), with radial basis function kernel, was used to build the treatment site recognition models. These models were then used to recognize the treatment sites in the portal image. Portal images of 120 patients were included in the study. The images were selected to cover six treatment sites: brain, head and neck, breast, lung, abdomen and pelvis. Each site had images of twenty patients. Cross-validation experiments were performed to evaluate the performance. Results: MATLAB Image Processing Toolbox and scikit-learn (a machine learning library in Python) were used to implement the proposed method. The average accuracies using the AP and RT images separately were 95% and 94% respectively. The average accuracy using AP and RT images together was 98%. Computation time was ∼0.16 seconds per patient with an AP or RT image, and ∼0.33 seconds per patient with both AP and RT images. Conclusion: The proposed method of treatment site recognition is efficient and accurate. It is not sensitive to differences in image intensity, size and positions of patients in the portal images. It could be useful for patient safety assurance. The work was partially supported by a research grant from Varian Medical System.
VectorBase: an updated bioinformatics resource for invertebrate vectors and other organisms related with human diseases

PubMed Central

Giraldo-Calderón, Gloria I.; Emrich, Scott J.; MacCallum, Robert M.; Maslen, Gareth; Dialynas, Emmanuel; Topalis, Pantelis; Ho, Nicholas; Gesing, Sandra; Madey, Gregory; Collins, Frank H.; Lawson, Daniel

2015-01-01

VectorBase is a National Institute of Allergy and Infectious Diseases supported Bioinformatics Resource Center (BRC) for invertebrate vectors of human pathogens. Now in its 11th year, VectorBase currently hosts the genomes of 35 organisms including a number of non-vectors for comparative analysis. Hosted data range from genome assemblies with annotated gene features, transcript and protein expression data to population genetics including variation and insecticide-resistance phenotypes. Here we describe improvements to our resource and the set of tools available for interrogating and accessing BRC data including the integration of Web Apollo to facilitate community annotation and providing Galaxy to support user-based workflows. VectorBase also actively supports our community through hands-on workshops and online tutorials. All information and data are freely available from our website at https://www.vectorbase.org/. PMID:25510499
Efficient boundary hunting via vector quantization

NASA Astrophysics Data System (ADS)

Diamantini, Claudia; Panti, Maurizio

2001-03-01

A great amount of information about a classification problem is contained in those instances falling near the decision boundary. This intuition dates back to the earliest studies in pattern recognition, and in the more recent adaptive approaches to the so called boundary hunting, such as the work of Aha et alii on Instance Based Learning and the work of Vapnik et alii on Support Vector Machines. The last work is of particular interest, since theoretical and experimental results ensure the accuracy of boundary reconstruction. However, its optimization approach has heavy computational and memory requirements, which limits its application on huge amounts of data. In the paper we describe an alternative approach to boundary hunting based on adaptive labeled quantization architectures. The adaptation is performed by a stochastic gradient algorithm for the minimization of the error probability. Error probability minimization guarantees the accurate approximation of the optimal decision boundary, while the use of a stochastic gradient algorithm defines an efficient method to reach such approximation. In the paper comparisons to Support Vector Machines are considered.
Automated detection of pulmonary nodules in CT images with support vector machines

NASA Astrophysics Data System (ADS)

Liu, Lu; Liu, Wanyu; Sun, Xiaoming

2008-10-01

Many methods have been proposed to help radiologists avoid failing to diagnose small pulmonary nodules. Recently, support vector machines (SVMs) have received increasing attention for pattern recognition. In this paper, we present a computerized system aimed at pulmonary nodule detection; it identifies the lung field, extracts a set of candidate regions with a high sensitivity ratio and then classifies candidates by the use of SVMs. The Computer Aided Diagnosis (CAD) system presented in this paper supports the diagnosis of pulmonary nodules from Computed Tomography (CT) images as inflammation, tuberculoma, granuloma, sclerosing hemangioma, and malignant tumor. Five texture feature sets were extracted for each lesion, while a genetic-algorithm-based feature selection method was applied to identify the most robust features. The selected feature set was fed into an ensemble of SVM classifiers. The achieved classification performance was 100%, 92.75% and 90.23% in the training, validation and testing set, respectively. It is concluded that computerized analysis of medical images in combination with artificial intelligence can be used in clinical practice and may contribute to more efficient diagnosis.
Simple algorithm for improved security in the FDDI protocol

NASA Astrophysics Data System (ADS)

Lundy, G. M.; Jones, Benjamin

1993-02-01

We propose a modification to the Fiber Distributed Data Interface (FDDI) protocol based on a simple algorithm which will improve confidential communication capability. This proposed modification provides a simple and reliable system which exploits some of the inherent security properties in a fiber optic ring network. This method differs from conventional methods in that end to end encryption can be facilitated at the media access control sublayer of the data link layer in the OSI network model. Our method is based on a variation of the bit stream cipher method. The transmitting station takes the intended confidential message and uses a simple modulo two addition operation against an initialization vector. The encrypted message is virtually unbreakable without the initialization vector. None of the stations on the ring will have access to both the encrypted message and the initialization vector except the transmitting and receiving stations. The generation of the initialization vector is unique for each confidential transmission and thus provides a unique approach to the key distribution problem. The FDDI protocol is of particular interest to the military in terms of LAN/MAN implementations. Both the Army and the Navy are considering the standard as the basis for future network systems. A simple and reliable security mechanism with the potential to support realtime communications is a necessary consideration in the implementation of these systems. The proposed method offers several advantages over traditional methods in terms of speed, reliability, and standardization.
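The modulo-two addition at the core of the scheme above is a bitwise XOR of the message with the initialization vector, and applying the same operation again recovers the message. The sketch below illustrates only that step; it assumes the initialization vector is at least as long as the message and freshly generated per transmission, and the names `new_iv` and `xor_bytes` are hypothetical. How the FDDI stations generate and share the vector is not modelled here.

```python
import secrets


def new_iv(length):
    """Generate a fresh random initialization vector (assumed: one per message)."""
    return secrets.token_bytes(length)


def xor_bytes(message: bytes, iv: bytes) -> bytes:
    """Modulo-two addition of message and IV, byte by byte.
    The same call both encrypts and decrypts."""
    if len(iv) < len(message):
        raise ValueError("initialization vector shorter than message")
    return bytes(m ^ k for m, k in zip(message, iv))


# Usage: encrypt, then decrypt with the same vector.
iv = new_iv(16)
ciphertext = xor_bytes(b"confidential", iv)
plaintext = xor_bytes(ciphertext, iv)
```

The secrecy of such a scheme rests entirely on the vector never being reused or exposed, which is why the abstract stresses that only the transmitting and receiving stations see both pieces.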
Analysis of miRNA expression profile based on SVM algorithm

NASA Astrophysics Data System (ADS)

Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian

2018-05-01

Based on a miRNA expression profile data set, a new data mining algorithm, tSVM-KNN (t statistic with support vector machine - k nearest neighbor), is proposed. The idea of the algorithm is as follows: first, feature selection on the data set is carried out by the unified measurement method; second, the SVM-KNN algorithm, which combines support vector machine (SVM) and k-nearest neighbor (KNN), is used as the classifier. Simulation results show that the SVM-KNN algorithm has better classification ability than SVM and KNN alone. In terms of the number of miRNA "tags" and recognition accuracy, the tSVM-KNN algorithm needs only 5 miRNAs to obtain 96.08% classification accuracy. Compared with similar algorithms, the tSVM-KNN algorithm has obvious advantages.
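The "t statistic" in the algorithm's name suggests a feature-ranking step of the following kind, shown here as a minimal sketch. It assumes a two-class problem and the Welch form of the t statistic; the abstract does not specify either, and `t_statistic` and `top_features` are hypothetical names.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t statistic (assumed form). Requires at least two
    values per class and non-zero variance in at least one class."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def top_features(class_a, class_b, k):
    """Rank feature columns by |t| and return the indices of the top k.
    class_a and class_b are lists of samples (rows of feature values)."""
    scores = []
    for j in range(len(class_a[0])):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        scores.append((abs(t_statistic(col_a, col_b)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]
```

Under this reading, the few highest-ranked miRNAs would be the "tags" passed on to the SVM-KNN classifier.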
Application of support vector regression for optimization of vibration flow field of high-density polyethylene melts characterized by small angle light scattering

NASA Astrophysics Data System (ADS)

Xian, Guangming

2018-03-01

In this paper, the vibration flow field parameters of polymer melts in a visual slit die are optimized by using an intelligent algorithm. Experimental small angle light scattering (SALS) patterns are shown to characterize the processing. To capture the scattered light, a polarizer and an analyzer are placed before and after the polymer melts. The results reported in this study are obtained using high-density polyethylene (HDPE) with a rotation speed of 28 rpm. In addition, the support vector regression (SVR) analytical method is introduced for optimizing the parameters of the vibration flow field. This work establishes the general applicability of SVR for predicting the optimal parameters of the vibration flow field.
Earth Observation and Indicators Pertaining to Determinants of Health- An Approach to Support Local Scale Characterization of Environmental Determinants of Vector-Borne Diseases

NASA Astrophysics Data System (ADS)

Kotchi, Serge Olivier; Brazeau, Stephanie; Ludwig, Antoinette; Aube, Guy; Berthiaume, Pilippe

2016-08-01

Environmental determinants (EVDs) were identified as key determinants of health (DoH) for the emergence and re-emergence of several vector-borne diseases. Maintaining ongoing acquisition of data related to EVDs at local scale and for large regions constitutes a significant challenge. Earth observation (EO) satellites offer a framework to overcome this challenge. However, EO image analysis methods commonly used to estimate EVDs are time and resource consuming. Moreover, variations of microclimatic conditions combined with high landscape heterogeneity limit the effectiveness of climatic variables derived from EO. In this study, we present what DoH and EVDs are, the impacts of EVDs on vector-borne diseases in the context of global environmental change, and the need to characterize EVDs of vector-borne diseases at local scale and its challenges. Finally, we propose an approach based on EO images to estimate, at local scale, indicators pertaining to EVDs of vector-borne diseases.
Vector Blood Meals Are an Early Indicator of the Effectiveness of the Ecohealth Approach in Halting Chagas Transmission in Guatemala

PubMed Central

Pellecer, Mariele J.; Dorn, Patricia L.; Bustamante, Dulce M.; Rodas, Antonieta; Monroy, M. Carlota

2013-01-01

A novel method using vector blood meal sources to assess the impact of control efforts on the risk of transmission of Chagas disease was tested in the village of El Tule, Jutiapa, Guatemala. Control used Ecohealth interventions, where villagers ameliorated the factors identified as most important for transmission. First, after an initial insecticide application, house walls were plastered. Later, bedroom floors were improved and domestic animals were moved outdoors. Only vector blood meal sources revealed the success of the first interventions: human blood meals declined from 38% to 3% after insecticide application and wall plastering. Following all interventions both vector blood meal sources and entomological indices revealed the reduction in transmission risk. These results indicate that vector blood meals may reveal effects of control efforts early on, effects that may not be apparent using traditional entomological indices, and provide further support for the Ecohealth approach to Chagas control in Guatemala. PMID:23382165
Developing operation algorithms for vision subsystems in autonomous mobile robots

NASA Astrophysics Data System (ADS)

Shikhman, M. V.; Shidlovskiy, S. V.

2018-05-01

The paper analyzes algorithms for selecting keypoints on the image for the subsequent automatic detection of people and obstacles. The algorithm is based on the histogram of oriented gradients and the support vector method. The combination of these methods allows successful selection of dynamic and static objects. The algorithm can be applied in various autonomous mobile robots.
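The descriptor step named in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so the cell size, the 9 unsigned orientation bins, and the function name `orientation_histogram` are assumptions for illustration only; block normalization and the support vector classifier that would consume the descriptor are not shown.

```python
import math


def orientation_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram (0-180 degrees) for one image cell.

    `cell` is a list of rows of grey levels. Border pixels are skipped because
    the centred difference needs a neighbour on both sides.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal centred difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical centred difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Hypothetical 6x6 cell containing a vertical edge (dark left, bright right).
cell = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
hist = orientation_histogram(cell)
```

For this cell the gradient is purely horizontal, so all of the gradient energy lands in the first bin. In a full detector, histograms from many cells would be concatenated into one feature vector per window.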

Research on software behavior trust based on hierarchy evaluation

NASA Astrophysics Data System (ADS)

Long, Ke; Xu, Haishui

2017-08-01

In view of the correlation in software behavior, we evaluate the credibility of software behavior at two levels: control flow and data flow. At the control flow level, a support vector machine (SVM) based method for software behavior traces is proposed. At the data flow level, behavioral evidence evaluation based on a fuzzy decision analysis method is put forward.
SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

PubMed

Pirooznia, Mehdi; Deng, Youping

2006-12-12

Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the Java interface using standard Swing libraries. We used sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with the RBF kernel of SVM. We have developed a Java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.
New support vector machine-based method for microRNA target prediction.

PubMed

Li, L; Gao, Q; Mao, X; Cao, Y

2014-06-09

MicroRNA (miRNA) plays important roles in cell differentiation, proliferation, growth, mobility, and apoptosis. An accurate list of precise target genes is necessary in order to fully understand the importance of miRNAs in animal development and disease. Several computational methods have been proposed for miRNA target-gene identification. However, these methods still have limitations with respect to their sensitivity and accuracy. Thus, we developed a new miRNA target-prediction method based on the support vector machine (SVM) model. The model supplies information of two binding sites (primary and secondary) for a radial basis function kernel as a similarity measure for SVM features. The information is categorized based on structural, thermodynamic, and sequence conservation. Using high-confidence datasets selected from public miRNA target databases, we obtained a human miRNA target SVM classifier model with high performance and provided an efficient tool for human miRNA target gene identification. Experiments have shown that our method is a reliable tool for miRNA target-gene prediction, and a successful application of an SVM classifier. Compared with other methods, the method proposed here improves the sensitivity and accuracy of miRNA prediction. Its performance can be further improved by providing more training examples.
Support-vector-machines-based multidimensional signal classification for fetal activity characterization

NASA Astrophysics Data System (ADS)

Ribes, S.; Voicu, I.; Girault, J. M.; Fournier, M.; Perrotin, F.; Tranquart, F.; Kouamé, D.

2011-03-01

Electronic fetal monitoring may be required during the whole pregnancy to closely monitor specific fetal and maternal disorders. Currently used methods suffer from many limitations and are not sufficient to evaluate fetal asphyxia. Fetal activity parameters such as movements, heart rate and associated parameters are essential indicators of fetal well-being, and no current device gives a simultaneous and sufficient estimation of all these parameters to evaluate fetal well-being. For this purpose, we built a multi-transducer, multi-gate Doppler system and developed dedicated signal processing techniques for fetal activity parameter extraction in order to investigate fetal asphyxia or well-being through fetal activity parameters. To reach this goal, this paper shows the preliminary feasibility of separating normal and compromised fetuses using our system. To do so, a data set consisting of two groups of fetal signals (normal and compromised) was established and provided by physicians. From the estimated parameters, an instantaneous Manning-like score, referred to as the ultrasonic score, was introduced and used together with movements, heart rate and associated parameters in a classification process using the Support Vector Machines (SVM) method. The influence of the fetal activity parameters and the performance of the SVM were evaluated by computing sensitivity, specificity, percentage of support vectors and total classification accuracy. We showed our ability to separate the data into two sets, normal fetuses and compromised fetuses, and obtained an excellent match with the clinical classification performed by physicians.
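Three of the evaluation measures named in the abstract follow directly from the confusion-matrix counts. The sketch below is illustrative only: the function name `binary_metrics`, the label convention (1 = compromised, 0 = normal) and the example labels are assumptions, not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy from paired true/predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        # Fraction of truly positive cases that were detected.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Fraction of truly negative cases that were correctly rejected.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Hypothetical labels: 4 compromised (1) and 6 normal (0) recordings.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

Here 3 of 4 positives are detected (sensitivity 0.75), 4 of 6 negatives are rejected (specificity about 0.67), and 7 of 10 cases are correct (accuracy 0.7).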
Fruit fly optimization based least square support vector regression for blind image restoration

NASA Astrophysics Data System (ADS)

Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei

2014-11-01

The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restorations require explicit knowledge of the point spread function (PSF) and a description of the noise as priors. However, this is not practical in many real image processing tasks, and the recovery must then be treated as a blind image restoration problem. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Due to the differences of PSF and noise energy, blurred images can be quite different. It is difficult to achieve a good balance between proper assumption and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. The least square support vector regression (LSSVR) has been proven to offer strong potential in estimation and forecasting problems. Therefore, this paper proposes an LSSVR-based image restoration method. However, selecting the optimal parameters for the support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems, and has the advantage of fast convergence to the global optimal solution. In the proposed method, each training sample pairs a neighborhood in the degraded image with the central pixel in the original image. The mapping between the degraded image and the original image is learned by training LSSVR. The two parameters of LSSVR are optimized through FOA. The fitness function of FOA is calculated by the restoration error function. With the acquired mapping, the degraded image can be recovered. Experimental results show the proposed method can obtain a satisfactory restoration effect. Compared with BP neural network regression, the SVR method and the Lucy-Richardson algorithm, it speeds up the restoration rate and performs better. Both objective and subjective restoration performances are studied in the comparison experiments.
Rule-Based Design of Plant Expression Vectors Using GenoCAD.

PubMed

Coll, Anna; Wilson, Mandy L; Gruden, Kristina; Peccoud, Jean

2015-01-01

Plant synthetic biology requires software tools to assist in the design of complex multi-genic expression plasmids. Here a vector design strategy to express genes in plants is formalized and implemented as a grammar in GenoCAD, a Computer-Aided Design software for synthetic biology. It includes a library of plant biological parts organized in structural categories and a set of rules describing how to assemble these parts into large constructs. Rules developed here are organized and divided into three main subsections according to the aim of the final construct: protein localization studies, promoter analysis and protein-protein interaction experiments. The GenoCAD plant grammar guides the user through the design while allowing users to customize vectors according to their needs. Therefore the plant grammar implemented in GenoCAD will help plant biologists take advantage of methods from synthetic biology to design expression vectors supporting their research projects.
The control system of synchronous movement of the gantry crane supports

NASA Astrophysics Data System (ADS)

Odnokopylov, I. G.; Gneushev, V. V.; Galtseva, O. V.; Natalinova, N. M.; Li, J.; Serebryakov, D. I.

2017-01-01

The paper presents study findings on synchronization of the gantry crane support movement. Asynchronous movement speeds of the supports may lead to an emergency mode at the natural rate of deformed metal structure alignment. The use of separate control of asynchronous motors with the vector control method allows synchronizing the movement speed of crane supports and achieving a balance between the motors. Simulation results of various control systems are described. Recommendations regarding further application of the system are given.
Predicting Protein-Protein Interactions by Combing Various Sequence-Derived.

PubMed

Zhao, Xiao-Wei; Ma, Zhi-Qiang; Yin, Ming-Hao

2011-09-20

Knowledge of protein-protein interactions (PPIs) plays an important role in constructing protein interaction networks and understanding the general machineries of biological systems. In this study, a new method is proposed to predict PPIs using a comprehensive set of 930 features based only on sequence information. These features measure, from different aspects, the interactions between residues a certain distance apart in the protein sequences. To achieve better performance, principal component analysis (PCA) is first employed to obtain an optimized feature subset. Then, the resulting 67-dimensional feature vectors are fed to a Support Vector Machine (SVM). Experimental results on Drosophila melanogaster and Helicobacter pylori datasets show that our method is very promising for predicting PPIs and may at least be a useful supplementary tool to existing methods.
Wave-Based Algorithms and Bounds for Target Support Estimation

DTIC Science & Technology

2015-05-15

vector electromagnetic formalism in [5]. This theory leads to three main variants of the optical theorem detector, in particular, three alternative...further expands the applicability for transient pulse change detection of arbitrary nonlinear-media and time-varying targets [9]. This report... electromagnetic methods a new methodology to estimate the minimum convex source region and the (possibly nonconvex) support of a scattering target from knowledge of
Predicting subcellular location of apoptosis proteins based on wavelet transform and support vector machine.

PubMed

Qiu, Jian-Ding; Luo, San-Hua; Huang, Jian-Hua; Sun, Xing-Yu; Liang, Ru-Ping

2010-04-01

Apoptosis proteins have a central role in the development and homeostasis of an organism. These proteins are very important for understanding the mechanism of programmed cell death. As a result of genome and other sequencing projects, the gap between the number of known apoptosis protein sequences and the number of known apoptosis protein structures is widening rapidly. Because of this extremely unbalanced state, it would be worthwhile to develop a fast and reliable method to identify their subcellular locations so as to gain better insight into their biological functions. In view of this, a new method, in which the support vector machine combines with discrete wavelet transform, has been developed to predict the subcellular location of apoptosis proteins. The results obtained by the jackknife test were quite promising, and indicated that the proposed method can remarkably improve the prediction accuracy of subcellular locations, and might also become a useful high-throughput tool in characterizing other attributes of proteins, such as enzyme class, membrane protein type, and nuclear receptor subfamily according to their sequences.
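As a rough illustration of how a discrete wavelet transform can turn a sequence-derived signal into fixed-length features for a classifier, the sketch below applies a Haar transform and collects sub-band energies. The abstract does not state which wavelet, which numeric encoding of the residues, or which features were used, so the Haar filter, the example values and the names `haar_dwt` and `wavelet_energy_features` are all assumptions.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; the signal length must be even."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels=2):
    """Energy of each detail band plus the final approximation band."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features


# Hypothetical numeric encoding of a short sequence (e.g. one value per residue).
encoded = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
features = wavelet_energy_features(encoded, levels=2)
```

Because the Haar transform is orthonormal, the band energies sum to the energy of the input signal, which gives a quick sanity check on the decomposition. A feature vector like this could then be passed to a support vector classifier.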
Hybrid method to predict the resonant frequencies and to characterise dual band proximity coupled microstrip antennas

NASA Astrophysics Data System (ADS)

Varma, Ruchi; Ghosh, Jayanta

2018-06-01

A new hybrid technique, which is a combination of neural network (NN) and support vector machine, is proposed for designing different slotted dual band proximity coupled microstrip antennas. Slots on the patch are employed to produce the second resonance along with size reduction. The proposed hybrid model provides flexibility to design the dual band antennas in the frequency range from 1 to 6 GHz. This includes DCS (1.71-1.88 GHz), PCS (1.88-1.99 GHz), UMTS (1.92-2.17 GHz), LTE2300 (2.3-2.4 GHz), Bluetooth (2.4-2.485 GHz), WiMAX (3.3-3.7 GHz), and WLAN (5.15-5.35 GHz, 5.725-5.825 GHz) band applications. Also, a comparative study of the proposed technique is done with existing methods such as knowledge based NN and support vector machine. The proposed method is found to be more accurate in terms of % error and root mean square % error, and the results are in good accord with the measured values.
Nonparametric methods for drought severity estimation at ungauged sites

NASA Astrophysics Data System (ADS)

Sadri, S.; Burn, D. H.

2012-12-01

The objective in frequency analysis is, given extreme events such as drought severity or duration, to estimate the relationship between that event and the associated return periods at a catchment. Neural networks and other artificial intelligence approaches in function estimation and regression analysis are relatively new techniques in engineering, providing an attractive alternative to traditional statistical models. There are, however, few applications of neural networks and support vector machines in the area of severity quantile estimation for drought frequency analysis. In this paper, we compare three methods for this task: multiple linear regression, radial basis function neural networks, and least squares support vector regression (LS-SVR). The area selected for this study includes 32 catchments in the Canadian Prairies. From each catchment drought severities are extracted and fitted to a Pearson type III distribution, which act as observed values. For each method-duration pair, we use a jackknife algorithm to produce estimated values at each site. The results from these three approaches are compared and analyzed, and it is found that LS-SVR provides the best quantile estimates and extrapolating capacity.
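The jackknife procedure described above (hold out one site, fit on the rest, predict the held-out site) can be sketched as follows. A one-predictor least-squares line stands in for the regression models compared in the paper, and the site descriptor and severity values are hypothetical; the function names are likewise illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Predict each site from a model fitted on all the other sites."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # leave site i out
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


# Hypothetical catchment descriptor and severity quantile per site.
# The toy data lie exactly on a line, so every held-out value is recovered.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
severity = [3.0, 5.0, 7.0, 9.0, 11.0]
estimated = jackknife_predictions(descriptor, severity)
```

Comparing `estimated` with `severity` site by site mimics estimation at an ungauged location, since each site is excluded from the fit that predicts it.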
High-performance Chinese multiclass traffic sign detection via coarse-to-fine cascade and parallel support vector machine detectors

NASA Astrophysics Data System (ADS)

Chang, Faliang; Liu, Chunsheng

2017-09-01

The high variability of sign colors and shapes in uncontrolled environments has made the detection of traffic signs a challenging problem in computer vision. We propose a traffic sign detection (TSD) method based on coarse-to-fine cascade and parallel support vector machine (SVM) detectors to detect Chinese warning and danger traffic signs. First, a region of interest (ROI) extraction method is proposed to extract ROIs using color contrast features in local regions. The ROI extraction can reduce scanning regions and save detection time. For multiclass TSD, we propose a structure that combines a coarse-to-fine cascaded tree with a parallel structure of histogram of oriented gradients (HOG) + SVM detectors. The cascaded tree is designed to detect different types of traffic signs in a coarse-to-fine process. The parallel HOG + SVM detectors are designed to do fine detection of different types of traffic signs. The experiments demonstrate that the proposed TSD method can rapidly detect multiclass traffic signs with different colors and shapes with high accuracy.
The construction of support vector machine classifier using the firefly algorithm.

PubMed

Chao, Chih-Feng; Horng, Ming-Huwi

2015-01-01

The setting of parameters in support vector machines (SVMs) is very important with regard to accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool does not consider feature selection, because the SVM, together with feature selection, is not suitable for application in multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM for pattern classification with maximum accuracy.
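A minimal sketch of the firefly update rule is given below for orientation. It is not the authors' firefly-SVM: it searches only two hyperparameters over a toy error surface, whereas the paper tunes the penalty parameter, smoothness parameter and Lagrangian multipliers. The objective `toy_cv_error`, the search bounds and all default settings (`alpha`, `beta0`, `gamma`, swarm size) are assumptions.

```python
import math
import random


def firefly_minimize(objective, bounds, n_fireflies=15, n_iter=40,
                     alpha=0.2, beta0=1.0, gamma=0.01, seed=0):
    """Basic firefly search: dimmer fireflies move toward brighter (lower-cost) ones."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    cost = [objective(p) for p in swarm]
    best_cost = min(cost)
    best_pos = list(swarm[cost.index(best_cost)])
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter than firefly i
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attraction decays with distance
                    for d, (lo, hi) in enumerate(bounds):
                        step = beta * (swarm[j][d] - swarm[i][d]) + alpha * (rng.random() - 0.5)
                        swarm[i][d] = min(hi, max(lo, swarm[i][d] + step))  # clamp to bounds
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:
                        best_cost, best_pos = cost[i], list(swarm[i])
    return best_pos, best_cost


def toy_cv_error(params):
    """Hypothetical stand-in for a cross-validation error over (log2 C, log2 gamma)."""
    log2_c, log2_gamma = params
    return (log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2


bounds = [(-5.0, 15.0), (-15.0, 3.0)]
best_pos, best_cost = firefly_minimize(toy_cv_error, bounds)
```

In practice the objective would train an SVM and return a validation error for the candidate parameters, which makes each evaluation far more expensive than this toy function.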
The Construction of Support Vector Machine Classifier Using the Firefly Algorithm

PubMed Central

Chao, Chih-Feng; Horng, Ming-Huwi

2015-01-01

The setting of parameters in support vector machines (SVMs) is very important with regard to accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool does not consider feature selection, because the SVM, together with feature selection, is not suitable for application in multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM for pattern classification with maximum accuracy. PMID:25802511
lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

PubMed

Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

2015-01-01

Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study; however, it is still not trivial to accurately distinguish the lncRNA transcripts (LNCTs) from the protein coding ones (PCTs). As various information and data about lncRNAs are preserved by previous studies, it is appealing to develop novel methods to identify lncRNAs more accurately. Our method lncRScan-SVM aims at classifying PCTs and LNCTs using support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse, respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches when evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting lncRNAs, and it is quite useful for current lncRNA study.
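Of the evaluation criteria listed, the Matthews correlation coefficient is the least self-explanatory, so a short sketch of its standard definition from confusion-matrix counts may help. The counts in the example are hypothetical and are not results from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0.0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: lncRNA transcripts are positives, coding transcripts negatives.
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

MCC ranges from -1 to 1, with 1 for a perfect classifier and 0 for chance-level prediction, and it remains informative when the two classes differ in size.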
Noninvasive prostate cancer screening based on serum surface-enhanced Raman spectroscopy and support vector machine

NASA Astrophysics Data System (ADS)

Li, Shaoxin; Zhang, Yanjiao; Xu, Junfa; Li, Linfang; Zeng, Qiuyao; Lin, Lin; Guo, Zhouyi; Liu, Zhiming; Xiong, Honglian; Liu, Songhao

2014-09-01

This study aims to present a noninvasive prostate cancer screening method using serum surface-enhanced Raman scattering (SERS) and support vector machine (SVM) techniques on peripheral blood samples. SERS measurements are performed on serum samples from 93 prostate cancer patients and 68 healthy volunteers using silver nanoparticles. Three types of kernel functions, including linear, polynomial, and Gaussian radial basis function (RBF), are employed to build SVM diagnostic models for classifying measured SERS spectra. To comparatively evaluate the performance of the SVM classification models, the standard multivariate statistical analysis method of principal component analysis (PCA) is also applied to classify the same datasets. The study results show that the RBF kernel SVM diagnostic model achieves a diagnostic accuracy of 98.1%, which is superior to the 91.3% obtained from the PCA method. The receiver operating characteristic curves of the diagnostic models further confirm the above results. This study demonstrates that the label-free serum SERS analysis technique combined with the SVM diagnostic algorithm has great potential for noninvasive prostate cancer screening.
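The three kernel families compared in the abstract have standard textbook forms, sketched below. The parameter defaults (`degree`, `coef0`, `gamma`) and the two short example vectors are assumptions; the abstract does not report the settings used for the SERS spectra.

```python
import math


def linear_kernel(x, z):
    """K(x, z) = <x, z>."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0, gamma=1.0):
    """K(x, z) = (gamma * <x, z> + coef0) ** degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Two hypothetical two-point "spectra".
x = [1.0, 2.0]
z = [3.0, 4.0]
k_linear = linear_kernel(x, z)
k_poly = polynomial_kernel(x, z)
k_rbf = rbf_kernel(x, z)
```

The RBF value depends only on the distance between the two inputs and equals 1 when they coincide, which is one reason it is often a reasonable default for spectra whose class boundary is not linear.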
Localization of thermal anomalies in electrical equipment using Infrared Thermography and support vector machine

NASA Astrophysics Data System (ADS)

Laib dit Leksir, Y.; Mansour, M.; Moussaoui, A.

2018-03-01

Analysis and processing of databases obtained from infrared thermal inspections of electrical installations require the development of new tools to obtain information beyond that of visual inspections. Consequently, methods based on the capture of thermal images show great potential and are increasingly employed in this field. However, there is a need for the development of effective techniques to analyse these databases in order to extract significant information relating to the state of the infrastructures. This paper presents a technique explaining how this approach can be implemented and proposes a system that can help to detect faults in thermal images of electrical installations. The proposed method classifies and identifies the region of interest (ROI). The identification is conducted using the support vector machine (SVM) algorithm. The aim here is to capture the faults that exist in electrical equipment during an inspection of some machines using an A40 FLIR camera. After that, binarization techniques are employed to select the region of interest. Finally, a comparative analysis of the misclassification errors obtained with the proposed method against those of fuzzy c-means and Otsu is also presented.
A Novel Unsupervised Adaptive Learning Method for Long-Term Electromyography (EMG) Pattern Recognition

PubMed Central

Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi

2017-01-01

Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We present a particle adaptive classifier (PAC), constructed from a particle adaptive learning strategy and a universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with an incremental support vector classifier (ISVC) and a non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle). PMID:28608824
Computer-Aided Diagnosis for Breast Ultrasound Using Computerized BI-RADS Features and Machine Learning Methods.

PubMed

Shan, Juan; Alam, S Kaisar; Garra, Brian; Zhang, Yingtao; Ahmed, Tahira

2016-04-01

This work identifies effective computable features from the Breast Imaging Reporting and Data System (BI-RADS), to develop a computer-aided diagnosis (CAD) system for breast ultrasound. Computerized features corresponding to ultrasound BI-RADs categories were designed and tested using a database of 283 pathology-proven benign and malignant lesions. Features were selected based on classification performance using a "bottom-up" approach for different machine learning methods, including decision tree, artificial neural network, random forest and support vector machine. Using 10-fold cross-validation on the database of 283 cases, the highest area under the receiver operating characteristic (ROC) curve (AUC) was 0.84 from a support vector machine with 77.7% overall accuracy; the highest overall accuracy, 78.5%, was from a random forest with the AUC 0.83. Lesion margin and orientation were optimum features common to all of the different machine learning methods. These features can be used in CAD systems to help distinguish benign from worrisome lesions. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. All rights reserved.
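The evaluation protocol in this abstract combines 10-fold cross-validation with the area under the ROC curve. The sketch below shows one simple way to split 283 cases into 10 folds and to compute AUC from scores by pairwise comparison; the interleaved fold assignment, the function names and the example scores are assumptions, and no classifier is trained here.

```python
def k_fold_indices(n_samples, k=10):
    """Assign sample indices to k folds by interleaving (fold i gets i, i+k, ...)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# 283 cases, as in the database described above, split into 10 folds.
folds = k_fold_indices(283, k=10)

# Hypothetical labels (1 = malignant) and classifier scores for six lesions.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
auc = roc_auc(labels, scores)
```

Each fold serves once as the test set while the other nine are used for training. With real data the cases would normally be shuffled, or stratified by class, before being assigned to folds.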

A Novel Unsupervised Adaptive Learning Method for Long-Term Electromyography (EMG) Pattern Recognition.

PubMed

Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi

2017-06-13

Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We present a particle adaptive classifier (PAC), constructed from a particle adaptive learning strategy and a universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with an incremental support vector classifier (ISVC) and a non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle).
Applying machine learning methods for characterization of hexagonal prisms from their 2D scattering patterns - an investigation using modelled scattering data

NASA Astrophysics Data System (ADS)

Salawu, Emmanuel Oluwatobi; Hesse, Evelyn; Stopford, Chris; Davey, Neil; Sun, Yi

2017-11-01

Better understanding and characterization of cloud particles, whose properties and distributions affect climate and weather, are essential for the understanding of present climate and climate change. Since imaging cloud probes have limitations of optical resolution, especially for small particles (with diameter < 25 μm), instruments like the Small Ice Detector (SID) probes, which capture high-resolution spatial light scattering patterns from individual particles down to 1 μm in size, have been developed. In this work, we have proposed a method using Machine Learning techniques to estimate simulated particles' orientation-averaged projected sizes (PAD) and aspect ratio from their 2D scattering patterns. The two-dimensional light scattering patterns (2DLSP) of hexagonal prisms are computed using the Ray Tracing with Diffraction on Facets (RTDF) model. The 2DLSP cover the same angular range as the SID probes. We generated 2DLSP for 162 hexagonal prisms at 133 orientations for each. In a first step, the 2DLSP were transformed into rotation-invariant Zernike moments (ZMs), which are particularly suitable for analyses of pattern symmetry. Then we used ZMs, summed intensities, and root mean square contrast as inputs to the advanced Machine Learning methods. We created one random forests classifier for predicting prism orientation, 133 orientation-specific (OS) support vector classification models for predicting the prism aspect-ratios, 133 OS support vector regression models for estimating prism sizes, and another 133 OS Support Vector Regression (SVR) models for estimating the size PADs. We have achieved a high accuracy of 0.99 in predicting prism aspect ratios, and a low value of normalized mean square error of 0.004 for estimating the particle's size and size PADs.
Identification of type 2 diabetes-associated combination of SNPs using support vector machine.

PubMed

Ban, Hyo-Jeong; Heo, Jee Yeon; Oh, Kyung-Soo; Park, Keun-Joon

2010-04-23

Type 2 diabetes mellitus (T2D), a metabolic disorder characterized by insulin resistance and relative insulin deficiency, is a complex disease of major public health importance. Its incidence is rapidly increasing in the developed countries. Complex diseases are caused by interactions between multiple genes and environmental factors. Most association studies aim to identify individual susceptibility markers using a simple disease model. Recent studies are trying to estimate the effects of multiple genes and multiple loci in genome-wide association. However, estimating the effects of association is very difficult. We aim to assess the rules for classifying diseased and normal subjects by evaluating potential gene-gene interactions in the same or distinct biological pathways. We analyzed the importance of gene-gene interactions in T2D susceptibility by investigating 408 single nucleotide polymorphisms (SNPs) in 87 genes involved in major T2D-related pathways in 462 T2D patients and 456 healthy controls from the Korean cohort studies. We evaluated the support vector machine (SVM) method to differentiate between cases and controls using SNP information in a 10-fold cross-validation test. We achieved a 65.3% prediction rate with a combination of 14 SNPs in 12 genes by using the radial basis function (RBF)-kernel SVM. Similarly, we investigated subpopulation data sets of men and women and identified different SNP combinations with prediction rates of 70.9% and 70.6%, respectively. As high-throughput technology for genome-wide SNPs improves, it is likely that a much higher prediction rate with a biologically more interesting combination of SNPs can be acquired by using this method. The support vector machine based feature selection method in this research found a novel association between combinations of SNPs and T2D in a Korean population.
Optimal Cloning of PCR Fragments by Homologous Recombination in Escherichia coli

PubMed Central

Jacobus, Ana Paula; Gross, Jeferson

2015-01-01

PCR fragments and linear vectors containing overlapping ends are easily assembled into a propagative plasmid by homologous recombination in Escherichia coli. Although this gap-repair cloning approach is straightforward, its existence is virtually unknown to most molecular biologists. To popularize this method, we tested critical parameters influencing the efficiency of PCR fragments cloning into PCR-amplified vectors by homologous recombination in the widely used E. coli strain DH5α. We found that the number of positive colonies after transformation increases with the length of overlap between the PCR fragment and linear vector. For most practical purposes, a 20 bp identity already ensures high-cloning yields. With an insert to vector ratio of 2:1, higher colony forming numbers are obtained when the amount of vector is in the range of 100 to 250 ng. An undesirable cloning background of empty vectors can be minimized during vector PCR amplification by applying a reduced amount of plasmid template or by using primers in which the 5′ termini are separated by a large gap. DpnI digestion of the plasmid template after PCR is also effective to decrease the background of negative colonies. We tested these optimized cloning parameters during the assembly of five independent DNA constructs and obtained 94% positive clones out of 100 colonies probed. We further demonstrated the efficient and simultaneous cloning of two PCR fragments into a vector. These results support the idea that homologous recombination in E. coli might be one of the most effective methods for cloning one or two PCR fragments. For its simplicity and high efficiency, we believe that recombinational cloning in E. coli has a great potential to become a routine procedure in most molecular biology-oriented laboratories. PMID:25774528
Spatially explicit multi-criteria decision analysis for managing vector-borne diseases

PubMed Central

2011-01-01

The complex epidemiology of vector-borne diseases creates significant challenges in the design and delivery of prevention and control strategies, especially in light of rapid social and environmental changes. Spatial models for predicting disease risk based on environmental factors such as climate and landscape have been developed for a number of important vector-borne diseases. The resulting risk maps have proven value for highlighting areas for targeting public health programs. However, these methods generally only offer technical information on the spatial distribution of disease risk itself, which may be incomplete for making decisions in a complex situation. In prioritizing surveillance and intervention strategies, decision-makers often also need to consider spatially explicit information on other important dimensions, such as the regional specificity of public acceptance, population vulnerability, resource availability, intervention effectiveness, and land use. There is a need for a unified strategy for supporting public health decision making that integrates available data for assessing spatially explicit disease risk, with other criteria, to implement effective prevention and control strategies. Multi-criteria decision analysis (MCDA) is a decision support tool that allows for the consideration of diverse quantitative and qualitative criteria using both data-driven and qualitative indicators for evaluating alternative strategies with transparency and stakeholder participation. Here we propose a MCDA-based approach to the development of geospatial models and spatially explicit decision support tools for the management of vector-borne diseases. We describe the conceptual framework that MCDA offers as well as technical considerations, approaches to implementation and expected outcomes. We conclude that MCDA is a powerful tool that offers tremendous potential for use in public health decision-making in general and vector-borne disease management in particular. 
PMID:22206355
A Subdivision-Based Representation for Vector Image Editing.

PubMed

Liao, Zicheng; Hoppe, Hugues; Forsyth, David; Yu, Yizhou

2012-11-01

Vector graphics has been employed in a wide variety of applications due to its scalability and editability. Editability is a high priority for artists and designers who wish to produce vector-based graphical content with user interaction. In this paper, we introduce a new vector image representation based on piecewise smooth subdivision surfaces, which is a simple, unified and flexible framework that supports a variety of operations, including shape editing, color editing, image stylization, and vector image processing. These operations effectively create novel vector graphics by reusing and altering existing image vectorization results. Because image vectorization yields an abstraction of the original raster image, controlling the level of detail of this abstraction is highly desirable. To this end, we design a feature-oriented vector image pyramid that offers multiple levels of abstraction simultaneously. Our new vector image representation can be rasterized efficiently using GPU-accelerated subdivision. Experiments indicate that our vector image representation achieves high visual quality and better supports editing operations than existing representations.
Evaluating the statistical performance of less applied algorithms in classification of worldview-3 imagery data in an urbanized landscape

NASA Astrophysics Data System (ADS)

Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa

2018-03-01

In the recent decade, analysis of remotely sensed imagery has become one of the most common and widely used procedures in environmental studies. In this context, supervised image classification techniques play a central role. Hence, taking a high-resolution Worldview-3 image over a mixed urbanized landscape in Iran, three less applied image classification methods, namely Bagged CART, stochastic gradient boosting model and neural network with feature extraction, were tested and compared with two prevalent methods: random forest and support vector machine with linear kernel. To do so, each method was run ten times and three validation techniques were used to estimate the accuracy statistics: cross validation, independent validation and validation with the total of the training data. Moreover, using ANOVA and the Tukey test, the statistical significance of the differences between the classification methods was assessed. In general, the results showed that random forest, with a marginal difference compared to Bagged CART and the stochastic gradient boosting model, is the best performing method, whilst based on independent validation there was no significant difference between the performances of the classification methods. Finally, it should be noted that neural network with feature extraction and linear support vector machine had better processing speed than the others.
Analysis of algae growth mechanism and water bloom prediction under the effect of multi-affecting factor.

PubMed

Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin

2017-03-01

The formation process of algae is described inaccurately and water blooms are predicted with a low precision by current methods. In this paper, chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multi-factors on algae growth and water blooms, the comprehensive prediction method combined with multivariate time series and intelligent model is put forward in this paper. Firstly, through the process of photosynthesis, the main factors that affect the reproduction of the algae are analyzed. A compensation prediction method of multivariate time series analysis based on neural network and Support Vector Machine has been put forward which is combined with Kernel Principal Component Analysis to deal with dimension reduction of the influence factors of blooms. Then, Genetic Algorithm is applied to improve the generalization ability of the BP network and Least Squares Support Vector Machine. Experimental results show that this method could better compensate the prediction model of multivariate time series analysis which is an effective way to improve the description accuracy of algae growth and prediction precision of water blooms.
Quantitative structure-retention relationship models for the prediction of the reversed-phase HPLC gradient retention based on the heuristic method and support vector machine.

PubMed

Du, Hongying; Wang, Jie; Yao, Xiaojun; Hu, Zhide

2009-01-01

The heuristic method (HM) and support vector machine (SVM) were used to construct quantitative structure-retention relationship models by a series of compounds to predict the gradient retention times of reversed-phase high-performance liquid chromatography (HPLC) in three different columns. The aims of this investigation were to predict the retention times of multifarious compounds, to find the main properties of the three columns, and to indicate the theory of separation procedures. In our method, we correlated the retention times of many diverse structural analytes in three columns (Symmetry C18, Chromolith, and SG-MIX) with their representative molecular descriptors, calculated from the molecular structures alone. HM was used to select the most important molecular descriptors and build linear regression models. Furthermore, non-linear regression models were built using the SVM method; the performance of the SVM models were better than that of the HM models, and the prediction results were in good agreement with the experimental values. This paper could give some insights into the factors that were likely to govern the gradient retention process of the three investigated HPLC columns, which could theoretically supervise the practical experiment.
A prediction model of drug-induced ototoxicity developed by an optimal support vector machine (SVM) method.

PubMed

Zhou, Shu; Li, Guo-Bo; Huang, Lu-Yi; Xie, Huan-Zhang; Zhao, Ying-Lan; Chen, Yu-Zong; Li, Lin-Li; Yang, Sheng-Yong

2014-08-01

Drug-induced ototoxicity, as a toxic side effect, is an important issue needed to be considered in drug discovery. Nevertheless, current experimental methods used to evaluate drug-induced ototoxicity are often time-consuming and expensive, indicating that they are not suitable for a large-scale evaluation of drug-induced ototoxicity in the early stage of drug discovery. We thus, in this investigation, established an effective computational prediction model of drug-induced ototoxicity using an optimal support vector machine (SVM) method, GA-CG-SVM. Three GA-CG-SVM models were developed based on three training sets containing agents bearing different risk levels of drug-induced ototoxicity. For comparison, models based on naïve Bayesian (NB) and recursive partitioning (RP) methods were also used on the same training sets. Among all the prediction models, the GA-CG-SVM model II showed the best performance, which offered prediction accuracies of 85.33% and 83.05% for two independent test sets, respectively. Overall, the good performance of the GA-CG-SVM model II indicates that it could be used for the prediction of drug-induced ototoxicity in the early stage of drug discovery. Copyright © 2014 Elsevier Ltd. All rights reserved.
Applying different independent component analysis algorithms and support vector regression for IT chain store sales forecasting.

PubMed

Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie

2014-01-01

Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting.
Applying Different Independent Component Analysis Algorithms and Support Vector Regression for IT Chain Store Sales Forecasting

PubMed Central

Dai, Wensheng

2014-01-01

Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting. PMID:25165740
Support Vector Machine algorithm for regression and classification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yu, Chenggang; Zavaljevski, Nela

2001-08-01

The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving a quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to specific constraints generated in the SVM learning. Thus, it is more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing large data sets. The size of the learning data is virtually unlimited by the capacity of the computer physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate the SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.
Computer-assisted segmentation of white matter lesions in 3D MR images using support vector machine.

PubMed

Lao, Zhiqiang; Shen, Dinggang; Liu, Dengfeng; Jawad, Abbas F; Melhem, Elias R; Launer, Lenore J; Bryan, R Nick; Davatzikos, Christos

2008-03-01

Brain lesions, especially white matter lesions (WMLs), are associated with cardiac and vascular disease, but also with normal aging. Quantitative analysis of WML in large clinical trials is becoming more and more important. In this article, we present a computer-assisted WML segmentation method, based on local features extracted from multiparametric magnetic resonance imaging (MRI) sequences (ie, T1-weighted, T2-weighted, proton density-weighted, and fluid attenuation inversion recovery MRI scans). A support vector machine classifier is first trained on expert-defined WMLs, and is then used to classify new scans. Postprocessing analysis further reduces false positives by using anatomic knowledge and measures of distance from the training set. Cross-validation on a population of 35 patients from three different imaging sites with WMLs of varying sizes, shapes, and locations tests the robustness and accuracy of the proposed segmentation method, compared with the manual segmentation results from two experienced neuroradiologists.
Application of support vector machine for the separation of mineralised zones in the Takht-e-Gonbad porphyry deposit, SE Iran

NASA Astrophysics Data System (ADS)

Mahvash Mohammadi, Neda; Hezarkhani, Ardeshir

2018-07-01

Classification of mineralised zones is an important factor for the analysis of economic deposits. In this paper, the support vector machine (SVM), a supervised learning algorithm, based on subsurface data is proposed for classification of mineralised zones in the Takht-e-Gonbad porphyry Cu-deposit (SE Iran). The effects of the input features are evaluated by calculating their influence on the accuracy rates of the SVM. Ultimately, the SVM model is developed based on the input features lithology, alteration, mineralisation and the level, with the radial basis function (RBF) as the kernel function. Moreover, the optimal values of the parameters λ and C, obtained using the n-fold cross-validation method, are 0.001 and 0.01 respectively. The accuracy of this model is 0.931 for classification of mineralised zones in the Takht-e-Gonbad porphyry deposit. The results of the study confirm the efficiency of the SVM method for classification of the mineralised zones.
Distributed support vector machine in master-slave mode.

PubMed

Chen, Qingguo; Cao, Feilong

2018-05-01

It is well known that the support vector machine (SVM) is an effective learning algorithm. The alternating direction method of multipliers (ADMM) algorithm has emerged as a powerful technique for solving distributed optimisation models. This paper proposes a distributed SVM algorithm in a master-slave mode (MS-DSVM), which integrates a distributed SVM and ADMM acting in a master-slave configuration where the master node and slave nodes are connected, meaning the results can be broadcasted. The distributed SVM is regarded as a regularised optimisation problem and modelled as a series of convex optimisation sub-problems that are solved by ADMM. Additionally, the over-relaxation technique is utilised to accelerate the convergence rate of the proposed MS-DSVM. Our theoretical analysis demonstrates that the proposed MS-DSVM has linear convergence, meaning it possesses the fastest convergence rate among existing standard distributed ADMM algorithms. Numerical examples demonstrate that the convergence and accuracy of the proposed MS-DSVM are superior to those of existing methods under the ADMM framework. Copyright © 2018 Elsevier Ltd. All rights reserved.
Nonlinear temperature compensation of fluxgate magnetometers with a least-squares support vector machine

NASA Astrophysics Data System (ADS)

Pang, Hongfeng; Chen, Dixiang; Pan, Mengchun; Luo, Shitu; Zhang, Qi; Luo, Feilu

2012-02-01

Fluxgate magnetometers are widely used for magnetic field measurement. However, their accuracy is influenced by temperature. In this paper, a new method was proposed to compensate the temperature drift of fluxgate magnetometers, in which a least-squares support vector machine (LSSVM) is utilized. The compensation performance was analyzed by simulation, which shows that the LSSVM has better performance and less training time than backpropagation and radial basis function neural networks. The temperature characteristics of a DM fluxgate magnetometer were measured with a temperature experiment box. Forty-five measured data under different magnetic fields and temperatures were obtained and divided into 36 training data and nine test data. The training data were used to obtain the parameters of the LSSVM model, and the compensation performance of the LSSVM model was verified by the test data. Experimental results show that the temperature drift of magnetometer is reduced from 109.3 to 3.3 nT after compensation, which suggests that this compensation method is effective for the accuracy improvement of fluxgate magnetometers.
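To make the LSSVM step in the abstract above concrete, here is a minimal sketch of least-squares SVM regression in its standard textbook form, where training reduces to solving one linear system. It is not the authors' code. The one-dimensional synthetic `drift` curve, the hyperparameters `gamma` and `sigma`, and all function names are assumptions made for illustration. Only the 45-sample data set and its 36/9 split mirror the abstract.

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def lssvm_fit(X, y, gamma=1e4, sigma=1.0):
    """Solve the LS-SVM system for the bias b and the multipliers alpha:

        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = len(X)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the kernel diagonal
        system.append(row)
    solution = solve_linear_system(system, [0.0] + list(y))
    return solution[0], solution[1:]

def lssvm_predict(x, X, alpha, b, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * K(x, x_i)."""
    return b + sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, X))

def drift(t):
    """Synthetic, noise-free stand-in for a temperature drift curve."""
    return 2.0 * rbf_kernel([t], [1.0]) - rbf_kernel([t], [-1.0])

# 45 samples split into 36 training and 9 test points, as in the abstract.
temps = [-2.0 + 4.0 * i / 44 for i in range(45)]
test_idx = set(range(2, 45, 5))
X_train = [[t] for i, t in enumerate(temps) if i not in test_idx]
y_train = [drift(x[0]) for x in X_train]
X_test = [[t] for i, t in enumerate(temps) if i in test_idx]

b, alpha = lssvm_fit(X_train, y_train)
max_test_error = max(
    abs(lssvm_predict(x, X_train, alpha, b) - drift(x[0])) for x in X_test
)
```

Because the equality constraints of LS-SVM replace the inequality constraints of a standard SVM, no quadratic programming solver is needed, which is one plausible reason for the short training time the abstract reports. A real compensation model would take both temperature and field readings as inputs and would tune `gamma` and `sigma` on the measured data.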
Boosted Regression Trees Outperforms Support Vector Machines in Predicting (Regional) Yields of Winter Wheat from Single and Cumulated Dekadal Spot-VGT Derived Normalized Difference Vegetation Indices

NASA Astrophysics Data System (ADS)

Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos

2016-08-01

This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed of Normalized Difference Vegetation Indices (NDVI) derived from low resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence stretching from March to June was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.
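The NDVI predictors in the abstract above build on the standard index definition, NDVI = (NIR - Red) / (NIR + Red). The sketch below computes it for a short series and accumulates it over dekads. The reflectance values are hypothetical, and treating "cumulated" as a running sum is an assumption, since the abstract does not define the Single, Incremental and Targeted NDVI predictors.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here to avoid division by zero
    return (nir - red) / total

def cumulated(values):
    """Running sum of a dekadal NDVI series."""
    out, acc = [], 0.0
    for v in values:
        acc += v
        out.append(acc)
    return out

# Hypothetical (NIR, red) reflectance pairs for three consecutive dekads.
reflectances = [(0.30, 0.10), (0.45, 0.05), (0.50, 0.10)]
single = [ndvi(nir, red) for nir, red in reflectances]
cumulative = cumulated(single)
```

Each entry of `single` or `cumulative` could then serve as one candidate feature per dekad for the feature-selection step the abstract describes.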
Support-vector-based emergent self-organising approach for emotional understanding

NASA Astrophysics Data System (ADS)

Nguwi, Yok-Yen; Cho, Siu-Yeung

2010-12-01

This study discusses the computational analysis of general emotion understanding from questionnaires methodology. The questionnaires method approaches the subject by investigating the real experience that accompanied the emotions, whereas the other laboratory approaches are generally associated with exaggerated elements. We adopted a connectionist model called support-vector-based emergent self-organising map (SVESOM) to analyse the emotion profiling from the questionnaires method. The SVESOM first identifies the important variables by giving discriminative features with high ranking. The classifier then performs the classification based on the selected features. Experimental results show that the top rank features are in line with the work of Scherer and Wallbott [(1994), 'Evidence for Universality and Cultural Variation of Differential Emotion Response Patterning', Journal of Personality and Social Psychology, 66, 310-328], which approached the emotions physiologically. While the performance measures show that using the full features for classifications can degrade the performance, the selected features provide superior results in terms of accuracy and generalisation.
Inline Measurement of Particle Concentrations in Multicomponent Suspensions using Ultrasonic Sensor and Least Squares Support Vector Machines.

PubMed

Zhan, Xiaobin; Jiang, Shulan; Yang, Yili; Liang, Jian; Shi, Tielin; Li, Xiwen

2015-09-18

This paper proposes an ultrasonic measurement system based on least squares support vector machines (LS-SVM) for inline measurement of particle concentrations in multicomponent suspensions. Firstly, the ultrasonic signals are analyzed and processed, and the optimal feature subset that contributes to the best model performance is selected based on the importance of features. Secondly, the LS-SVM model is tuned, trained and tested with different feature subsets to obtain the optimal model. In addition, a comparison is made between the partial least square (PLS) model and the LS-SVM model. Finally, the optimal LS-SVM model with the optimal feature subset is applied to inline measurement of particle concentrations in the mixing process. The results show that the proposed method is reliable and accurate for inline measuring the particle concentrations in multicomponent suspensions and the measurement accuracy is sufficiently high for industrial application. Furthermore, the proposed method is applicable to the modeling of the nonlinear system dynamically and provides a feasible way to monitor industrial processes.

Support Vector Data Descriptions and k-Means Clustering: One Class?

PubMed

Gornitz, Nico; Lima, Luiz Alberto; Muller, Klaus-Robert; Kloft, Marius; Nakajima, Shinichi

2017-09-27

We present ClusterSVDD, a methodology that unifies support vector data descriptions (SVDDs) and k-means clustering into a single formulation. This allows both methods to benefit from one another, i.e., by adding flexibility using multiple spheres for SVDDs and increasing anomaly resistance and flexibility through kernels to k-means. In particular, our approach leads to a new interpretation of k-means as a regularized mode seeking algorithm. The unifying formulation further allows for deriving new algorithms by transferring knowledge from one-class learning settings to clustering settings and vice versa. As a showcase, we derive a clustering method for structured data based on a one-class learning scenario. Additionally, our formulation can be solved via a particularly simple optimization scheme. We evaluate our approach empirically to highlight some of the proposed benefits on artificially generated data, as well as on real-world problems, and provide a Python software package comprising various implementations of primal and dual SVDD as well as our proposed ClusterSVDD.
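For orientation, the plain k-means baseline that the abstract above generalizes can be sketched with Lloyd's algorithm. This is not ClusterSVDD and not the authors' Python package: it uses no kernels and no SVDD spheres, and the toy points, the initial centers and the function names are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, centers, n_iter=20):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Two well-separated toy clusters with hand-picked initial centers.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
centers, labels = k_means(points, centers=[(0.0, 0.0), (5.0, 5.0)])
```

Each cluster here is summarized by a single mean. According to the abstract, the unified formulation instead describes clusters with SVDD spheres and kernels, which adds resistance to anomalies.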
Efficient design of gain-flattened multi-pump Raman fiber amplifiers using least squares support vector regression

NASA Astrophysics Data System (ADS)

Chen, Jing; Qiu, Xiaojie; Yin, Cunyi; Jiang, Hao

2018-02-01

An efficient method to design the broadband gain-flattened Raman fiber amplifier with multiple pumps is proposed based on least squares support vector regression (LS-SVR). A multi-input multi-output LS-SVR model is introduced to replace the complicated solving process of the nonlinear coupled Raman amplification equation. The proposed approach contains two stages: offline training stage and online optimization stage. During the offline stage, the LS-SVR model is trained. Owing to the good generalization capability of LS-SVR, the net gain spectrum can be directly and accurately obtained when inputting any combination of the pump wavelength and power to the well-trained model. During the online stage, we incorporate the LS-SVR model into the particle swarm optimization algorithm to find the optimal pump configuration. The design results demonstrate that the proposed method greatly shortens the computation time and enhances the efficiency of the pump parameter optimization for Raman fiber amplifier design.
RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences.

PubMed

An, Ji-Yong; You, Zhu-Hong; Meng, Fan-Rong; Xu, Shu-Juan; Wang, Yin

2016-05-18

Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies that are capable of discovering large-scale PPIs. Although many high-throughput biological technologies have been proposed to detect PPIs, there are unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performances in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets, and achieved very high accuracies of 92.98% and 95.58% respectively, which is significantly better than previous works. In addition, we also obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli) for cross-species prediction. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method is clearly better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool.
To facilitate extensive studies for future proteomics research, we developed a freely available web server called RVMAB-PPI in Hypertext Preprocessor (PHP) for predicting PPIs. The web server including source code and the datasets are available at http://219.219.62.123:8888/ppi_ab/.
Production of Recombinant Adeno-associated Virus Vectors Using Suspension HEK293 Cells and Continuous Harvest of Vector From the Culture Media for GMP FIX and FLT1 Clinical Vector

PubMed Central

Grieger, Joshua C; Soltys, Stephen M; Samulski, Richard Jude

2016-01-01

Adeno-associated virus (AAV) has shown great promise as a gene therapy vector in multiple aspects of preclinical and clinical applications. Many developments including new serotypes as well as self-complementary vectors are now entering the clinic. With these ongoing vector developments, continued effort has been focused on scalable manufacturing processes that can efficiently generate high-titer, highly pure, and potent quantities of rAAV vectors. Utilizing the relatively simple and efficient transfection system of HEK293 cells as a starting point, we have successfully adapted an adherent HEK293 cell line from a qualified clinical master cell bank to grow in animal component-free suspension conditions in shaker flasks and WAVE bioreactors that allows for rapid and scalable rAAV production. Using the triple transfection method, the suspension HEK293 cell line generates greater than 1 × 10^5 vector genome containing particles (vg)/cell or greater than 1 × 10^14 vg/l of cell culture when harvested 48 hours post-transfection. To achieve these yields, a number of variables were optimized such as selection of a compatible serum-free suspension media that supports both growth and transfection, selection of a transfection reagent, transfection conditions and cell density. A universal purification strategy, based on ion exchange chromatography methods, was also developed that results in high-purity vector preps of AAV serotypes 1–6, 8, 9 and various chimeric capsids tested. This user-friendly process can be completed within 1 week, results in high full to empty particle ratios (>90% full particles), provides postpurification yields (>1 × 10^13 vg/l) and purity suitable for clinical applications and is universal with respect to all serotypes and chimeric particles.
To date, this scalable manufacturing technology has been utilized to manufacture GMP phase 1 clinical AAV vectors for retinal neovascularization (AAV2), Hemophilia B (scAAV8), giant axonal neuropathy (scAAV9), and retinitis pigmentosa (AAV2), which have been administered into patients. In addition, we report a minimum of a fivefold increase in overall vector production by implementing a perfusion method that entails harvesting rAAV from the culture media at numerous time-points post-transfection. PMID:26437810
Insect cell transformation vectors that support high level expression and promoter assessment in insect cell culture

USDA-ARS's Scientific Manuscript database

A somatic transformation vector, pDP9, was constructed that provides a simplified means of producing permanently transformed cultured insect cells that support high levels of protein expression of foreign genes. The pDP9 plasmid vector incorporates DNA sequences from the Junonia coenia densovirus th...
Classification of Alzheimer's disease patients with hippocampal shape wrapper-based feature selection and support vector machine

NASA Astrophysics Data System (ADS)

Young, Jonathan; Ridgway, Gerard; Leung, Kelvin; Ourselin, Sebastien

2012-02-01

It is well known that hippocampal atrophy is a marker of the onset of Alzheimer's disease (AD) and as a result hippocampal volumetry has been used in a number of studies to provide early diagnosis of AD and predict conversion of mild cognitive impairment patients to AD. However, rates of atrophy are not uniform across the hippocampus, making shape analysis a potentially more accurate biomarker. This study examines the hippocampi from 226 healthy controls, 148 AD patients and 330 MCI patients obtained from T1 weighted structural MRI images from the ADNI database. The hippocampi are anatomically segmented using the MAPS multi-atlas segmentation method, and the resulting binary images are then processed with SPHARM software to decompose their shapes as a weighted sum of spherical harmonic basis functions. The resulting parameterizations are then used as feature vectors in Support Vector Machine (SVM) classification. A wrapper based feature selection method was used as this considers the utility of features in discriminating classes in combination, fully exploiting the multivariate nature of the data and optimizing the selected set of features for the type of classifier that is used. The leave-one-out cross validated accuracy obtained on training data is 88.6% for classifying AD vs controls and 74% for classifying MCI-converters vs MCI-stable with very compact feature sets, showing that this is a highly promising method. There is currently a considerable fall in accuracy on unseen data, indicating that the feature selection is sensitive to the data used; however, feature ensemble methods may overcome this.
Building a Computer Program to Support Children, Parents, and Distraction during Healthcare Procedures

PubMed Central

McCarthy, Ann Marie; Kleiber, Charmaine; Ataman, Kaan; Street, W. Nick; Zimmerman, M. Bridget; Ersig, Anne L.

2012-01-01

This secondary data analysis used data mining methods to develop predictive models of child risk for distress during a healthcare procedure. Data used came from a study that predicted factors associated with children’s responses to an intravenous catheter insertion while parents provided distraction coaching. From the 255 items used in the primary study, 44 predictive items were identified through automatic feature selection and used to build support vector machine regression models. Models were validated using multiple cross-validation tests and by comparing variables identified as explanatory in the traditional versus support vector machine regression. Rule-based approaches were applied to the model outputs to identify overall risk for distress. A decision tree was then applied to evidence-based instructions for tailoring distraction to characteristics and preferences of the parent and child. The resulting decision support computer application, the Children, Parents and Distraction (CPaD), is being used in research. Future use will support practitioners in deciding the level and type of distraction intervention needed by a child undergoing a healthcare procedure. PMID:22805121
Extended robust support vector machine based on financial risk minimization.

PubMed

Takeda, Akiko; Fujiwara, Shuhei; Kanamori, Takafumi

2014-11-01

Financial risk measures have been used recently in machine learning. For example, ν-support vector machine (ν-SVM) minimizes the conditional value at risk (CVaR) of margin distribution. The measure is popular in finance because of the subadditivity property, but it is very sensitive to a few outliers in the tail of the distribution. We propose a new classification method, extended robust SVM (ER-SVM), which minimizes an intermediate risk measure between the CVaR and value at risk (VaR), with the expectation that the resulting model becomes less sensitive than ν-SVM to outliers. We can regard ER-SVM as an extension of robust SVM, which uses a truncated hinge loss. Numerical experiments imply the ER-SVM's possibility of achieving a better prediction performance with proper parameter setting.
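The contrast between VaR and CVaR drawn in this abstract can be made concrete with a small numerical sketch. It is not the authors' formulation: the function name is hypothetical, and it uses one simple empirical convention (VaR as the order statistic at rank ceil(beta * n), CVaR as the mean of the losses from that rank upward). Other texts define the tail slightly differently.

```python
import math


def var_cvar(losses, beta):
    """Empirical VaR and CVaR of a loss sample at level beta (0 < beta < 1).

    Assumed convention: VaR is the loss at rank ceil(beta * n) in ascending
    order; CVaR is the mean of all losses from that rank upward.
    """
    if not losses or not 0 < beta < 1:
        raise ValueError("need a non-empty sample and 0 < beta < 1")
    ordered = sorted(losses)
    k = math.ceil(beta * len(ordered))
    tail = ordered[k - 1:]
    return ordered[k - 1], sum(tail) / len(tail)


# Same sample, except that the largest loss is replaced by an outlier.
print(var_cvar([1, 2, 3, 10], 0.5))
print(var_cvar([1, 2, 3, 100], 0.5))
```

In this toy sample the outlier leaves VaR unchanged but shifts CVaR substantially, which is the sensitivity the abstract refers to.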
Fractal and twin SVM-based handgrip recognition for healthy subjects and trans-radial amputees using myoelectric signal.

PubMed

Arjunan, Sridhar Poosapadi; Kumar, Dinesh Kant; Jayadeva J

2016-02-01

Identifying functional handgrip patterns using surface electromyogram (sEMG) signal recorded from amputee residual muscle is required for controlling the myoelectric prosthetic hand. In this study, we have computed the signal fractal dimension (FD) and maximum fractal length (MFL) during different grip patterns performed by healthy and transradial amputee subjects. The FD and MFL of the sEMG, referred to as the fractal features, were classified using twin support vector machines (TSVM) to recognize the handgrips. TSVM requires fewer support vectors, is suitable for data sets with unbalanced distributions, and can simultaneously be trained for improving both sensitivity and specificity. When compared with other methods, this technique resulted in improved grip recognition accuracy, sensitivity, and specificity, and this improvement was significant (κ=0.91).
Segmentation of mosaicism in cervicographic images using support vector machines

NASA Astrophysics Data System (ADS)

Xue, Zhiyun; Long, L. Rodney; Antani, Sameer; Jeronimo, Jose; Thoma, George R.

2009-02-01

The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating a large digital repository of cervicographic images for the study of uterine cervix cancer prevention. One of the research goals is to automatically detect diagnostic bio-markers in these images. Reliable bio-marker segmentation in large biomedical image collections is a challenging task due to the large variation in image appearance. Methods described in this paper focus on segmenting mosaicism, which is an important vascular feature used to visually assess the degree of cervical intraepithelial neoplasia. The proposed approach uses support vector machines (SVM) trained on a ground truth dataset annotated by medical experts (which circumvents the need for vascular structure extraction). We have evaluated the performance of the proposed algorithm and experimentally demonstrated its feasibility.
Genetic algorithm based feature selection combined with dual classification for the automated detection of proliferative diabetic retinopathy.

PubMed

Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A

2015-07-01

Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel maps, each of which holds vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Estimation of vector static magnetic field by a nitrogen-vacancy center with a single first-shell 13C nuclear (NV–13C) spin in diamond

NASA Astrophysics Data System (ADS)

Jiang, Feng-Jian; Ye, Jian-Feng; Jiao, Zheng; Huang, Zhi-Yong; Lv, Hai-Jiang

2018-05-01

We suggest an experimental scheme in which a single nitrogen-vacancy (NV) center in diamond, coupled to a nearest-neighbor 13C nucleus, is used as a sensor to detect a static vector magnetic field. By means of the optically detected magnetic resonance (ODMR) technique, both the strength and the direction of the vector field could be determined by relevant resonance frequencies of continuous wave (CW) and Ramsey spectra. In addition, we give a method that determines the unique one of eight possible hyperfine tensors for an (NV–13C) system. Finally, we propose an unambiguous method to exclude the symmetrical solution from eight possible vector fields, which correspond to nearly identical resonance frequencies due to their mirror symmetry about the 14N–Vacancy–13C (14N–V–13C) plane. Project supported by the National Natural Science Foundation of China (Grant Nos. 11305074, 11135002, and 11275083), the Key Program of the Education Department Outstanding Youth Foundation of Anhui Province, China (Grant No. gxyqZD2017080), and the Natural Science Foundation of Anhui Province, China (Grant No. KJHS2015B09).
ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.

PubMed

Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won

2016-07-01

In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolaemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide possibility of the future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.
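The second step of the approach described above, turning a regression estimate of blood loss into a shock class, can be sketched as a simple threshold lookup. The cut-offs of 15%, 30% and 40% are the commonly quoted ATLS boundaries and are an assumption here; the abstract does not state which boundaries were applied to the rat data. The function name is hypothetical.

```python
def atls_class(blood_loss_percent: float) -> int:
    """Map an estimated blood loss (% of blood volume) to ATLS class 1-4.

    Assumed boundaries: class I below 15%, class II 15-30%,
    class III above 30% up to 40%, class IV above 40%.
    """
    if blood_loss_percent < 0:
        raise ValueError("blood loss cannot be negative")
    if blood_loss_percent < 15:
        return 1
    if blood_loss_percent <= 30:
        return 2
    if blood_loss_percent <= 40:
        return 3
    return 4


# A regression output (e.g. from SVR or MLR) is classified after prediction.
for predicted in (10.0, 22.5, 35.0, 45.0):
    print(predicted, "->", atls_class(predicted))
```

Framing the task this way replaces a four-class problem with a single continuous prediction followed by a fixed rule.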
Mechanical Fault Diagnosis of High Voltage Circuit Breakers Based on Variational Mode Decomposition and Multi-Layer Classifier.

PubMed

Huang, Nantian; Chen, Huaijin; Cai, Guowei; Fang, Lihua; Wang, Yuqiang

2016-11-10

Mechanical fault diagnosis of high-voltage circuit breakers (HVCBs) based on vibration signal analysis is one of the most significant issues in improving the reliability and reducing the outage cost for power systems. The limitation of training samples and types of machine faults in HVCBs causes the existing mechanical fault diagnostic methods to recognize new types of machine faults easily without training samples as either a normal condition or a wrong fault type. A new mechanical fault diagnosis method for HVCBs based on variational mode decomposition (VMD) and multi-layer classifier (MLC) is proposed to improve the accuracy of fault diagnosis. First, HVCB vibration signals during operation are measured using an acceleration sensor. Second, a VMD algorithm is used to decompose the vibration signals into several intrinsic mode functions (IMFs). The IMF matrix is divided into submatrices to compute the local singular values (LSV). The maximum singular values of each submatrix are selected as the feature vectors for fault diagnosis. Finally, a MLC composed of two one-class support vector machines (OCSVMs) and a support vector machine (SVM) is constructed to identify the fault type. Two layers of independent OCSVM are adopted to distinguish normal or fault conditions with known or unknown fault types, respectively. On this basis, SVM recognizes the specific fault type. Real diagnostic experiments are conducted with a real SF₆ HVCB with normal and fault states. Three different faults (i.e., jam fault of the iron core, looseness of the base screw, and poor lubrication of the connecting lever) are simulated in a field experiment on a real HVCB to test the feasibility of the proposed method. Results show that the classification accuracy of the new method is superior to other traditional methods.
Mechanical Fault Diagnosis of High Voltage Circuit Breakers Based on Variational Mode Decomposition and Multi-Layer Classifier

PubMed Central

Huang, Nantian; Chen, Huaijin; Cai, Guowei; Fang, Lihua; Wang, Yuqiang

2016-01-01

Mechanical fault diagnosis of high-voltage circuit breakers (HVCBs) based on vibration signal analysis is one of the most significant issues in improving the reliability and reducing the outage cost for power systems. The limitation of training samples and types of machine faults in HVCBs causes the existing mechanical fault diagnostic methods to recognize new types of machine faults easily without training samples as either a normal condition or a wrong fault type. A new mechanical fault diagnosis method for HVCBs based on variational mode decomposition (VMD) and multi-layer classifier (MLC) is proposed to improve the accuracy of fault diagnosis. First, HVCB vibration signals during operation are measured using an acceleration sensor. Second, a VMD algorithm is used to decompose the vibration signals into several intrinsic mode functions (IMFs). The IMF matrix is divided into submatrices to compute the local singular values (LSV). The maximum singular values of each submatrix are selected as the feature vectors for fault diagnosis. Finally, a MLC composed of two one-class support vector machines (OCSVMs) and a support vector machine (SVM) is constructed to identify the fault type. Two layers of independent OCSVM are adopted to distinguish normal or fault conditions with known or unknown fault types, respectively. On this basis, SVM recognizes the specific fault type. Real diagnostic experiments are conducted with a real SF6 HVCB with normal and fault states. Three different faults (i.e., jam fault of the iron core, looseness of the base screw, and poor lubrication of the connecting lever) are simulated in a field experiment on a real HVCB to test the feasibility of the proposed method. Results show that the classification accuracy of the new method is superior to other traditional methods. PMID:27834902
Subcellular location prediction of proteins using support vector machines with alignment of block sequences utilizing amino acid composition.

PubMed

Tamura, Takeyuki; Akutsu, Tatsuya

2007-11-30

Subcellular location prediction of proteins is an important and well-studied problem in bioinformatics. This is a problem of predicting which part in a cell a given protein is transported to, where an amino acid sequence of the protein is given as an input. This problem is becoming more important since information on subcellular location is helpful for annotation of proteins and genes and the number of complete genomes is rapidly increasing. Since existing predictors are based on various heuristics, it is important to develop a simple method with high prediction accuracies. In this paper, we propose a novel and general predicting method by combining techniques for sequence alignment and feature vectors based on amino acid composition. We implemented this method with support vector machines on plant data sets extracted from the TargetP database. Through fivefold cross validation tests, the obtained overall accuracies and average MCC were 0.9096 and 0.8655 respectively. We also applied our method to other datasets including that of WoLF PSORT. Although there is a predictor which uses the information of gene ontology and yields higher accuracy than ours, our accuracies are higher than existing predictors which use only sequence information. Since such information as gene ontology can be obtained only for known proteins, our predictor is considered to be useful for subcellular location prediction of newly-discovered proteins. Furthermore, the idea of combination of alignment and amino acid frequency is novel and general so that it may be applied to other problems in bioinformatics. Our method for plant is also implemented as a web-system and available on http://sunflower.kuicr.kyoto-u.ac.jp/~tamura/slpfa.html.
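The amino acid composition part of the feature vector mentioned above can be illustrated with a short sketch. It covers only the composition step, not the block-sequence alignment that the authors combine it with. The function name and the choice to ignore non-standard residue codes are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def amino_acid_composition(sequence: str) -> list:
    """Return the 20-dimensional frequency vector of standard residues."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:  # skip gaps and non-standard codes such as X
            counts[residue] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


vector = amino_acid_composition("AACD")
print(vector[:3])
```

A vector of this kind, computed for a whole sequence or for each block of it, is the sort of fixed-length input an SVM requires.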
Detecting double compression of audio signal

NASA Astrophysics Data System (ADS)

Yang, Rui; Shi, Yun Q.; Huang, Jiwu

2010-01-01

MP3 is the most popular audio format in daily life; for example, music downloaded from the Internet and files saved in digital recorders are often in MP3 format. However, low bitrate MP3s are often transcoded to high bitrate since high bitrate ones are of high commercial value. Also, audio recordings in digital recorders can be doctored easily by pervasive audio editing software. This paper presents two methods for the detection of double MP3 compression. The methods are essential for finding out fake-quality MP3 and audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed by the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this piece of work is the first one to detect double compression of audio signal.
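The feature described above, the distribution of first digits of quantized MDCT coefficients, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name is hypothetical, the input is assumed to be integer coefficients already extracted by an MP3 decoder, and zeros are skipped because they have no leading digit.

```python
def first_digit_distribution(coefficients) -> list:
    """Relative frequency of leading digits 1-9 among non-zero integers."""
    counts = [0] * 9
    for value in coefficients:
        magnitude = abs(int(value))
        if magnitude == 0:
            continue
        while magnitude >= 10:  # strip trailing digits to reach the first one
            magnitude //= 10
        counts[magnitude - 1] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 9
    return [n / total for n in counts]


# Toy input standing in for quantized MDCT coefficients.
print(first_digit_distribution([1, -12, 130, 25, 0]))
```

The resulting 9-element vector is the kind of feature that would then be passed to a classifier.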
Spacebased Estimation of Moisture Transport in Marine Atmosphere Using Support Vector Regression

NASA Technical Reports Server (NTRS)

Xie, Xiaosu; Liu, W. Timothy; Tang, Benyang

2007-01-01

An improved algorithm is developed based on support vector regression (SVR) to estimate horizontal water vapor transport integrated through the depth of the atmosphere ((Theta)) over the global ocean from observations of surface wind-stress vector by QuikSCAT, cloud drift wind vector derived from the Multi-angle Imaging SpectroRadiometer (MISR) and geostationary satellites, and precipitable water from the Special Sensor Microwave/Imager (SSM/I). The statistical relation is established between the input parameters (the surface wind stress, the 850 mb wind, the precipitable water, time and location) and the target data ((Theta) calculated from rawinsondes and reanalysis of numerical weather prediction model). The results are validated with independent daily rawinsonde observations, monthly mean reanalysis data, and through regional water balance. This study clearly demonstrates the improvement of (Theta) derived from satellite data using SVR over previous data sets based on linear regression and neural network. The SVR methodology reduces both mean bias and standard deviation compared with rawinsonde observations. It agrees better with observations from synoptic to seasonal time scales, and compares more favorably with the reanalysis data on seasonal variations. Only the SVR result can achieve the water balance over South America. The rationale of the advantage by SVR method and the impact of adding the upper level wind will also be discussed.
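For readers unfamiliar with SVR, its distinguishing ingredient in the standard formulation is the epsilon-insensitive loss, which ignores residuals smaller than a tolerance. The sketch below shows that loss only. The abstract does not say which SVR variant or tolerance was used, so the function name and the example values are assumptions.

```python
def epsilon_insensitive_loss(y_true: float, y_pred: float, epsilon: float) -> float:
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# A residual inside the tube costs nothing; one outside is penalized linearly.
print(epsilon_insensitive_loss(1.0, 1.05, 0.1))
print(epsilon_insensitive_loss(2.0, 1.0, 0.25))
```

Only training points whose residuals reach or exceed the tube boundary end up as support vectors, which is why the fitted model depends on a subset of the data.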
Gene Therapy Vectors with Enhanced Transfection Based on Hydrogels Modified with Affinity Peptides

PubMed Central

Shepard, Jaclyn A.; Wesson, Paul J.; Wang, Christine E.; Stevans, Alyson C.; Holland, Samantha J.; Shikanov, Ariella; Grzybowski, Bartosz A.; Shea, Lonnie D.

2011-01-01

Regenerative strategies for damaged tissue aim to present biochemical cues that recruit and direct progenitor cell migration and differentiation. Hydrogels capable of localized gene delivery are being developed to provide a support for tissue growth, and as a versatile method to induce the expression of inductive proteins; however, the duration, level, and localization of expression is often insufficient for regeneration. We thus investigated the modification of hydrogels with affinity peptides to enhance vector retention and increase transfection within the matrix. PEG hydrogels were modified with lysine-based repeats (K4, K8), which retained approximately 25% more vector than control peptides. Transfection increased 5- to 15-fold with K8 and K4 respectively, over the RDG control peptide. K8- and K4-modified hydrogels bound similar quantities of vector, yet the vector dissociation rate was reduced for K8, suggesting excessive binding that limited transfection. These hydrogels were subsequently applied to an in vitro co-culture model to induce NGF expression and promote neurite outgrowth. K4-modified hydrogels promoted maximal neurite outgrowth, likely due to retention of both the vector and the NGF. Thus, hydrogels modified with affinity peptides enhanced vector retention and increased gene delivery, and these hydrogels may provide a versatile scaffold for numerous regenerative medicine applications. PMID:21514659
Predicting protein-protein interactions by combing various sequence-derived features into the general form of Chou's Pseudo amino acid composition.

PubMed

Zhao, Xiao-Wei; Ma, Zhi-Qiang; Yin, Ming-Hao

2012-05-01

Knowledge of protein-protein interactions (PPIs) plays an important role in constructing protein interaction networks and understanding the general machineries of biological systems. In this study, a new method is proposed to predict PPIs using a comprehensive set of 930 features based only on sequence information; these features measure, from different aspects, the interactions between residues a certain distance apart in the protein sequences. To achieve better performance, principal component analysis (PCA) is first employed to obtain an optimized feature subset. Then, the resulting 67-dimensional feature vectors are fed to a Support Vector Machine (SVM). Experimental results on Drosophila melanogaster and Helicobacter pylori datasets show that our method is very promising for predicting PPIs and may at least be a useful supplementary tool to existing methods.

Degradation trend estimation of slewing bearing based on LSSVM model

NASA Astrophysics Data System (ADS)

Lu, Chao; Chen, Jie; Hong, Rongjing; Feng, Yang; Li, Yuanyuan

2016-08-01

A novel prediction method is proposed based on least squares support vector machine (LSSVM) to estimate the slewing bearing's degradation trend with small sample data. This method chooses the vibration signal which contains rich state information as the object of the study. Principal component analysis (PCA) was applied to fuse multi-feature vectors which could reflect the health state of slewing bearing, such as root mean square, kurtosis, wavelet energy entropy, and intrinsic mode function (IMF) energy. The degradation indicator fused by PCA can reflect the degradation more comprehensively and effectively. Then the degradation trend of slewing bearing was predicted by using the LSSVM model optimized by particle swarm optimization (PSO). The proposed method was demonstrated to be more accurate and effective by the whole life experiment of slewing bearing. Therefore, it can be applied in engineering practice.
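Two of the time-domain features named above, root mean square and kurtosis, are simple to compute from a vibration record. The sketch below is illustrative only: the function names are hypothetical, the kurtosis is the plain (non-excess) fourth standardized moment, and the abstract does not say which kurtosis convention the authors used.

```python
import math


def root_mean_square(signal) -> float:
    """Square root of the mean squared amplitude."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def kurtosis(signal) -> float:
    """Fourth central moment divided by the squared variance (non-excess)."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return m4 / (m2 * m2)


# Toy square-wave samples standing in for a vibration signal.
print(root_mean_square([3, -3, 3, -3]), kurtosis([1, -1, 1, -1]))
```

Features like these, computed per time window, would be the inputs that PCA fuses into a single degradation indicator.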
Target specific compound identification using a support vector machine.

PubMed

Plewczynski, Dariusz; von Grotthuss, Marcin; Spieser, Stephane A H; Rychlewski, Leszek; Wyrwicz, Lucjan S; Ginalski, Krzysztof; Koch, Uwe

2007-03-01

In many cases at the beginning of an HTS-campaign, some information about active molecules is already available. Often known active compounds (such as substrate analogues, natural products, inhibitors of a related protein or ligands published by a pharmaceutical company) are identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine applied for those compounds and used to classify a collection with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based on only the atom pairs (AP) two dimensional topological descriptors. The supervised support vector machine (SVM) method herein is trained on compounds from the MDL drug data report (MDDR) known to be active for specific protein target. For detailed analysis, five different biological targets were selected including cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV-reverse transcriptase and antagonists of the estrogen receptor. The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS-campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, the vHTS experiment typically contains a large percentage of false positives that should be verified by costly and time-consuming experimental follow-up assays. 
The subsequent use of our machine learning method was found to improve the speed (since the docking procedure was not required for all compounds from the database) and also the accuracy of the HTS hit lists (the enrichment factor).
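The evaluation quantities mentioned in this abstract (precision, recall and the enrichment factor) can be written out explicitly. The sketch uses the usual textbook definitions; the function names and the example counts are hypothetical and are not taken from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple:
    """Precision = TP / (TP + FP); recall (sensitivity) = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


def enrichment_factor(hits_selected: int, selected: int,
                      actives_total: int, library_size: int) -> float:
    """Hit rate in the selected subset divided by the hit rate in the library."""
    if selected == 0 or actives_total == 0:
        raise ValueError("selection and active set must be non-empty")
    return (hits_selected * library_size) / (selected * actives_total)


# Hypothetical screen: 10 actives in 1000 compounds, 10 selected, 8 of them active.
print(precision_recall(tp=8, fp=2, fn=2))
print(enrichment_factor(8, 10, 10, 1000))
```

An enrichment factor above 1 means the selected subset contains a higher proportion of actives than the library as a whole.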
Phylogeny of the Genus Flavivirus

PubMed Central

Kuno, Goro; Chang, Gwong-Jen J.; Tsuchiya, K. Richard; Karabatsos, Nick; Cropp, C. Bruce

1998-01-01

We undertook a comprehensive phylogenetic study to establish the genetic relationship among the viruses of the genus Flavivirus and to compare the classification based on molecular phylogeny with the existing serologic method. By using a combination of quantitative definitions (bootstrap support level and the pairwise nucleotide sequence identity), the viruses could be classified into clusters, clades, and species. Our phylogenetic study revealed for the first time that from the putative ancestor two branches, non-vector and vector-borne virus clusters, evolved and from the latter cluster emerged tick-borne and mosquito-borne virus clusters. Provided that the theory of arthropod association being an acquired trait was correct, pairwise nucleotide sequence identity among these three clusters provided supporting data for a possibility that the non-vector cluster evolved first, followed by the separation of tick-borne and mosquito-borne virus clusters in that order. Clades established in our study correlated significantly with existing antigenic complexes. We also resolved many of the past taxonomic problems by establishing phylogenetic relationships of the antigenically unclassified viruses with the well-established viruses and by identifying synonymous viruses. PMID:9420202
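One of the two quantitative criteria named above, pairwise nucleotide sequence identity, can be illustrated on a pair of already aligned sequences. This is a minimal sketch, not the authors' procedure: the function name is hypothetical, and the choice to exclude gapped columns from the denominator is an assumption, since conventions differ.

```python
def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: 4 matches out of 5 ungapped columns.
print(pairwise_identity("ACGT-A", "ACGAGA"))
```

Identity values of this kind, computed for every pair of viruses, can then be compared against thresholds to group sequences.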
Phylogeny of the genus Flavivirus.

PubMed

Kuno, G; Chang, G J; Tsuchiya, K R; Karabatsos, N; Cropp, C B

1998-01-01

We undertook a comprehensive phylogenetic study to establish the genetic relationship among the viruses of the genus Flavivirus and to compare the classification based on molecular phylogeny with the existing serologic method. By using a combination of quantitative definitions (bootstrap support level and the pairwise nucleotide sequence identity), the viruses could be classified into clusters, clades, and species. Our phylogenetic study revealed for the first time that from the putative ancestor two branches, non-vector and vector-borne virus clusters, evolved and from the latter cluster emerged tick-borne and mosquito-borne virus clusters. Provided that the theory of arthropod association being an acquired trait was correct, pairwise nucleotide sequence identity among these three clusters provided supporting data for a possibility that the non-vector cluster evolved first, followed by the separation of tick-borne and mosquito-borne virus clusters in that order. Clades established in our study correlated significantly with existing antigenic complexes. We also resolved many of the past taxonomic problems by establishing phylogenetic relationships of the antigenically unclassified viruses with the well-established viruses and by identifying synonymous viruses.
Discrimination of Active and Weakly Active Human BACE1 Inhibitors Using Self-Organizing Map and Support Vector Machine.

PubMed

Li, Hang; Wang, Maolin; Gong, Ya-Nan; Yan, Aixia

2016-01-01

β-secretase (BACE1) is an aspartyl protease, which is considered as a novel vital target in Alzheimer's disease therapy. We collected a data set of 294 BACE1 inhibitors, and built six classification models to discriminate active and weakly active inhibitors using Kohonen's Self-Organizing Map (SOM) method and Support Vector Machine (SVM) method. Each molecular descriptor was calculated using the program ADRIANA.Code. We adopted two different methods: random method and Self-Organizing Map method, for training/test set split. The descriptors were selected by F-score and stepwise linear regression analysis. The best SVM model Model2C has a good prediction performance on test set with prediction accuracy, sensitivity (SE), specificity (SP) and Matthews correlation coefficient (MCC) of 89.02%, 90%, 88%, 0.78, respectively. Model 1A is the best SOM model, whose accuracy and MCC of the test set were 94.57% and 0.98, respectively. The lone pair electronegativity and polarizability related descriptors importantly contributed to bioactivity of BACE1 inhibitor. The Extended-Connectivity Finger-Prints_4 (ECFP_4) analysis found some vitally key substructural features, which could be helpful for further drug design research. The SOM and SVM models built in this study can be obtained from the authors by email or other contacts.
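The four figures of merit reported above (accuracy, SE, SP and MCC) all derive from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and do not reproduce the study's test set.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> tuple:
    """Return (accuracy, sensitivity, specificity, MCC) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, sensitivity, specificity, mcc


# Hypothetical counts: 3 TP, 3 TN, 1 FP, 1 FN.
print(classification_metrics(3, 3, 1, 1))
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the two classes differ in size.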
Plant X-tender: An extension of the AssemblX system for the assembly and expression of multigene constructs in plants.

PubMed

Lukan, Tjaša; Machens, Fabian; Coll, Anna; Baebler, Špela; Messerschmidt, Katrin; Gruden, Kristina

2018-01-01

Cloning multiple DNA fragments for delivery of several genes of interest into the plant genome is one of the main technological challenges in plant synthetic biology. Despite several modular assembly methods developed in recent years, the plant biotechnology community has not widely adopted them yet, probably due to the lack of appropriate vectors and software tools. Here we present Plant X-tender, an extension of the highly efficient, scar-free and sequence-independent multigene assembly strategy AssemblX, based on overlap-dependent cloning methods and rare-cutting restriction enzymes. Plant X-tender consists of a set of plant expression vectors and the protocols for most efficient cloning into the novel vector set needed for plant expression and thus introduces advantages of AssemblX into plant synthetic biology. The novel vector set covers different backbones and selection markers to allow full design flexibility. We have included ccdB counterselection, thereby allowing the transfer of multigene constructs into the novel vector set in a straightforward and highly efficient way. Vectors are available as empty backbones and are fully flexible regarding the orientation of expression cassettes and addition of linkers between them, if required. We optimised the assembly and subcloning protocol by testing different scar-less assembly approaches: the noncommercial SLiCE and TAR methods and the commercial Gibson assembly and NEBuilder HiFi DNA assembly kits. Plant X-tender was applicable even in combination with low-efficiency homemade chemically competent or electrocompetent Escherichia coli. We have further validated the developed procedure for plant protein expression by cloning two cassettes into the newly developed vectors and subsequently transferred them to Nicotiana benthamiana in a transient expression setup. Thereby we show that multigene constructs can be delivered into plant cells in a streamlined and highly efficient way.
Our results will support faster introduction of synthetic biology into plant science.
Plant X-tender: An extension of the AssemblX system for the assembly and expression of multigene constructs in plants

PubMed Central

Lukan, Tjaša; Machens, Fabian; Coll, Anna; Baebler, Špela; Messerschmidt, Katrin; Gruden, Kristina

2018-01-01

Cloning multiple DNA fragments for delivery of several genes of interest into the plant genome is one of the main technological challenges in plant synthetic biology. Despite several modular assembly methods developed in recent years, the plant biotechnology community has not widely adopted them yet, probably due to the lack of appropriate vectors and software tools. Here we present Plant X-tender, an extension of the highly efficient, scar-free and sequence-independent multigene assembly strategy AssemblX, based on overlap-dependent cloning methods and rare-cutting restriction enzymes. Plant X-tender consists of a set of plant expression vectors and the protocols for the most efficient cloning into the novel vector set needed for plant expression, and thus introduces the advantages of AssemblX into plant synthetic biology. The novel vector set covers different backbones and selection markers to allow full design flexibility. We have included ccdB counterselection, thereby allowing the transfer of multigene constructs into the novel vector set in a straightforward and highly efficient way. Vectors are available as empty backbones and are fully flexible regarding the orientation of expression cassettes and addition of linkers between them, if required. We optimised the assembly and subcloning protocol by testing different scar-less assembly approaches: the noncommercial SLiCE and TAR methods and the commercial Gibson assembly and NEBuilder HiFi DNA assembly kits. Plant X-tender was applicable even in combination with low-efficiency homemade chemically competent or electrocompetent Escherichia coli. We have further validated the developed procedure for plant protein expression by cloning two cassettes into the newly developed vectors and subsequently transferring them to Nicotiana benthamiana in a transient expression setup. Thereby we show that multigene constructs can be delivered into plant cells in a streamlined and highly efficient way. Our results will support faster introduction of synthetic biology into plant science. PMID:29300787
A Hyperspectral Image Classification Method Using ISOMAP and RVM

NASA Astrophysics Data System (ADS)

Chang, H.; Wang, T.; Fang, H.; Su, Y.

2018-04-01

Classification is one of the most significant applications of hyperspectral image processing and of remote sensing in general. Though various algorithms have been proposed to implement and improve this application, traditional classification methods still have drawbacks. Further investigation is therefore needed into aspects such as dimension reduction, data mining, and rational use of spatial information. In this paper, we used a widely utilized global manifold learning approach, isometric feature mapping (ISOMAP), to address the intrinsic nonlinearities of hyperspectral images for dimension reduction. Because Euclidean distance is ill-suited to spectral measurement, we substituted the spectral angle (SA) when constructing the neighbourhood graph. Then, relevance vector machines (RVM) were used for classification instead of support vector machines (SVM) for simplicity, generalization and sparsity, so that a probability result could be obtained rather than a less convincing binary result. Moreover, to take the spatial information of the hyperspectral image into account, we employed a spatial vector formed by the ratios of the different classes around each pixel. Finally, we combined the probability results and spatial factors with a criterion to decide the final classification result. To verify the proposed method, we ran multiple experiments on standard hyperspectral images and compared the results with those of other methods. The results and the different evaluation indexes illustrated the effectiveness of our method.
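The spectral angle mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' code; the function name `spectral_angle` and the three-band pixel values are hypothetical, and only the standard definition (the angle between two spectra treated as vectors) is assumed.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)

# Hypothetical three-band pixels: `brighter` is `pixel` scaled by 2
# (same spectral shape, different brightness); `other` has a different shape.
pixel = [0.2, 0.4, 0.6]
brighter = [0.4, 0.8, 1.2]
other = [0.6, 0.4, 0.2]
angle_same = spectral_angle(pixel, brighter)   # close to zero
angle_diff = spectral_angle(pixel, other)      # clearly larger
```

Because the angle ignores overall scaling, two pixels with the same spectral shape but different brightness are treated as near neighbours, which Euclidean distance would not do.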
Fault Diagnosis for Rolling Bearings under Variable Conditions Based on Visual Cognition

PubMed Central

Cheng, Yujie; Zhou, Bo; Lu, Chen; Yang, Chao

2017-01-01

Fault diagnosis for rolling bearings has attracted increasing attention in recent years. However, few studies have focused on fault diagnosis for rolling bearings under variable conditions. This paper introduces a fault diagnosis method for rolling bearings under variable conditions based on visual cognition. The proposed method includes the following steps. First, the vibration signal data are transformed into a recurrence plot (RP), which is a two-dimensional image. Then, inspired by the visual invariance characteristic of the human visual system (HVS), we utilize speeded-up robust features (SURF) to extract fault features from the two-dimensional RP and generate a 64-dimensional feature vector, which is invariant to image translation, rotation, scaling variation, etc. Third, based on the manifold perception characteristic of HVS, isometric mapping, a manifold learning method that can reflect the intrinsic manifold embedded in the high-dimensional space, is employed to obtain a low-dimensional feature vector. Finally, a classical classification method, support vector machine, is utilized to realize fault diagnosis. Verification data were collected from Case Western Reserve University Bearing Data Center, and the experimental result indicates that the proposed fault diagnosis method based on visual cognition is highly effective for rolling bearings under variable conditions, thus providing a promising approach from the cognitive computing field. PMID:28772943
Coquillettidia (Culicidae, Diptera) mosquitoes are natural vectors of avian malaria in Africa

PubMed Central

2009-01-01

Background The mosquito vectors of Plasmodium spp. have largely been overlooked in studies of ecology and evolution of avian malaria and other vertebrates in wildlife. Methods Plasmodium DNA from wild-caught Coquillettidia spp. collected from lowland forests in Cameroon was isolated and sequenced using nested PCR. Female Coquillettidia aurites were also dissected and salivary glands were isolated and microscopically examined for the presence of sporozoites. Results In total, 33% (85/256) of mosquito pools tested positive for avian Plasmodium spp., harbouring at least eight distinct parasite lineages. Sporozoites of Plasmodium spp. were recorded in salivary glands of C. aurites, supporting the PCR data that the parasites complete development in these mosquitoes. Results suggest C. aurites, Coquillettidia pseudoconopas and Coquillettidia metallica as new and important vectors of avian malaria in Africa. All parasite lineages recovered clustered with parasites formerly identified from several bird species and suggest the vectors' capability of infecting birds from different families. Conclusion Identifying the major vectors of avian Plasmodium spp. will assist in understanding the epizootiology of avian malaria, including differences in this disease distribution between pristine and disturbed landscapes. PMID:19664282
Geographical distribution of reference value of aging people's left ventricular end systolic diameter based on the support vector regression.

PubMed

Han, Xiao; Ge, Miao; Dong, Jie; Xue, Ranying; Wang, Zixuan; He, Jinwei

2014-09-01

The aim of this paper is to analyze the geographical distribution of the reference value of left ventricular end systolic diameter (LVDs) in aging people, and to provide a scientific basis for clinical examination. The study focuses on the relationship between the LVDs reference value of aging people and 14 geographical factors, using 2495 LVDs samples from aging people in 71 units of China, including 1620 men and 875 women. Moran's I index was used to establish the relationship between the reference values and spatial geographical factors; 5 geographical factors significantly correlated with LVDs were extracted to build the support vector regression model; a paired-sample t test was used to check the consistency between predicted and measured values; finally, a distribution map was produced by disjunctive kriging interpolation and the three-dimensional trend of the normal reference value was fitted. The correlation between the extracted geographical factors and the LVDs reference value is quite significant. The 5 factors are latitude, annual mean air temperature, annual mean relative humidity, annual precipitation amount, and annual range of air temperature. The predicted and observed values are in good agreement, with no significant difference at the 95% confidence level. The overall trend of predicted values increases from west to east, and first increases and then decreases from north to south. If geographical values are obtained for a region, the LVDs reference value of aging people in that region can be obtained using the support vector regression model. It could be more scientific to formulate the different distributions on the basis of synthesizing the physiological and the geographical factors.
- Use Moran's index to analyze the spatial correlation. - Choose support vector machine to build a model that overcomes the complexity of variables. - Test normal distribution of predicted data to guarantee the interpolation results. - Use trend analysis to explain the changes of reference value clearly. Copyright © 2014 Elsevier Inc. All rights reserved.
The Design of a Templated C++ Small Vector Class for Numerical Computing

NASA Technical Reports Server (NTRS)

Moran, Patrick J.

2000-01-01

We describe the design and implementation of a templated C++ class for vectors. The vector class is templated both for vector length and vector component type; the vector length is fixed at template instantiation time. The vector implementation is such that for a vector of N components of type T, the total number of bytes required by the vector is equal to N * sizeof(T), where sizeof is the built-in C operator. The property of having a size no bigger than that required by the components themselves is key in many numerical computing applications, where one may allocate very large arrays of small, fixed-length vectors. In addition to the design trade-offs motivating our fixed-length vector design choice, we review some of the C++ template features essential to an efficient, succinct implementation. In particular, we highlight some of the standard C++ features, such as partial template specialization, that are not supported by all compilers currently. This report provides an inventory listing the relevant support currently provided by some key compilers, as well as test code one can use to verify compiler capabilities.
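The size property described in the abstract above (N components of type T occupying exactly N * sizeof(T) bytes) can be checked with a rough Python analogue built on the standard `ctypes` module. This is only an illustration of the property, not the report's C++ class; the helper `fixed_vector_type` and the name `Vec3d` are hypothetical.

```python
import ctypes

def fixed_vector_type(component_type, length):
    """Return a ctypes array type holding `length` items of `component_type`."""
    return component_type * length

# A fixed-length vector of three doubles; the length is part of the type.
Vec3d = fixed_vector_type(ctypes.c_double, 3)
v = Vec3d(1.0, 2.0, 3.0)

# A large array of small vectors, stored back to back with no per-vector overhead.
packed = (Vec3d * 1000)()

vector_bytes = ctypes.sizeof(Vec3d)
component_bytes = ctypes.sizeof(ctypes.c_double)
packed_bytes = ctypes.sizeof(packed)
```

Here `vector_bytes` equals `3 * component_bytes` and `packed_bytes` equals `1000 * vector_bytes`, which is the storage guarantee that makes very large arrays of small vectors affordable.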
An Efficient Wait-Free Vector

DOE PAGES

Feldman, Steven; Valera-Leon, Carlos; Dechev, Damian

2016-03-01

The vector is a fundamental data structure, which provides constant-time access to a dynamically-resizable range of elements. Currently, there exist no wait-free vectors. The only non-blocking version supports only a subset of the sequential vector API and exhibits significant synchronization overhead caused by supporting opposing operations. Since many applications operate in phases of execution, where in each phase only a subset of operations is used, this overhead is unnecessary for the majority of the application. To address the limitations of the non-blocking version, we present a new design that is wait-free, supports more of the operations provided by the sequential vector, and provides alternative implementations of key operations. These alternatives allow the developer to balance the performance and functionality of the vector as requirements change throughout execution. Compared to the known non-blocking version and the concurrent vector found in Intel’s TBB library, our design outperforms or provides comparable performance in the majority of tested scenarios. Over all tested scenarios, the presented design performs an average of 4.97 times more operations per second than the non-blocking vector and 1.54 times more than the TBB vector. In a scenario designed to simulate the filling of a vector, performance improvement increases to 13.38 and 1.16 times. This work presents the first ABA-free non-blocking vector. Finally, unlike the other non-blocking approach, all operations are wait-free and bounds-checked and elements are stored contiguously in memory.
River flow prediction using hybrid models of support vector regression with the wavelet transform, singular spectrum analysis and chaotic approach

NASA Astrophysics Data System (ADS)

Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal

2018-06-01

Prediction of water amount that will enter the reservoirs in the following month is of vital importance especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow with hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which then has been applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT convolutes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods for producing the input matrix for the SVR proved successful, while the SVR-WT combination resulted in the highest coefficient of determination and the lowest mean absolute error.
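As a rough illustration of how a single time series can be turned into an input matrix for one-month-ahead regression, the sketch below builds delay-coordinate rows, one common way to form a phase space from a scalar series. The abstract does not give the embedding dimension or delay, so `dim`, `tau`, the function name `delay_embed` and the flow values are all assumptions.

```python
def delay_embed(series, dim, tau=1):
    """Build (inputs, targets) for one-step-ahead prediction.

    Each input row holds `dim` past values spaced `tau` steps apart;
    the target is the value that follows the last entry of the row.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau  # distance from the first to the last entry of a row
    inputs, targets = [], []
    for start in range(len(series) - span - 1):
        inputs.append([series[start + k * tau] for k in range(dim)])
        targets.append(series[start + span + 1])
    return inputs, targets

# Made-up monthly flows, for illustration only.
flows = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0]
X, y = delay_embed(flows, dim=3, tau=1)
```

Each row of `X` paired with the matching entry of `y` is one training example; a regression model such as SVR would then be fitted on these pairs.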
Antepartum fetal heart rate feature extraction and classification using empirical mode decomposition and support vector machine

PubMed Central

2011-01-01

Background Cardiotocography (CTG) is the most widely used tool for fetal surveillance. The visual analysis of fetal heart rate (FHR) traces largely depends on the expertise and experience of the clinician involved. Several approaches have been proposed for the effective interpretation of FHR. In this paper, a new approach for FHR feature extraction based on empirical mode decomposition (EMD) is proposed, which was used along with support vector machine (SVM) for the classification of FHR recordings as 'normal' or 'at risk'. Methods The FHR were recorded from 15 subjects at a sampling rate of 4 Hz and a dataset consisting of 90 randomly selected records of 20 minutes duration was formed from these. All records were labelled as 'normal' or 'at risk' by two experienced obstetricians. A training set was formed by 60 records, the remaining 30 left as the testing set. The standard deviations of the EMD components are input as features to a support vector machine (SVM) to classify FHR samples. Results For the training set, a five-fold cross validation test resulted in an accuracy of 86% whereas the overall geometric mean of sensitivity and specificity was 94.8%. The Kappa value for the training set was .923. Application of the proposed method to the testing set (30 records) resulted in a geometric mean of 81.5%. The Kappa value for the testing set was .684. Conclusions Based on the overall performance of the system it can be stated that the proposed methodology is a promising new approach for the feature extraction and classification of FHR signals. PMID:21244712
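The evaluation measures quoted in the abstract above (sensitivity, specificity, their geometric mean, and Kappa) can be computed from a 2×2 confusion matrix as sketched below. The counts are hypothetical and are not taken from the study; the function name `binary_metrics` is also an assumption, and Kappa is computed here as Cohen's kappa between predicted and reference labels.

```python
import math

def binary_metrics(tp, fn, fp, tn):
    """Standard measures from a 2x2 confusion matrix (both classes must be present)."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / total
    g_mean = math.sqrt(sensitivity * specificity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((fp + tn) / total) * ((fn + tn) / total)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "g_mean": g_mean,
        "kappa": kappa,
    }

# Hypothetical counts for a 30-record test set ('at risk' = positive class).
metrics = binary_metrics(tp=12, fn=3, fp=2, tn=13)
```

The geometric mean penalises a classifier that does well on one class at the expense of the other, which is why it is often reported alongside plain accuracy.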
Scaling Support Vector Machines On Modern HPC Platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

You, Yang; Fu, Haohuan; Song, Shuaiwen

2015-02-01

We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.
Joint Data Management for MOVINT Data-to-Decision Making

DTIC Science & Technology

2011-07-01

flux tensor, aligned motion history images, and related approaches have been shown to be versatile approaches [12, 16, 17, 18]. Scaling these...methods include voting, neural networks, fuzzy logic, neuro-dynamic programming, support vector machines, Bayesian and Dempster-Shafer methods. One way...Information Fusion, 2010. [16] F. Bunyak, K. Palaniappan, S. K. Nath, G. Seetharaman, “Flux tensor constrained geodesic active contours with sensor fusion
A new discriminative kernel from probabilistic models.

PubMed

Tsuda, Koji; Kawanabe, Motoaki; Rätsch, Gunnar; Sonnenburg, Sören; Müller, Klaus-Robert

2002-10-01

Recently, Jaakkola and Haussler (1999) proposed a method for constructing kernel functions from probabilistic models. Their so-called Fisher kernel has been combined with discriminative classifiers such as support vector machines and applied successfully in, for example, DNA and protein analysis. Whereas the Fisher kernel is calculated from the marginal log-likelihood, we propose the TOP kernel derived from tangent vectors of posterior log-odds. Furthermore, we develop a theoretical framework on feature extractors from probabilistic models and use it for analyzing the TOP kernel. In experiments, our new discriminative TOP kernel compares favorably to the Fisher kernel.
Extraction and Analysis of Mega Cities’ Impervious Surface on Pixel-based and Object-oriented Support Vector Machine Classification Technology: A case of Bombay

NASA Astrophysics Data System (ADS)

Yu, S. S.; Sun, Z. C.; Sun, L.; Wu, M. F.

2017-02-01

The object of this paper is to study impervious surface extraction methods using remote sensing imagery and to monitor the spatiotemporal change patterns of mega cities. The megacity Bombay was selected as the study area. Firstly, pixel-based and object-oriented support vector machine (SVM) classification methods were used to produce the land use/land cover (LULC) products of Bombay in 2010. The overall accuracy (OA) and overall Kappa (OK) of the pixel-based method were 94.97% and 0.96 with a running time of 78 minutes; the OA and OK of the object-oriented method were 93.72% and 0.94 with a running time of only 17 s. Additionally, the OA and OK of the object-oriented method after post-classification improved to 95.8% and 0.94. Then, the dynamic impervious surfaces of Bombay in the period 1973-2015 were extracted and the urbanization pattern of Bombay was analysed. The results showed that both SVM classification methods could accomplish the impervious surface extraction, but the object-oriented method is likely the better choice. Bombay urbanized rapidly during the past 42 years, implying dramatic urban sprawl of mega cities in the developing countries along the One Belt and One Road (OBOR).
Sodium Chloride Enhances Recombinant Adeno-Associated Virus Production in a Serum-Free Suspension Manufacturing Platform Using the Herpes Simplex Virus System

PubMed Central

Adamson-Small, Laura; Potter, Mark; Byrne, Barry J.; Clément, Nathalie

2017-01-01

The increase in effective treatments using recombinant adeno-associated viral (rAAV) vectors has underscored the importance of scalable, high-yield manufacturing methods. Previous work from this group reported the use of recombinant herpes simplex virus type 1 (rHSV) vectors to produce rAAV in adherent HEK293 cells, demonstrating the capacity of this system and quality of the product generated. Here we report production and optimization of rAAV using the rHSV system in suspension HEK293 cells (Expi293F) grown in serum and animal component-free medium. Through adjustment of salt concentration in the medium and optimization of infection conditions, titers greater than 1 × 10^14 vector genomes per liter (VG/liter) were observed in purified rAAV stocks produced in Expi293F cells. Furthermore, this system allowed for high-titer production of multiple rAAV serotypes (2, 5, and 9) as well as multiple transgenes (green fluorescent protein and acid α-glucosidase). A proportional increase in vector production was observed as this method was scaled, with a final 3-liter shaker flask production yielding an excess of 1 × 10^15 VG in crude cell harvests and an average of 3.5 × 10^14 total VG of purified rAAV9 material, resulting in greater than 1 × 10^5 VG/cell. These results support the use of this rHSV-based rAAV production method for large-scale preclinical and clinical vector production. PMID:28117600

Diffuse Interface Methods for Multiclass Segmentation of High-Dimensional Data

DTIC Science & Technology

2014-03-04

handwritten digits, 1998. http://yann.lecun.com/exdb/mnist/. [19] S. Nene, S. Nayar, H. Murase, Columbia Object Image Library (COIL-100), Technical Report... recognition on smartphones using a multiclass hardware-friendly support vector machine, in: Ambient Assisted Living and Home Care, Springer, 2012, pp. 216–223.
SVM-based feature extraction and classification of aflatoxin contaminated corn using fluorescence hyperspectral data

USDA-ARS's Scientific Manuscript database

Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...
Granular support vector machines with association rules mining for protein homology prediction.

PubMed

Tang, Yuchun; Jin, Bo; Zhang, Yan-Qing

2005-01-01

Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such a complex classification problem is common in medical or biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue for mining knowledge to accurately predict/classify unseen new samples and to effectively support human experts in making correct decisions. A new learning model called granular support vector machines (GSVM) is proposed based on our previous work. GSVM systematically and formally combines the principles from statistical learning theory and granular computing theory and thus provides an interesting new mechanism to address complex classification problems. It works by building a sequence of information granules and then building support vector machines (SVM) in some of these information granules on demand. A good granulation method to find suitable granules is crucial for modeling a GSVM with good performance. In this paper, we also propose an association rules-based granulation method. For the granules induced by association rules with high enough confidence and significant support, we leave them as they are because of their high "purity" and significant effect on simplifying the classification task. For every other granule, an SVM is modeled to discriminate the corresponding data. In this way, a complex classification problem is divided into multiple smaller problems so that the learning task is simplified. The proposed algorithm, here named GSVM-AR, is compared with SVM on the KDDCUP04 protein homology prediction data. The experimental results show that finding the splitting hyperplane is not a trivial task (we should be careful to select the association rules to avoid overfitting) and GSVM-AR does show significant improvement compared to building one single SVM in the whole feature space.
Another advantage is that GSVM-AR is practical to use because it is easy to implement. More importantly and more interestingly, GSVM provides a new mechanism to address complex classification problems.
Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder.

PubMed

Mwangi, Benson; Ebmeier, Klaus P; Matthews, Keith; Steele, J Douglas

2012-05-01

Quantitative abnormalities of brain structure in patients with major depressive disorder have been reported at a group level for decades. However, these structural differences appear subtle in comparison with conventional radiologically defined abnormalities, with considerable inter-subject variability. Consequently, it has not been possible to readily identify scans from patients with major depressive disorder at an individual level. Recently, machine learning techniques such as relevance vector machines and support vector machines have been applied to predictive classification of individual scans with variable success. Here we describe a novel hybrid method, which combines machine learning with feature selection and characterization, with the latter aimed at maximizing the accuracy of machine learning prediction. The method was tested using a multi-centre dataset of T(1)-weighted 'structural' scans. A total of 62 patients with major depressive disorder and matched controls were recruited from referred secondary care clinical populations in Aberdeen and Edinburgh, UK. The generalization ability and predictive accuracy of the classifiers was tested using data left out of the training process. High prediction accuracy was achieved (~90%). While feature selection was important for maximizing high predictive accuracy with machine learning, feature characterization contributed only a modest improvement to relevance vector machine-based prediction (~5%). Notably, while the only information provided for training the classifiers was T(1)-weighted scans plus a categorical label (major depressive disorder versus controls), both relevance vector machine and support vector machine 'weighting factors' (used for making predictions) correlated strongly with subjective ratings of illness severity. 
These results indicate that machine learning techniques have the potential to inform clinical practice and research, as they can make accurate predictions about brain scan data from individual subjects. Furthermore, machine learning weighting factors may reflect an objective biomarker of major depressive disorder illness severity, based on abnormalities of brain structure.
Application of the support vector machine to predict subclinical mastitis in dairy cattle.

PubMed

Mammadova, Nazira; Keskin, Ismail

2013-01-01

This study presented a potentially useful alternative approach to ascertain the presence of subclinical and clinical mastitis in dairy cows using support vector machine (SVM) techniques. The proposed method detected mastitis in a cross-sectional representative sample of Holstein dairy cattle milked using an automatic milking system. The study used such suspected indicators of mastitis as lactation rank, milk yield, electrical conductivity, average milking duration, and control season as input data. The output variable was somatic cell counts obtained from milk samples collected monthly throughout the 15 months of the control period. Cattle were judged to be healthy or infected based on those somatic cell counts. This study undertook a detailed scrutiny of the SVM methodology, constructing and examining a model which showed 89% sensitivity, 92% specificity, and 50% error in mastitis detection.
Soft-sensing model of temperature for aluminum reduction cell on improved twin support vector regression

NASA Astrophysics Data System (ADS)

Li, Tao

2018-06-01

The complexity of the aluminum electrolysis process makes the temperature of aluminum reduction cells hard to measure directly, yet temperature is central to the control of aluminum production. To solve this problem, using practice data from an aluminum plant, this paper presents a soft-sensing model of temperature for the aluminum electrolysis process based on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fitting risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures adherence to the structural risk minimization principle and lower computational complexity. Finally, the model predicts the temperature by ITSVR, with some other parameters as auxiliary variables. The simulation results show that the soft-sensing model based on ITSVR takes less time and generalizes better.
Arbitrary norm support vector machines.

PubMed

Huang, Kaizhu; Zheng, Danian; King, Irwin; Lyu, Michael R

2009-02-01

Support vector machines (SVM) are state-of-the-art classifiers. Typically L2-norm or L1-norm is adopted as a regularization term in SVMs, while other norm-based SVMs, for example, the L0-norm SVM or even the L(infinity)-norm SVM, are rarely seen in the literature. The major reason is that L0-norm describes a discontinuous and nonconvex term, leading to a combinatorially NP-hard optimization problem. In this letter, motivated by Bayesian learning, we propose a novel framework that can implement arbitrary norm-based SVMs in polynomial time. One significant feature of this framework is that only a sequence of sequential minimal optimization problems needs to be solved, thus making it practical in many real applications. The proposed framework is important in the sense that Bayesian priors can be efficiently plugged into most learning methods without knowing the explicit form. Hence, this builds a connection between Bayesian learning and the kernel machines. We derive the theoretical framework, demonstrate how our approach works on the L0-norm SVM as a typical example, and perform a series of experiments to validate its advantages. Experimental results on nine benchmark data sets are very encouraging. The implemented L0-norm is competitive with or even better than the standard L2-norm SVM in terms of accuracy but with a reduced number of support vectors, -9.46% of the number on average. When compared with another sparse model, the relevance vector machine, our proposed algorithm also demonstrates better sparse properties with a training speed over seven times faster.
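For readers unfamiliar with the norms contrasted in the abstract above, the sketch below computes the L0, L1, L2 and L∞ norms of a weight vector using their standard definitions. The function name `norms` and the example vector are hypothetical; nothing here reproduces the paper's framework.

```python
import math

def norms(w):
    """Return the L0, L1, L2 and L-infinity norms of a weight vector."""
    return {
        "l0": sum(1 for x in w if x != 0),        # number of nonzero entries
        "l1": sum(abs(x) for x in w),             # sum of absolute values
        "l2": math.sqrt(sum(x * x for x in w)),   # Euclidean length
        "linf": max(abs(x) for x in w),           # largest absolute entry
    }

# Hypothetical sparse weight vector.
result = norms([0.0, 3.0, -4.0, 0.0])
```

The L0 value only changes when an entry becomes exactly zero or nonzero, so it is a step function of the weights; this is the discontinuity the abstract refers to when it calls the L0-norm problem combinatorial.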
SFM: A novel sequence-based fusion method for disease genes identification and prioritization.

PubMed

Yousef, Abdulaziz; Moghadam Charkari, Nasrollah

2015-10-21

The identification of disease genes from the human genome is of great importance for improving the diagnosis and treatment of disease. Several machine learning methods have been introduced to identify disease genes. However, these methods mostly differ in the prior knowledge used to construct the feature vector for each instance (gene), the ways of selecting negative data (non-disease genes), for which there is no investigational approach, and the classification methods used to make the final decision. In this work, a novel sequence-based fusion method (SFM) is proposed to identify disease genes. Unlike existing methods, instead of using noisy and incomplete prior knowledge, the amino acid sequences of the proteins, which are universal data, are used to represent the genes (proteins) as four different feature vectors. To select more likely negative data from candidate genes, the intersection of four negative sets generated using a distance approach is considered. Then, a Decision Tree (C4.5) is applied as a fusion method to combine the results of four independent state-of-the-art predictors based on the support vector machine (SVM) algorithm and to make the final decision. The experimental results of the proposed method have been evaluated by some standard measures. The results indicate precision, recall and F-measure of 82.6%, 85.6% and 84%, respectively. These results confirm the efficiency and validity of the proposed method. Copyright © 2015 Elsevier Ltd. All rights reserved.
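The three figures reported at the end of the abstract above are mutually consistent, which can be checked with the usual definition of the F-measure as the harmonic mean of precision and recall. The sketch below assumes that definition (the balanced F1 score); the function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as quoted in the abstract, expressed as fractions.
f1 = f_measure(0.826, 0.856)   # about 0.841, i.e. roughly 84%
```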
Ecological footprint model using the support vector machine technique.

PubMed

Ma, Haibo; Chang, Wenjuan; Cui, Guangbai

2012-01-01

The per capita ecological footprint (EF) is one of the most widely recognized measures of environmental sustainability. It aims to quantify the Earth's biological resources required to support human activity. In this paper, we summarize relevant previous literature, and present five factors that influence per capita EF. These factors are: national gross domestic product (GDP), urbanization (independent of economic development), distribution of income (measured by the Gini coefficient), export dependence (measured by the percentage of exports to total GDP), and service intensity (measured by the percentage of service to total GDP). A new ecological footprint model based on a support vector machine (SVM), a machine-learning method based on the structural risk minimization principle from statistical learning theory, was constructed to calculate the per capita EF of 24 nations using data from 123 nations. The calculation accuracy was measured by average absolute error and average relative error, which were 0.004883 and 0.351078%, respectively. Our results demonstrate that the EF model based on SVM has good calculation performance.
Predicting asthma exacerbations using artificial intelligence.

PubMed

Finkelstein, Joseph; Wood, Jeffrey

2013-01-01

Modern telemonitoring systems identify a serious patient deterioration only after it has already occurred. It would be much more beneficial if an upcoming clinical deterioration were identified ahead of time, before the patient actually experiences it. The goal of this study was to assess artificial intelligence approaches that could potentially be used in telemonitoring systems for advance prediction of changes in disease severity before they actually occur. The study dataset consisted of 7001 records based on daily self-reports submitted by 26 adult asthma patients during home telemonitoring. Two classification algorithms were employed for building predictive models: a naïve Bayesian classifier and support vector machines. Using a 7-day window, a support vector machine was able to predict an asthma exacerbation occurring on day 8 with an accuracy of 0.80, sensitivity of 0.84 and specificity of 0.80. Our study showed that methods of artificial intelligence have significant potential in developing individualized decision support for chronic disease telemonitoring systems.
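The 7-day window described in this record can be illustrated with a minimal sketch. The abstract does not give the feature encoding or the exact evaluation code, so the function names `make_windows` and `sensitivity_specificity`, the flattened-window feature layout, and the 0/1 label coding below are assumptions made purely for illustration.

```python
from typing import List, Tuple


def make_windows(daily: List[List[float]], labels: List[int],
                 window: int = 7) -> Tuple[List[List[float]], List[int]]:
    """Pair each run of `window` consecutive days with the next day's label."""
    X, y = [], []
    for start in range(len(daily) - window):
        # Flatten the reports for days start .. start+window-1 into one vector.
        features = [v for day in daily[start:start + window] for v in day]
        X.append(features)
        y.append(labels[start + window])  # outcome on the day after the window
    return X, y


def sensitivity_specificity(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```

The resulting `X` and `y` could be passed to any classifier; the two ratios are the standard definitions of the sensitivity and specificity figures quoted in the abstract.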
Breast Cancer Detection with Reduced Feature Set.

PubMed

Mert, Ahmet; Kılıç, Niyazi; Bilgili, Erdem; Akan, Aydin

2015-01-01

This paper explores the feature reduction properties of independent component analysis (ICA) in a breast cancer decision support system. The Wisconsin diagnostic breast cancer (WDBC) dataset is reduced to a one-dimensional feature vector by computing an independent component (IC). The original data with 30 features and the reduced single feature (IC) are used to evaluate the diagnostic accuracy of classifiers such as k-nearest neighbor (k-NN), artificial neural network (ANN), radial basis function neural network (RBFNN), and support vector machine (SVM). The comparison of the proposed classification using the IC with the original feature set is also tested with different validation (5/10-fold cross-validations) and partitioning (20%-40%) methods. The classifiers are evaluated on how effectively they categorize tumors as benign or malignant in terms of specificity, sensitivity, accuracy, F-score, Youden's index, discriminant power, and the receiver operating characteristic (ROC) curve with its criterion values, including area under curve (AUC) and 95% confidence interval (CI). This represents an improvement in the diagnostic decision support system, while reducing computational complexity.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Feldman, Steven; Valera-Leon, Carlos; Dechev, Damian

The vector is a fundamental data structure, which provides constant-time access to a dynamically-resizable range of elements. Currently, there exist no wait-free vectors. The only non-blocking version supports only a subset of the sequential vector API and exhibits significant synchronization overhead caused by supporting opposing operations. Since many applications operate in phases of execution, wherein each phase uses only a subset of operations, this overhead is unnecessary for the majority of the application. To address the limitations of the non-blocking version, we present a new design that is wait-free, supports more of the operations provided by the sequential vector, and provides alternative implementations of key operations. These alternatives allow the developer to balance the performance and functionality of the vector as requirements change throughout execution. Compared to the known non-blocking version and the concurrent vector found in Intel's TBB library, our design outperforms or provides comparable performance in the majority of tested scenarios. Over all tested scenarios, the presented design performs an average of 4.97 times more operations per second than the non-blocking vector and 1.54 times more than the TBB vector. In a scenario designed to simulate the filling of a vector, performance improvement increases to 13.38 and 1.16 times. This work presents the first ABA-free non-blocking vector. Finally, unlike the other non-blocking approach, all operations are wait-free and bounds-checked and elements are stored contiguously in memory.
Gas chimney detection based on improving the performance of combined multilayer perceptron and support vector classifier

NASA Astrophysics Data System (ADS)

Hashemi, H.; Tax, D. M. J.; Duin, R. P. W.; Javaherian, A.; de Groot, P.

2008-11-01

Seismic object detection is a relatively new field in which 3-D bodies are visualized and spatial relationships between objects of different origins are studied in order to extract geologic information. In this paper, we propose a method for finding an optimal classifier with the help of a statistical feature ranking technique and combining different classifiers. The method, which has general applicability, is demonstrated here on a gas chimney detection problem. First, we evaluate a set of input seismic attributes extracted at locations labeled by a human expert using regularized discriminant analysis (RDA). In order to find the RDA score for each seismic attribute, forward and backward search strategies are used. Subsequently, two non-linear classifiers: multilayer perceptron (MLP) and support vector classifier (SVC) are run on the ranked seismic attributes. Finally, to capitalize on the intrinsic differences between both classifiers, the MLP and SVC results are combined using logical rules of maximum, minimum and mean. The proposed method optimizes the ranked feature space size and yields the lowest classification error in the final combined result. We will show that the logical minimum reveals gas chimneys that exhibit both the softness of MLP and the resolution of SVC classifiers.
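The combination step described in this record (fusing MLP and SVC outputs with maximum, minimum and mean rules) can be sketched as follows. This is an illustrative sketch only: it assumes both classifiers emit a per-sample score in [0, 1] for the "gas chimney" class, and the names `combine` and `RULES` and the 0.5 decision threshold are hypothetical, not taken from the paper.

```python
from statistics import mean
from typing import Callable, Dict, List

# Fixed combination rules applied sample by sample to the two classifier outputs.
RULES: Dict[str, Callable[[List[float]], float]] = {
    "maximum": max,
    "minimum": min,
    "mean": mean,
}


def combine(p_mlp: List[float], p_svc: List[float], rule: str) -> List[float]:
    """Fuse two lists of class scores with the named rule."""
    fuse = RULES[rule]
    return [fuse([a, b]) for a, b in zip(p_mlp, p_svc)]


def decide(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Label a sample 1 (chimney) when its fused score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]
```

Under the minimum rule a sample is flagged only when both classifiers score it highly, which is one way to read the abstract's remark that the minimum combines the characteristics of both classifiers.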
Automatic detection of sleep apnea based on EEG detrended fluctuation analysis and support vector machine.

PubMed

Zhou, Jing; Wu, Xiao-ming; Zeng, Wei-jie

2015-12-01

Sleep apnea syndrome (SAS) is prevalent, and many recent studies have focused on using simple and efficient methods for SAS detection instead of polysomnography. However, not much work has been done on using the nonlinear behavior of electroencephalogram (EEG) signals. The purpose of this study is to find a novel and simpler method for detecting apnea patients and to quantify the nonlinear characteristics of sleep apnea. Scaling exponents that quantify power-law correlations in 30 min of EEG were computed using detrended fluctuation analysis (DFA) and compared between six SAS and six healthy subjects during sleep. The mean scaling exponents were calculated every 30 s, and 360 control values and 360 apnea values were obtained. These values were compared between the two groups, and a support vector machine (SVM) was used to classify apnea patients. A significant difference was found between the EEG scaling exponents of the two groups (p < 0.001). The SVM obtained a high and consistent recognition rate: average classification accuracy reached 95.1%, with a sensitivity of 93.2% and a specificity of 98.6%. DFA of EEG is an efficient and practicable method and is clinically helpful in the diagnosis of sleep apnea.
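For readers unfamiliar with detrended fluctuation analysis, the following is a minimal sketch of the standard first-order DFA procedure: integrate the mean-subtracted signal, remove a linear trend in each window, and fit the slope of log F(n) against log n. The window sizes, the synthetic test signal and the function names are assumptions for illustration; the abstract does not specify the authors' parameters.

```python
import math
from typing import List, Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_exponent(signal: Sequence[float], scales: Sequence[int]) -> float:
    """Scaling exponent: slope of log F(n) versus log n over the given window sizes."""
    mean = sum(signal) / len(signal)
    profile: List[float] = []
    total = 0.0
    for v in signal:  # cumulative sum of the mean-subtracted signal
        total += v - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        xs = list(range(n))
        sq, count = 0.0, 0
        # Non-overlapping windows; a trailing partial window is ignored.
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            slope, intercept = linear_fit(xs, seg)
            sq += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, seg))
            count += n
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(sq / count))  # log of the RMS fluctuation F(n)
    alpha, _ = linear_fit(log_n, log_f)
    return alpha
```

Uncorrelated noise is expected to give an exponent near 0.5 and an integrated random walk a value near 1.5, which offers a quick sanity check before applying such a routine to EEG segments.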
Improving near-infrared prediction model robustness with support vector machine regression: a pharmaceutical tablet assay example.

PubMed

Igne, Benoît; Drennen, James K; Anderson, Carl A

2014-01-01

Changes in raw materials and process wear and tear can have significant effects on the prediction error of near-infrared calibration models. When the variability that is present during routine manufacturing is not included in the calibration, test, and validation sets, the long-term performance and robustness of the model will be limited. Nonlinearity is a major source of interference. In near-infrared spectroscopy, nonlinearity can arise from light path-length differences that can come from differences in particle size or density. The usefulness of support vector machine (SVM) regression to handle nonlinearity and improve the robustness of calibration models in scenarios where the calibration set did not include all the variability present in test was evaluated. Compared to partial least squares (PLS) regression, SVM regression was less affected by physical (particle size) and chemical (moisture) differences. The linearity of the SVM predicted values was also improved. Nevertheless, although visualization and interpretation tools have been developed to enhance the usability of SVM-based methods, work is yet to be done to provide chemometricians in the pharmaceutical industry with a regression method that can supplement PLS-based methods.
Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction

PubMed Central

Cruz-Cano, Raul; Chew, David S.H.; Kwok-Pui, Choi; Ming-Ying, Leung

2010-01-01

Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications. PMID:20729987
Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction.

PubMed

Cruz-Cano, Raul; Chew, David S H; Kwok-Pui, Choi; Ming-Ying, Leung

2010-06-01

Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications.
Using support vector machines with tract-based spatial statistics for automated classification of Tourette syndrome children

NASA Astrophysics Data System (ADS)

Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang

2016-03-01

Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics, which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood, and TS has a long-term prognosis that is difficult to estimate accurately. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms to automate the classification of healthy children and TS children. Here we apply the Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age- and gender-matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with a support vector machine (SVM). We use nested cross-validation to yield an unbiased assessment of the classification method and prevent overestimation. The peak performance of the SVM classifier, with an accuracy of 88.04%, sensitivity of 88.64% and specificity of 87.50%, was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results suggest that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
Obstacle Recognition Based on Machine Learning for On-Chip LiDAR Sensors in a Cyber-Physical System

PubMed Central

Beruvides, Gerardo

2017-01-01

Collision avoidance is an important feature in advanced driver-assistance systems, aimed at providing correct, timely and reliable warnings before an imminent collision (with objects, vehicles, pedestrians, etc.). The obstacle recognition library is designed and implemented to address the design and evaluation of obstacle detection in a transportation cyber-physical system. The library is integrated into a co-simulation framework that is supported on the interaction between SCANeR software and Matlab/Simulink. To the best of the authors’ knowledge, two main contributions are reported in this paper. Firstly, the modelling and simulation of virtual on-chip light detection and ranging sensors in a cyber-physical system, for traffic scenarios, is presented. The cyber-physical system is designed and implemented in SCANeR. Secondly, three specific artificial intelligence-based methods for obstacle recognition libraries are also designed and applied using a sensory information database provided by SCANeR. The computational library has three methods for obstacle detection: a multi-layer perceptron neural network, a self-organizing map and a support vector machine. Finally, a comparison among these methods under different weather conditions is presented, with very promising results in terms of accuracy. The best results are achieved using the multi-layer perceptron in sunny and foggy conditions, the support vector machine in rainy conditions and the self-organizing map in snowy conditions. PMID:28906450
Obstacle Recognition Based on Machine Learning for On-Chip LiDAR Sensors in a Cyber-Physical System.

PubMed

Castaño, Fernando; Beruvides, Gerardo; Haber, Rodolfo E; Artuñedo, Antonio

2017-09-14

Collision avoidance is an important feature in advanced driver-assistance systems, aimed at providing correct, timely and reliable warnings before an imminent collision (with objects, vehicles, pedestrians, etc.). The obstacle recognition library is designed and implemented to address the design and evaluation of obstacle detection in a transportation cyber-physical system. The library is integrated into a co-simulation framework that is supported on the interaction between SCANeR software and Matlab/Simulink. To the best of the authors' knowledge, two main contributions are reported in this paper. Firstly, the modelling and simulation of virtual on-chip light detection and ranging sensors in a cyber-physical system, for traffic scenarios, is presented. The cyber-physical system is designed and implemented in SCANeR. Secondly, three specific artificial intelligence-based methods for obstacle recognition libraries are also designed and applied using a sensory information database provided by SCANeR. The computational library has three methods for obstacle detection: a multi-layer perceptron neural network, a self-organizing map and a support vector machine. Finally, a comparison among these methods under different weather conditions is presented, with very promising results in terms of accuracy. The best results are achieved using the multi-layer perceptron in sunny and foggy conditions, the support vector machine in rainy conditions and the self-organizing map in snowy conditions.

Comparison between refraction measured by Spot Vision Screening™ and subjective clinical refractometry

PubMed Central

de Jesus, Daniela Lima; Villela, Flávio Fernandes; Orlandin, Luis Fernando; Eiji, Fernando Naves; Dantas, Daniel Oliveira; Alves, Milton Ruiz

2016-01-01

OBJECTIVE: The purpose of this study was to evaluate the accuracy of Spot Vision Screening™ as an autorefractor by comparing refraction measurements to subjective clinical refractometry results in children and adult patients. METHODS: One hundred and thirty-four eyes of 134 patients underwent refractometry by Spot and clinical refractometry under cycloplegia. Patients, students, physicians, staff and children of staff from the Hospital das Clínicas (School of Medicine, University of São Paulo) aged 7-50 years without signs of ocular disease were examined. Only right-eye refraction data were analyzed. The findings were converted into magnitude vectors for analysis. RESULTS: The difference between Spot Vision Screening™ and subjective clinical refractometry expressed in spherical equivalents was +0.66±0.56 diopters (D), +0.16±0.27 D for the vector projected on the 90° axis and +0.02±0.15 D for the oblique vector. CONCLUSIONS: Despite the statistical significance of the difference between the two methods, we consider the difference non-relevant in a clinical setting, supporting the use of Spot Vision Screening™ as an ancillary method for estimating refraction. PMID:26934234
Emotion recognition based on multiple order features using fractional Fourier transform

NASA Astrophysics Data System (ADS)

Ren, Bo; Liu, Deyin; Qi, Lin

2017-07-01

To address the shortcomings of recent algorithms based on the two-dimensional fractional Fourier transform (2D-FrFT), this paper proposes a multiple-order feature-based method for emotion recognition. Most existing methods utilize features from a single order or a couple of orders of the 2D-FrFT. However, different orders of the 2D-FrFT contribute differently to feature extraction for emotion recognition, and combining these features can enhance the performance of an emotion recognition system. The proposed approach obtains numerous features extracted at different orders of the 2D-FrFT in the x-axis and y-axis directions, and uses their statistical magnitudes as the final feature vectors for recognition. A support vector machine (SVM) is utilized for classification, and the RML Emotion database and the Cohn-Kanade (CK) database are used for the experiments. The experimental results demonstrate the effectiveness of the proposed method.
Detrended cross-correlation coefficient: Application to predict apoptosis protein subcellular localization.

PubMed

Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

2016-12-01

Apoptosis, or programmed cell death, plays a central role in the development and homeostasis of an organism. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. The prediction of subcellular localization of an apoptosis protein is still a challenging task, and existing methods are mainly based on protein primary sequences. In this paper, we introduce a new position-specific scoring matrix (PSSM)-based method using the detrended cross-correlation (DCCA) coefficient of non-overlapping windows. A 190-dimensional (190D) feature vector is then constructed on two widely used datasets, CL317 and ZD98, and a support vector machine is adopted as the classifier. To evaluate the proposed method, objective and rigorous jackknife cross-validation tests are performed on the two datasets. The results show that our approach offers a novel and reliable PSSM-based tool for prediction of apoptosis protein subcellular localization.
A Wavelet Support Vector Machine Combination Model for Singapore Tourist Arrival to Malaysia

NASA Astrophysics Data System (ADS)

Rafidah, A.; Shabri, Ani; Nurulhuda, A.; Suhaila, Y.

2017-08-01

In this study, a wavelet support vector machine (WSVM) model is proposed and applied to the prediction of a monthly time series of Singapore tourist arrivals. The WSVM model is a combination of wavelet analysis and the support vector machine (SVM). The study has two parts: in the first, we compare kernel functions, and in the second, we compare the developed model with a single SVM model. The results showed that the linear kernel function performs better than the RBF kernel, and that WSVM outperforms the single SVM model in forecasting monthly Singapore tourist arrivals to Malaysia.
Prediction of pH of cola beverage using Vis/NIR spectroscopy and least squares-support vector machine

NASA Astrophysics Data System (ADS)

Liu, Fei; He, Yong

2008-02-01

Visible and near infrared (Vis/NIR) transmission spectroscopy and chemometric methods were utilized to predict the pH values of cola beverages. Five varieties of cola were prepared and 225 samples (45 samples for each variety) were selected for the calibration set, while 75 samples (15 samples for each variety) were selected for the validation set. Savitzky-Golay smoothing and standard normal variate (SNV) followed by first derivative were used as the pre-processing methods. Partial least squares (PLS) analysis was employed to extract the principal components (PCs), which were used as the inputs of the least squares-support vector machine (LS-SVM) model according to their accumulative reliabilities. Then LS-SVM with a radial basis function (RBF) kernel and a two-step grid search technique was applied to build the regression model, which was compared with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias were 0.961, 0.040 and 0.012 for PLS, and 0.975, 0.031 and 4.697 × 10⁻³ for LS-SVM, respectively. Both methods obtained a satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be applied as an alternative way for the prediction of pH of cola beverages.
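Two of the quantities named in this record, the standard normal variate (SNV) transform and the RMSEP/bias statistics, have simple closed forms that a short sketch can make concrete. The code below is illustrative only: the function names are hypothetical, SNV is computed with the sample standard deviation (n - 1), and bias is taken as the mean of predicted minus reference values, which is a common convention but an assumption here since the abstract does not state it.

```python
import math
from typing import List, Sequence, Tuple


def snv(spectrum: Sequence[float]) -> List[float]:
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


def rmsep_and_bias(y_ref: Sequence[float], y_pred: Sequence[float]) -> Tuple[float, float]:
    """RMSEP = sqrt(mean squared error); bias = mean signed error (predicted - reference)."""
    errors = [p - r for r, p in zip(y_ref, y_pred)]
    rmsep = math.sqrt(sum(e * e for e in errors) / len(errors))
    bias = sum(errors) / len(errors)
    return rmsep, bias
```

SNV is applied to each spectrum independently, so it removes per-sample offset and scale differences before any derivative or regression step.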
A Fault Alarm and Diagnosis Method Based on Sensitive Parameters and Support Vector Machine

NASA Astrophysics Data System (ADS)

Zhang, Jinjie; Yao, Ziyun; Lv, Zhiquan; Zhu, Qunxiong; Xu, Fengtian; Jiang, Zhinong

2015-08-01

The extraction of fault features and the diagnostic techniques for reciprocating compressors are currently among the most active research topics in the field of reciprocating machinery fault diagnosis. A large number of feature extraction and classification methods have been applied in related research, but practical fault alarms and the accuracy of diagnosis have not been effectively improved. Developing feature extraction and classification methods that meet the requirements of typical fault alarm and automatic diagnosis in practical engineering is an urgent task. The typical mechanical faults of reciprocating compressors are presented in the paper, and the existing data of an online monitoring system are used to extract fault feature parameters of 15 types in total. The sensitive relationship between faults and feature parameters is clarified using the distance evaluation technique, and the sensitive characteristic parameters of different faults are obtained. On this basis, a method based on fault feature parameters and the support vector machine (SVM) is developed for application to practical fault diagnosis. A better capability for early fault warning is demonstrated by experiment and practical fault cases. Automatic classification of the fault alarm data using the SVM achieved better diagnostic accuracy.
A Non-Parametric Approach for the Activation Detection of Block Design fMRI Simulated Data Using Self-Organizing Maps and Support Vector Machine.

PubMed

Bahrami, Sheyda; Shamsi, Mousa

2017-01-01

Functional magnetic resonance imaging (fMRI) is a popular method to probe the functional organization of the brain using hemodynamic responses. In this method, volume images of the entire brain are obtained with a very good spatial resolution and low temporal resolution. However, these images suffer from high dimensionality when passed to classification algorithms. In this work, we combine a support vector machine (SVM) with a self-organizing map (SOM) to obtain a feature-based classification. The SOM is used for feature extraction and for labeling the datasets, and a linear kernel SVM is then used for detecting the active areas. SOM has two major advantages: (i) it reduces the dimension of the data sets, lowering computational complexity, and (ii) it is useful for identifying brain regions with small onset differences in hemodynamic responses. Our non-parametric model is compared with parametric and non-parametric methods. We use simulated fMRI data sets and block design inputs in this paper and consider a contrast-to-noise ratio (CNR) value equal to 0.6 for the simulated datasets. The simulated fMRI dataset has a contrast of 1-4% in active areas. The accuracy of our proposed method is 93.63% and the error rate is 6.37%.
DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest.

PubMed

Manavalan, Balachandran; Shin, Tae Hwan; Lee, Gwang

2018-01-05

DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Therefore, identifying DHSs in uncharacterized DNA sequences is crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven to be expensive for genome-wide application. Therefore, it is necessary to develop computational methods for DHS prediction. In this study, we proposed a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties, using a random forest algorithm. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community, and is freely available at: http://www.thegleelab.org/DHSpred.html.
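The Matthews correlation coefficient (MCC) and accuracy reported in this record are both functions of the binary confusion matrix. As a minimal sketch of the standard definitions (the function names and the example counts in the usage note are made up for illustration and are not DHSpred results):

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For example, with hypothetical counts TP = 90, TN = 80, FP = 20, FN = 10, the accuracy is 0.85 while the MCC is about 0.70, which shows why MCC is usually reported alongside accuracy for imbalanced or asymmetric error profiles.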
DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest

PubMed Central

Manavalan, Balachandran; Shin, Tae Hwan; Lee, Gwang

2018-01-01

DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Therefore, identifying DHSs in uncharacterized DNA sequences is crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven to be expensive for genome-wide application. Therefore, it is necessary to develop computational methods for DHS prediction. In this study, we proposed a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties, using a random forest algorithm. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community, and is freely available at: http://www.thegleelab.org/DHSpred.html PMID:29416743
High-order distance-based multiview stochastic learning in image classification.

PubMed

Yu, Jun; Rui, Yong; Tang, Yuan Yan; Tao, Dacheng

2014-12-01

How do we find all images in a larger set of images which have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) can be obtained, and how to efficiently utilize them is a challenge. The traditional concatenation scheme, which links the features of different views into one long vector, is inappropriate because each view has its specific statistical property and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternating optimization is designed to solve the objective functions of HD-MSL and obtain the view coefficients and classification scores simultaneously. Experiments on two real world datasets demonstrate the effectiveness of HD-MSL in image classification.
TriatoKey: a web and mobile tool for biodiversity identification of Brazilian triatomine species

PubMed Central

Márcia de Oliveira, Luciana; Nogueira de Brito, Raissa; Anderson Souza Guimarães, Paul; Vitor Mastrângelo Amaro dos Santos, Rômulo; Gonçalves Diotaiuti, Liléia; de Cássia Moreira de Souza, Rita

2017-01-01

Triatomines are blood-sucking insects that transmit the causative agent of Chagas disease, Trypanosoma cruzi. Although it is recognized as a difficult task, the correct taxonomic identification of triatomine species is crucial for vector control in Latin America, where the disease is endemic. In this context, we have developed a web and mobile tool based on a PostgreSQL database to help healthcare technicians identify triatomine vectors when technical expertise is missing. The web and mobile versions use pictures of real triatomine species and a dichotomous key method to support the identification of potential vectors that occur in Brazil. The tool provides an example-driven user interface with simple language. TriatoKey can also be useful for educational purposes. Database URL: http://triatokey.cpqrr.fiocruz.br PMID:28605769
Predicting primary progressive aphasias with support vector machine approaches in structural MRI data.

PubMed

Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L

2017-01-01

Primary progressive aphasia (PPA) encompasses the three subtypes nonfluent/agrammatic variant PPA, semantic variant PPA, and the logopenic variant PPA, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole brain support vector machine classification enabled a very high accuracy between 91 and 97% for identifying specific PPA subtypes vs. healthy controls, and 78/95% for the discrimination between semantic variant vs. nonfluent/agrammatic or logopenic PPA variants. Only for the discrimination between nonfluent/agrammatic and logopenic PPA variants accuracy was low with 55%. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole brain approach took also into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease-specificity of the selected networks. 
In conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with very high accuracy, paving the way for its application in clinical settings.
A Fast Reduced Kernel Extreme Learning Machine.

PubMed

Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua

2016-04-01

In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on Big datasets. RKELM is established based on the rigorous proof of universal learning involving reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear functions accurately under the condition of support vectors sufficiency. Experimental results on a wide variety of real world small instance size and large instance size applications in the context of binary classification, multi-class problem and regression are then reported to show that RKELM can perform at competitive level of generalized performance as the SVM/LS-SVM at only a fraction of the computational effort incurred. Copyright © 2015 Elsevier Ltd. All rights reserved.
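The core idea described above, picking a random subset of training samples as kernel centres and then solving for the output weights in one non-iterative step, can be sketched as follows. This is an illustrative assumption rather than the authors' formulation: the RBF kernel, the ridge term `reg`, the toy data and all function names are hypothetical choices.

```python
import math
import random

def rbf(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_reduced_kernel(X, t, n_centers, gamma=1.0, reg=1e-3, seed=0):
    """Randomly pick centres, then solve (H^T H + reg*I) beta = H^T t once."""
    rng = random.Random(seed)
    centers = rng.sample(X, n_centers)          # random "support vectors"
    H = [[rbf(x, c, gamma) for c in centers] for x in X]
    A = [[sum(H[k][i] * H[k][j] for k in range(len(X))) + (reg if i == j else 0.0)
          for j in range(n_centers)] for i in range(n_centers)]
    b = [sum(H[k][i] * t[k] for k in range(len(X))) for i in range(n_centers)]
    return centers, solve(A, b)

def predict(x, centers, beta, gamma=1.0):
    """Weighted sum of kernel evaluations against the chosen centres."""
    return sum(w * rbf(x, c, gamma) for w, c in zip(beta, centers))

# Toy two-class problem (hypothetical data): labels are -1 and +1.
X = [[0.0], [0.2], [0.4], [2.0], [2.2], [2.4]]
t = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
centers, beta = fit_reduced_kernel(X, t, n_centers=4)
print([round(predict(x, centers, beta), 2) for x in X])
```

Because the centres are drawn at random and the weights come from a single linear solve, training cost is governed by the number of centres rather than by an iterative search for support vectors.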
Support vector machines

NASA Technical Reports Server (NTRS)

Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri

2004-01-01

Support Vector Machines (SVMs) are a type of supervised learning algorithm, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labeled by a 'supervisor', typically a human 'expert'.
Dropout Prediction in E-Learning Courses through the Combination of Machine Learning Techniques

ERIC Educational Resources Information Center

Lykourentzou, Ioanna; Giannoukos, Ioannis; Nikolopoulos, Vassilis; Mpardis, George; Loumos, Vassili

2009-01-01

In this paper, a dropout prediction method for e-learning courses, based on three popular machine learning techniques and detailed student data, is proposed. The machine learning techniques used are feed-forward neural networks, support vector machines and probabilistic ensemble simplified fuzzy ARTMAP. Since a single technique may fail to…
Automatic Cataloguing and Searching for Retrospective Data by Use of OCR Text.

ERIC Educational Resources Information Center

Tseng, Yuen-Hsien

2001-01-01

Describes efforts in supporting information retrieval from OCR (optical character recognition) degraded text. Reports on approaches used in an automatic cataloging and searching contest for books in multiple languages, including a vector space retrieval model, an n-gram indexing method, and a weighting scheme; and discusses problems of Asian…
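As a rough illustration of why n-gram indexing in a vector space model tolerates OCR errors, the sketch below represents strings as character-bigram count vectors and compares them by cosine similarity. It is a generic example under stated assumptions (bigrams, raw counts, no weighting scheme) and does not reproduce the system described in the record.

```python
import math
from collections import Counter

def char_ngrams(text, n=2):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

query = char_ngrams("library")
# "1ibrary" simulates a single OCR misread of the first character.
print(cosine(query, char_ngrams("1ibrary")))
print(cosine(query, char_ngrams("museum")))
```

A single misread character damages only the n-grams that overlap it, so the degraded word still shares most of its index terms with the clean query.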
Estimating top-of-atmosphere thermal infrared radiance using MERRA-2 atmospheric data

NASA Astrophysics Data System (ADS)

Kleynhans, Tania; Montanaro, Matthew; Gerace, Aaron; Kanan, Christopher

2017-05-01

Thermal infrared satellite images have been widely used in environmental studies. However, satellites have limited temporal resolution, e.g., 16 day Landsat or 1 to 2 day Terra MODIS. This paper investigates the use of the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) reanalysis data product, produced by NASA's Global Modeling and Assimilation Office (GMAO), to predict global top-of-atmosphere (TOA) thermal infrared radiance. The high temporal resolution of the MERRA-2 data product presents opportunities for novel research and applications. Various methods were applied to estimate TOA radiance from MERRA-2 variables, namely (1) a parameterized physics-based method, (2) linear regression models and (3) non-linear Support Vector Regression. Model prediction accuracy was evaluated using temporally and spatially coincident Moderate Resolution Imaging Spectroradiometer (MODIS) thermal infrared data as reference data. This research found that Support Vector Regression with a radial basis function kernel produced the lowest error rates. Sources of errors are discussed and defined. Further research is currently being conducted to train deep learning models to predict TOA thermal radiance.
Application of structured support vector machine backpropagation to a convolutional neural network for human pose estimation.

PubMed

Witoonchart, Peerajak; Chongstitvatana, Prabhas

2017-08-01

In this study, for the first time, we show how to formulate a structured support vector machine (SSVM) as two layers in a convolutional neural network, where the top layer is a loss augmented inference layer and the bottom layer is the normal convolutional layer. We show that a deformable part model can be learned with the proposed structured SVM neural network by backpropagating the error of the deformable part model to the convolutional neural network. The forward propagation calculates the loss augmented inference and the backpropagation calculates the gradient from the loss augmented inference layer to the convolutional layer. Thus, we obtain a new type of convolutional neural network called a Structured SVM convolutional neural network, which we applied to the human pose estimation problem. This new neural network can be used as the final layers in deep learning. Our method jointly learns the structural model parameters and the appearance model parameters. We implemented our method as a new layer in the existing Caffe library. Copyright © 2017 Elsevier Ltd. All rights reserved.
Using Support Vector Machine Ensembles for Target Audience Classification on Twitter

PubMed Central

Lo, Siaw Ling; Chiong, Raymond; Cornforth, David

2015-01-01

The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space. PMID:25874768
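Bootstrapping in over-sampling, as mentioned above, generally means drawing training items with replacement until the classes are balanced. The sketch below shows one generic way to do that; the exact procedure used in the paper is not given in the abstract, so the function name, the balancing target and the sample data are all assumptions.

```python
import random

def bootstrap_oversample(samples, labels, seed=0):
    """Resample every class with replacement up to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        out_x.extend(rng.choice(xs) for _ in range(target))  # draw with replacement
        out_y.extend([y] * target)
    return out_x, out_y

# Hypothetical imbalanced data: four items of class 1, one item of class 0.
texts = ["a", "b", "c", "d", "e"]
labels = [1, 1, 1, 1, 0]
new_x, new_y = bootstrap_oversample(texts, labels)
print(new_y.count(0), new_y.count(1))
```

Repeating this with different seeds yields different resampled training sets, which is one common way to give each member of an ensemble its own view of the data.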
Using Support Vector Machine to identify imaging biomarkers of neurological and psychiatric disease: a critical review.

PubMed

Orrù, Graziella; Pettersson-Yeo, William; Marquand, Andre F; Sartori, Giuseppe; Mechelli, Andrea

2012-04-01

Standard univariate analysis of neuroimaging data has revealed a host of neuroanatomical and functional differences between healthy individuals and patients suffering a wide range of neurological and psychiatric disorders. Significant only at group level however these findings have had limited clinical translation, and recent attention has turned toward alternative forms of analysis, including Support-Vector-Machine (SVM). A type of machine learning, SVM allows categorisation of an individual's previously unseen data into a predefined group using a classification algorithm, developed on a training data set. In recent years, SVM has been successfully applied in the context of disease diagnosis, transition prediction and treatment prognosis, using both structural and functional neuroimaging data. Here we provide a brief overview of the method and review those studies that applied it to the investigation of Alzheimer's disease, schizophrenia, major depression, bipolar disorder, presymptomatic Huntington's disease, Parkinson's disease and autistic spectrum disorder. We conclude by discussing the main theoretical and practical challenges associated with the implementation of this method into the clinic and possible future directions. Copyright © 2012 Elsevier Ltd. All rights reserved.

Identification and Mapping of Tree Species in Urban Areas Using WORLDVIEW-2 Imagery

NASA Astrophysics Data System (ADS)

Mustafa, Y. T.; Habeeb, H. N.; Stein, A.; Sulaiman, F. Y.

2015-10-01

Monitoring and mapping of urban trees are essential to provide urban forestry authorities with timely and consistent information. Modern techniques increasingly facilitate these tasks, but require the development of semi-automatic tree detection and classification methods. In this article, we propose an approach to delineate and map the crown of 15 tree species in the city of Duhok, Kurdistan Region of Iraq, using WorldView-2 (WV-2) imagery. A tree crown object is identified first and is subsequently delineated as an image object (IO) using vegetation indices and texture measurements. Next, three classification methods (Maximum Likelihood, Neural Network, and Support Vector Machine) were used to classify IOs using selected IO features. The best results were obtained with Support Vector Machine classification, which gave the best map of urban tree species in Duhok. The overall accuracy was between 60.93% and 88.92% and the κ-coefficient was between 0.57 and 0.75. We conclude that fifteen tree species were identified and mapped at a satisfactory accuracy in urban areas of this study.
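Overall accuracy and the κ-coefficient reported above are both computed from a classification confusion matrix. The following sketch shows the standard definitions; the two-class matrix is hypothetical and unrelated to the Duhok results.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what the row/column totals predict by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    po = sum(cm[i][i] for i in range(n)) / total
    row_tot = [sum(cm[i]) for i in range(n)]
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    pe = sum(row_tot[k] * col_tot[k] for k in range(n)) / (total * total)
    return (po - pe) / (1 - pe)  # undefined when chance agreement pe equals 1

# Hypothetical matrix: rows are reference classes, columns are predictions.
cm = [[45, 5],
      [10, 40]]
print(overall_accuracy(cm), round(cohen_kappa(cm), 2))
```

Kappa is lower than overall accuracy here because half of the observed agreement would be expected from the class totals alone.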
Study of support vector machine and serum surface-enhanced Raman spectroscopy for noninvasive esophageal cancer detection

NASA Astrophysics Data System (ADS)

Li, Shao-Xin; Zeng, Qiu-Yao; Li, Lin-Fang; Zhang, Yan-Jiao; Wan, Ming-Ming; Liu, Zhi-Ming; Xiong, Hong-Lian; Guo, Zhou-Yi; Liu, Song-Hao

2013-02-01

The ability of serum surface-enhanced Raman spectroscopy (SERS) combined with support vector machine (SVM) to improve the classification of esophageal cancer patients versus normal volunteers is investigated. Two groups of serum SERS spectra based on silver nanoparticles (AgNPs) are obtained: one group from patients with pathologically confirmed esophageal cancer (n=30) and the other group from healthy volunteers (n=31). Principal components analysis (PCA), conventional SVM (C-SVM) and conventional SVM in combination with PCA (PCA-SVM) methods are implemented to classify the same spectral dataset. Results show that a diagnostic accuracy of 77.0% is achieved with the PCA technique, while diagnostic accuracies of 83.6% and 85.2% are obtained with the C-SVM and PCA-SVM methods, respectively, based on radial basis function (RBF) models. The results show that RBF SVM models are superior to the PCA algorithm in classifying serum SERS spectra. The study demonstrates that serum SERS in combination with the SVM technique has great potential to provide an effective and accurate diagnostic schema for noninvasive detection of esophageal cancer.
Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins.

PubMed

Zhang, Guangya; Ge, Huihua

2013-10-01

Understanding proteins adapted to hypersaline environments and identifying them is a challenging task and would help in designing stable proteins. Here, we have systematically analyzed the normalized amino acid compositions of 2121 halophilic and 2400 non-halophilic proteins. The results showed that halophilic proteins contained more Asp at the expense of Lys, Ile, Cys and Met, fewer small and hydrophobic residues, and showed a large excess of acidic over basic amino acids. Then, we introduce a support vector machine method to discriminate the halophilic and non-halophilic proteins, using a novel Pearson VII universal function based kernel. In the three validation check methods, it achieved an overall accuracy of 97.7%, 91.7% and 86.9% and outperformed other machine learning algorithms. We also address the influence of protein size on prediction accuracy and found that the worse performance for small proteins might be because some significant residues (Cys and Lys) were missing in those proteins. Copyright © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
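For readers unfamiliar with the kernel named above, the sketch below implements the commonly cited form of the Pearson VII universal kernel (PUK), K(x, y) = 1 / [1 + (2·‖x − y‖·√(2^(1/ω) − 1) / σ)²]^ω. The parameter defaults are illustrative assumptions; the abstract does not state the values used in the study.

```python
import math

def puk_kernel(x, y, sigma=1.0, omega=1.0):
    """Pearson VII universal kernel; sigma sets the width, omega the tail shape."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    scaled = 2.0 * dist * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega

# Identical points give 1; the value falls to 0.5 at distance sigma / 2.
print(puk_kernel([0.0, 0.0], [0.0, 0.0]))
print(puk_kernel([0.0, 0.0], [0.5, 0.0], sigma=1.0, omega=3.0))
```

With ω = 1 the curve is Lorentzian, and as ω grows it approaches a Gaussian shape, which is why the function is described as a universal kernel.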
Mathematical models application for mapping soils spatial distribution on the example of the farm from the North of Udmurt Republic of Russia

NASA Astrophysics Data System (ADS)

Dokuchaev, P. M.; Meshalkina, J. L.; Yaroslavtsev, A. M.

2018-01-01

A comparative analysis of geospatial soil modeling using multinomial logistic regression, decision trees, random forest, regression trees and support vector machine algorithms was conducted. Visual interpretation of the resulting digital maps and their comparison with the existing map, together with a quantitative assessment of the overall detection accuracy for individual soil groups and of the models' kappa, showed that multiple logistic regression, support vector machine and random forest models, applied to spatial prediction of the distribution of conditional soil groups, can be reliably used for mapping the study area. Detection was most accurate for lightly and moderately eroded sod-podzolic soils (Phaeozems Albic). In second place, according to the mean overall accuracy of the prediction, were non-eroded and warp sod-podzolic soils, as well as sod-gley soils (Umbrisols Gleyic) and alluvial soils (Fluvisols Dystric, Umbric). Heavily eroded sod-podzolic soils and gray forest soils (Phaeozems Albic) were detected worst of all by the automatic classification methods.
Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

PubMed

Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin

2007-12-01

Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmatic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide
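To give a sense of the search space for proteins with two to five disulfide bridges, the sketch below counts and enumerates the possible connectivity patterns. It assumes every cysteine takes part in exactly one bond, in which case n bonds admit (2n − 1)!! pairings; this is a combinatorial illustration, not part of the published method.

```python
def connectivity_patterns(n_bonds):
    """Number of ways to pair 2*n_bonds cysteines: (2n - 1)!!"""
    count = 1
    for k in range(1, 2 * n_bonds, 2):
        count *= k
    return count

def enumerate_patterns(cysteines):
    """List every perfect pairing of the given cysteine positions."""
    if not cysteines:
        return [[]]
    first, rest = cysteines[0], cysteines[1:]
    patterns = []
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in enumerate_patterns(remaining):
            patterns.append([(first, partner)] + sub)
    return patterns

for n in range(2, 6):
    print(n, connectivity_patterns(n))
print(enumerate_patterns([1, 2, 3, 4]))
```

The count grows from 3 patterns for two bridges to 945 for five, which is why accuracy is usually reported both per protein and per cysteine pair.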
Detecting double compressed MPEG videos with the same quantization matrix and synchronized group of pictures structure

NASA Astrophysics Data System (ADS)

Aghamaleki, Javad Abbasi; Behrad, Alireza

2018-01-01

Double compression detection is a crucial stage in digital image and video forensics. However, the detection of double compressed videos is challenging when the video forger uses the same quantization matrix and synchronized group of pictures (GOP) structure during the recompression history to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames on the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and to enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in the temporal domain. The experimental results show that the proposed algorithm achieves an accuracy of more than 95%. In addition, comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Comparison between refraction measured by Spot Vision Screening™ and subjective clinical refractometry.

PubMed

de Jesus, Daniela Lima; Villela, Flávio Fernandes; Orlandin, Luis Fernando; Eiji, Fernando Naves; Dantas, Daniel Oliveira; Alves, Milton Ruiz

2016-02-01

The purpose of this study was to evaluate the accuracy of Spot Vision Screening™ as an autorefractor by comparing refraction measurements to subjective clinical refractometry results in children and adult patients. One-hundred and thirty-four eyes of 134 patients were submitted to refractometry by Spot and clinical refractometry under cycloplegia. Patients, students, physicians, staff and children of staff from the Hospital das Clínicas (School of Medicine, University of São Paulo) aged 7-50 years without signs of ocular disease were examined. Only right-eye refraction data were analyzed. The findings were converted in magnitude vectors for analysis. The difference between Spot Vision Screening™ and subjective clinical refractometry expressed in spherical equivalents was +0.66±0.56 diopters (D), +0.16±0.27 D for the vector projected on the 90 axis and +0.02±0.15 D for the oblique vector. Despite the statistical significance of the difference between the two methods, we consider the difference non-relevant in a clinical setting, supporting the use of Spot Vision Screening™ as an ancillary method for estimating refraction.
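Refractions are often converted to vectors for this kind of comparison using the power-vector notation M = S + C/2, J0 = −(C/2)·cos 2α and J45 = −(C/2)·sin 2α, where S is sphere, C is cylinder and α is the axis. The sketch below assumes that notation; the abstract does not name the conversion it used, so mapping its "spherical equivalent", "90 axis" and "oblique" components onto M, J0 and J45 is an assumption, and the prescription values are hypothetical.

```python
import math

def power_vector(sphere, cylinder, axis_deg):
    """Convert a sphero-cylindrical refraction to (M, J0, J45) in diopters."""
    m = sphere + cylinder / 2.0                                     # spherical equivalent
    j0 = -(cylinder / 2.0) * math.cos(math.radians(2 * axis_deg))   # with/against-the-rule
    j45 = -(cylinder / 2.0) * math.sin(math.radians(2 * axis_deg))  # oblique component
    return m, j0, j45

# Hypothetical readings for the same eye from two methods.
device = power_vector(-2.00, -1.00, 180)
clinical = power_vector(-2.50, -1.00, 180)
print([round(a - b, 2) for a, b in zip(device, clinical)])
```

Working with the three components separately lets differences between methods be averaged arithmetically, which is not valid for raw sphere, cylinder and axis values.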
Topic detection using paragraph vectors to support active learning in systematic reviews.

PubMed

Hashimoto, Kazuma; Kontonatsios, Georgios; Miwa, Makoto; Ananiadou, Sophia

2016-08-01

Systematic reviews require expert reviewers to manually screen thousands of citations in order to identify all relevant articles to the review. Active learning text classification is a supervised machine learning approach that has been shown to significantly reduce the manual annotation workload by semi-automating the citation screening process of systematic reviews. In this paper, we present a new topic detection method that induces an informative representation of studies, to improve the performance of the underlying active learner. Our proposed topic detection method uses a neural network-based vector space model to capture semantic similarities between documents. We firstly represent documents within the vector space, and cluster the documents into a predefined number of clusters. The centroids of the clusters are treated as latent topics. We then represent each document as a mixture of latent topics. For evaluation purposes, we employ the active learning strategy using both our novel topic detection method and a baseline topic model (i.e., Latent Dirichlet Allocation). Results obtained demonstrate that our method is able to achieve a high sensitivity of eligible studies and a significantly reduced manual annotation cost when compared to the baseline method. This observation is consistent across two clinical and three public health reviews. The tool introduced in this work is available from https://nactem.ac.uk/pvtopic/. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins.

PubMed

Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

2014-01-01

Protein subcellular localization prediction, as an essential step to elucidate the in vivo functions of proteins and identify drug targets, has been extensively studied in previous decades. Instead of only determining subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms is retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/.
Classification of diesel pool refinery streams through near infrared spectroscopy and support vector machines using C-SVC and ν-SVC.

PubMed

Alves, Julio Cesar L; Henriques, Claudete B; Poppi, Ronei J

2014-01-03

Near infrared (NIR) spectroscopy combined with chemometric methods has been widely used in the petroleum and petrochemical industry and provides suitable methods for process control and quality control. The support vector machines (SVM) algorithm has proven to be a powerful chemometric tool for developing classification models owing to its capacity for nonlinear modeling and its high generalization capability; these characteristics can be especially important for treating NIR spectroscopy data of complex mixtures such as petroleum refinery streams. In this work, a study on the performance of the support vector machines algorithm for classification was carried out, using C-SVC and ν-SVC, applied to NIR spectroscopy data of different types of streams that make up the diesel pool in a petroleum refinery: light gas oil, heavy gas oil, hydrotreated diesel, kerosene, heavy naphtha and external diesel. In addition to these six streams, the diesel final blend produced in the refinery was added to complete the data set. C-SVC and ν-SVC classification models with 2, 4, 6 and 7 classes were developed for comparison among their results and with the results of soft independent modeling of class analogy (SIMCA) models. The SVC models showed superior performance, especially ν-SVC for classification models with 6 and 7 classes, leading to an improvement in sensitivity on validation sample sets of 24% and 15%, respectively, compared with SIMCA models, and providing better identification of chemical compositions of different diesel pool refinery streams. Copyright © 2013 Elsevier B.V. All rights reserved.
Prediction of the distillation temperatures of crude oils using ¹H NMR and support vector regression with estimated confidence intervals.

PubMed

Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J

2015-09-01

This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using ¹H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
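The RMSEP values compared above follow the usual root-mean-square definition over a prediction set. The sketch below shows that calculation, plus one simple way to turn the spread of an ensemble's predictions into an interval. The interval uses a normal approximation (mean ± 1.96 standard deviations), which is an assumption for illustration and not the boosting-based procedure of the paper; all numbers are hypothetical.

```python
import math
import statistics

def rmsep(observed, predicted):
    """Root mean square error of prediction over a test set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def ensemble_interval(member_predictions, z=1.96):
    """Mean of ensemble predictions with a normal-approximation interval."""
    mean = statistics.mean(member_predictions)
    spread = z * statistics.stdev(member_predictions)
    return mean - spread, mean, mean + spread

# Hypothetical temperatures in degrees Celsius.
print(round(rmsep([100.0, 200.0], [103.0, 196.0]), 3))
print(ensemble_interval([210.0, 212.0, 214.0]))
```

An interval built this way narrows when ensemble members agree, so unusually wide intervals can flag samples worth inspecting as possible outliers.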
Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods.

PubMed

Liang, Ja-Der; Ping, Xiao-Ou; Tseng, Yi-Ju; Huang, Guan-Tarn; Lai, Feipei; Yang, Pei-Ming

2014-12-01

Recurrence of hepatocellular carcinoma (HCC) is an important issue despite effective treatments with tumor eradication. Identification of patients who are at high risk for recurrence may provide more efficacious screening and detection of tumor recurrence. The aim of this study was to develop recurrence predictive models for HCC patients who received radiofrequency ablation (RFA) treatment. From January 2007 to December 2009, 83 newly diagnosed HCC patients receiving RFA as their first treatment were enrolled. Five feature selection methods including genetic algorithm (GA), simulated annealing (SA) algorithm, random forests (RF) and hybrid methods (GA+RF and SA+RF) were utilized for selecting an important subset of features from a total of 16 clinical features. These feature selection methods were combined with support vector machine (SVM) for developing predictive models with better performance. Five-fold cross-validation was used to train and test SVM models. The developed SVM-based predictive models with hybrid feature selection methods and 5-fold cross-validation had averages of the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and area under the ROC curve as 67%, 86%, 82%, 69%, 90%, and 0.69, respectively. The SVM derived predictive model can provide suggestive high-risk recurrent patients, who should be closely followed up after complete RFA treatment. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
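The evaluation described above combines 5-fold cross-validation with metrics derived from a confusion matrix. The sketch below shows how 83 samples can be partitioned into five folds and how sensitivity, specificity and the predictive values are defined. It is a generic illustration with hypothetical counts; it omits shuffling and stratification, which a real study would normally add.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def diagnostic_metrics(tp, tn, fp, fn):
    """Standard diagnostic ratios from binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

print([len(test) for _, test in k_fold_indices(83, k=5)])
print(diagnostic_metrics(tp=20, tn=50, fp=10, fn=5))  # hypothetical counts
```

Each sample appears in exactly one test fold, so averaging the per-fold metrics uses every patient once for evaluation.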
Prediction of endoplasmic reticulum resident proteins using fragmented amino acid composition and support vector machine.

PubMed

Kumar, Ravindra; Kumari, Bandana; Kumar, Manish

2017-01-01

The endoplasmic reticulum plays an important role in many cellular processes, including protein synthesis, folding and post-translational processing of newly synthesized proteins. It is also the site for quality control of misfolded proteins and the entry point of extracellular proteins to the secretory pathway. Hence, at any given point of time, the endoplasmic reticulum contains two different cohorts of proteins: (i) proteins involved in endoplasmic reticulum-specific functions, which reside in the lumen of the endoplasmic reticulum and are called endoplasmic reticulum resident proteins, and (ii) proteins which are in the process of moving to the extracellular space. Thus, endoplasmic reticulum resident proteins must somehow be distinguished from newly synthesized secretory proteins, which pass through the endoplasmic reticulum on their way out of the cell. Only about 50% of the proteins used in this study as training data had an endoplasmic reticulum retention signal, which shows that these signals are not necessarily present in all endoplasmic reticulum resident proteins. This also strongly indicates the role of additional factors in the retention of endoplasmic reticulum-specific proteins inside the endoplasmic reticulum. This is a support vector machine based method, in which we used different forms of protein features as inputs to the support vector machine to develop the prediction models. During training, the leave-one-out approach to cross-validation was used. Maximum performance was obtained with a combination of amino acid compositions of different parts of the proteins. In this study, we report a novel support vector machine based method for predicting endoplasmic reticulum resident proteins, named ERPred. During training we achieved a maximum accuracy of 81.42% with the leave-one-out approach to cross-validation. When evaluated on an independent dataset, ERPred achieved a sensitivity of 72.31% and a specificity of 83.69%.
We have also annotated six different proteomes to predict the candidate endoplasmic reticulum resident proteins in them. A webserver, ERPred, was developed to make the method available to the scientific community; it can be accessed at http://proteininformatics.org/mkumar/erpred/index.html. We found that out of 124 proteins in the training dataset, only 66 had endoplasmic reticulum retention signals, which shows that these signals are not an absolute necessity for endoplasmic reticulum resident proteins to remain inside the endoplasmic reticulum. This observation also strongly indicates the role of additional factors in the retention of proteins inside the endoplasmic reticulum. Our proposed predictor, ERPred, is a signal-independent tool. It is tuned for the prediction of endoplasmic reticulum resident proteins, even if the query protein does not contain a specific ER-retention signal.
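The abstract describes features built from amino acid compositions of different parts of a protein. One way this could be sketched is shown below; splitting the sequence into equal consecutive fragments, the function names, and the sample sequence are all assumptions made for illustration, and ERPred's actual fragmentation scheme may differ.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(fragment):
    """Fraction of each standard amino acid in a fragment.

    Non-standard letters count toward the fragment length but get no feature.
    """
    total = len(fragment)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [fragment.count(aa) / total for aa in AMINO_ACIDS]

def fragmented_composition(sequence, parts=2):
    """Concatenate the compositions of `parts` consecutive fragments."""
    sequence = sequence.upper()
    size, extra = divmod(len(sequence), parts)
    features, start = [], 0
    for i in range(parts):
        # Spread any remainder over the leading fragments.
        end = start + size + (1 if i < extra else 0)
        features.extend(composition(sequence[start:end]))
        start = end
    return features

# Hypothetical toy sequence, for illustration only.
vector = fragmented_composition("AAKDEL", parts=2)
print(len(vector))
```

The resulting fixed-length vector (20 values per fragment) is the kind of input a support vector machine can consume regardless of protein length.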
[Identification of special quality eggs with NIR spectroscopy technology based on symbol entropy feature extraction method].

PubMed

Zhao, Yong; Hong, Wen-Xue

2011-11-01

Fast, nondestructive and accurate identification of special quality eggs is an urgent problem. The present paper proposes a new feature extraction method based on symbol entropy to identify near-infrared spectra of special quality eggs. The authors selected normal eggs, free range eggs, selenium-enriched eggs and zinc-enriched eggs as research objects and measured the near-infrared diffuse reflectance spectra in the range of 12 000-4 000 cm⁻¹. Raw spectra were symbolically represented with an aggregation approximation algorithm, and symbolic entropy was extracted as the feature vector. An error-correcting output codes multiclass support vector machine classifier was designed to identify the spectra. The symbolic entropy feature is robust to parameter changes, and the highest recognition rate reaches 100%. The results show that the identification of special quality eggs using near-infrared spectroscopy is feasible and that symbol entropy can be used as a new feature extraction method for near-infrared spectra.
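The abstract does not spell out how the symbolic representation or the entropy is computed. The following is a minimal sketch under stated assumptions: piecewise aggregate approximation followed by equal-width binning into symbols (a simplification; symbolic aggregate approximation normally uses Gaussian breakpoints), then Shannon entropy of the symbol frequencies. All names and the toy spectrum are hypothetical.

```python
import math
from collections import Counter

def paa(values, segments):
    """Piecewise aggregate approximation: mean of each consecutive segment."""
    n = len(values)
    if not 1 <= segments <= n:
        raise ValueError("segments must be between 1 and len(values)")
    bounds = [i * n // segments for i in range(segments + 1)]
    return [sum(values[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]

def symbolize(values, alphabet_size):
    """Map each value to an integer symbol using equal-width bins (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / alphabet_size
    return [min(int((v - lo) / width), alphabet_size - 1) for v in values]

def symbol_entropy(symbols):
    """Shannon entropy (in bits) of the symbol frequency distribution."""
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy spectrum, for illustration only.
spectrum = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
symbols = symbolize(paa(spectrum, segments=4), alphabet_size=4)
print(symbols, symbol_entropy(symbols))
```

The segment count and alphabet size are the tunable parameters in this sketch; the abstract's robustness claim presumably refers to parameters of this kind.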
Support vector machine as a binary classifier for automated object detection in remotely sensed data

NASA Astrophysics Data System (ADS)

Wardaya, P. D.

2014-02-01

In the present paper, the author proposes the application of Support Vector Machine (SVM) for the analysis of satellite imagery. One of the advantages of SVM is that, with limited training data, it may generate comparable or even better results than other methods. The SVM algorithm is used for automated object detection and characterization. Specifically, the SVM is applied in its basic form as a binary classifier that distinguishes two classes, namely object and background. The algorithm aims at effectively detecting an object from its background with minimal training data. A synthetic image containing noise is used for algorithm testing. Furthermore, the algorithm is applied to remote sensing image analysis tasks such as identification of island vegetation, water bodies, and oil spills from satellite imagery. The results indicate that SVM provides fast and accurate analysis with acceptable results.
Prediction of biochar yield from cattle manure pyrolysis via least squares support vector machine intelligent approach.

PubMed

Cao, Hongliang; Xin, Ya; Yuan, Qiaoxia

2016-02-01

To conveniently predict the biochar yield from cattle manure pyrolysis, an intelligent modeling approach was introduced in this research. A traditional artificial neural network (ANN) model and a novel least squares support vector machine (LS-SVM) model were developed. For the identification and prediction evaluation of the models, a data set of 33 experimental data points was used, obtained using a laboratory-scale fixed bed reaction system. The results demonstrated that the intelligent modeling approach is convenient and effective for the prediction of the biochar yield. In particular, the novel LS-SVM model has a more satisfactory predictive performance, and its robustness is better than that of the traditional ANN model. The introduction and application of the LS-SVM modeling method provides a successful example, which is a good reference for modeling studies of the cattle manure pyrolysis process and even other similar processes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Detection of periods of food intake using Support Vector Machines.

PubMed

Lopez-Meyer, Paulo; Schuckers, Stephanie; Makeyev, Oleksandr; Sazonov, Edward

2010-01-01

Studies of obesity and eating disorders need objective tools of Monitoring of Ingestive Behavior (MIB) that can detect and characterize food intake. In this paper we describe detection of food intake by a Support Vector Machine classifier trained on time history of chews and swallows. The training was performed on data collected from 18 subjects in 72 experiments involving eating and other activities (for example, talking). The highest accuracy of detecting food intake (94%) was achieved in configuration where both chews and swallows were used as predictors. Using only swallowing as a predictor resulted in 80% accuracy. Experimental results suggest that these two predictors may be used for differentiation between periods of resting and food intake with a resolution of 30 seconds. Proposed methods may be utilized for development of an accurate, inexpensive, and non-intrusive methodology to objectively monitor food intake in free living conditions.
Method of assessing the state of a rolling bearing based on the relative compensation distance of multiple-domain features and locally linear embedding

NASA Astrophysics Data System (ADS)

Kang, Shouqiang; Ma, Danyang; Wang, Yujing; Lan, Chaofeng; Chen, Qingguo; Mikulovich, V. I.

2017-03-01

To effectively assess different fault locations and different degrees of performance degradation of a rolling bearing with a unified assessment index, a novel state assessment method based on the relative compensation distance of multiple-domain features and locally linear embedding is proposed. First, for a single-sample signal, time-domain and frequency-domain indexes can be calculated for the original vibration signal and each sensitive intrinsic mode function obtained by improved ensemble empirical mode decomposition, and the singular values of the sensitive intrinsic mode function matrix can be extracted by singular value decomposition to construct a high-dimensional hybrid-domain feature vector. Second, a feature matrix can be constructed by arranging each feature vector of multiple samples, the dimensions of each row vector of the feature matrix can be reduced by the locally linear embedding algorithm, and the compensation distance of each fault state of the rolling bearing can be calculated using the support vector machine. Finally, the relative distance between different fault locations and different degrees of performance degradation and the normal-state optimal classification surface can be compensated, and on the basis of the proposed relative compensation distance, the assessment model can be constructed and an assessment curve drawn. Experimental results show that the proposed method can effectively assess different fault locations and different degrees of performance degradation of the rolling bearing under certain conditions.
Disentangling Vector-Borne Transmission Networks: A Universal DNA Barcoding Method to Identify Vertebrate Hosts from Arthropod Bloodmeals

PubMed Central

Alcaide, Miguel; Rico, Ciro; Ruiz, Santiago; Soriguer, Ramón; Muñoz, Joaquín; Figuerola, Jordi

2009-01-01

Emerging infectious diseases represent a challenge for global economies and public health. About one fourth of the last pandemics originated from the spread of vector-borne pathogens. In this sense, the advent of modern molecular techniques has enhanced our capabilities to understand vector-host interactions and disease ecology. However, host identification protocols have profited little from international DNA barcoding initiatives and/or have focused exclusively on a limited array of vector species. Therefore, ascertaining the potential afforded by DNA barcoding tools in other vector-host systems of human and veterinary importance would represent a major advance in tracking pathogen life cycles and hosts. Here, we show the applicability of a novel and efficient molecular method for the identification of the vertebrate host's DNA contained in the midgut of blood-feeding arthropods. To this end, we designed a eukaryote-universal forward primer and a vertebrate-specific reverse primer to selectively amplify 758 base pairs (bp) of the vertebrate mitochondrial Cytochrome c Oxidase Subunit I (COI) gene. Our method was validated using both extensive sequence surveys from the public domain and Polymerase Chain Reaction (PCR) experiments carried out over specimens from different Classes of vertebrates (Mammalia, Aves, Reptilia and Amphibia) and invertebrate ectoparasites (Arachnida and Insecta). The analysis of mosquito, culicoid, phlebotomine, sucking bug, and tick bloodmeals revealed up to 40 vertebrate hosts, including 23 avian, 16 mammalian and one reptilian species. Importantly, the inspection and analysis of direct sequencing electropherograms also assisted the resolving of mixed bloodmeals. We therefore provide a universal and high-throughput diagnostic tool for the study of the ecology of haematophagous invertebrates in relation to their vertebrate hosts.
Such information is crucial to support the efficient management of initiatives aimed at reducing epidemiologic risks of arthropod vector-borne pathogens, a priority for public health. PMID:19768113

Application of machine learning on brain cancer multiclass classification

NASA Astrophysics Data System (ADS)

Panca, V.; Rustam, Z.

2017-07-01

Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a small number of samples. The application of machine learning on microarray gene expression datasets mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, which is improved to solve multiclass classification and is called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features of each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows that this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

NASA Astrophysics Data System (ADS)

Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

2017-07-01

In order to improve the reliability of the spectrum feature extracted by wavelet transform, a method combining wavelet transform (WT) with bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. Besides, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel and rapid non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples of five different kinds of pesticide residues on the surface of lettuce were obtained using Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending), Savitzky-Golay coupled with Standard normalized variable detrending (SG-SNV detrending) were used to preprocess the raw spectra, respectively. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. Moreover, WTC were selected by WT. The results showed that the accuracy of training set, calibration set and the prediction set of the best optimal classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish diagnostic model of different kinds of pesticide residues on lettuce leaves.
Control-group feature normalization for multivariate pattern analysis of structural MRI data using the support vector machine.

PubMed

Linn, Kristin A; Gaonkar, Bilwaj; Satterthwaite, Theodore D; Doshi, Jimit; Davatzikos, Christos; Shinohara, Russell T

2016-05-15

Normalization of feature vector values is a common practice in machine learning. Generally, each feature value is standardized to the unit hypercube or by normalizing to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or by other methods are sensitive to the specific normalization used on the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it utilizes the total variability of these features. This total variation is inevitably dependent on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high dimensional space. In this work we propose an alternate approach that uses an estimate of the control-group standard deviation to normalize features before training. We study our proposed approach in the context of group classification using structural MRI data. We show that control-based normalization leads to better reproducibility of estimated multivariate disease patterns and improves the classifier performance in many cases. Copyright © 2016 Elsevier Inc. All rights reserved.
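The idea of scaling features by control-group variability rather than whole-sample variability could be sketched as follows. Centering on the control-group mean is an assumption here (the abstract mentions only the control-group standard deviation), and the function name and toy data are hypothetical.

```python
import statistics

def control_normalize(rows, is_control):
    """Scale every feature using the mean and standard deviation of control rows only."""
    controls = [row for row, flag in zip(rows, is_control) if flag]
    if len(controls) < 2:
        raise ValueError("need at least two control samples")
    columns = list(zip(*controls))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    return [
        # Features with zero control variance are set to 0.0 to avoid division by zero.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

# Hypothetical data: two control subjects followed by one patient.
rows = [[1.0, 10.0], [3.0, 14.0], [5.0, 30.0]]
print(control_normalize(rows, is_control=[True, True, False]))
```

Because the patient row does not contribute to the scaling statistics, a large group difference is not absorbed into the denominator, which is the attenuation effect the abstract warns about.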
Estimation of salient regions related to chronic gastritis using gastric X-ray images.

PubMed

Togo, Ren; Ishihara, Kenta; Ogawa, Takahiro; Haseyama, Miki

2016-10-01

Since technical knowledge and a high degree of experience are necessary for diagnosis of chronic gastritis, computer-aided diagnosis (CAD) systems that analyze gastric X-ray images are desirable in the field of medicine. Therefore, a new method that estimates salient regions related to chronic gastritis/non-gastritis for supporting diagnosis is presented in this paper. In order to estimate salient regions related to chronic gastritis/non-gastritis, the proposed method monitors the distance between a target image feature and Support Vector Machine (SVM)-based hyperplane for its classification. Furthermore, our method realizes removal of the influence of regions outside the stomach by using positional relationships between the stomach and other organs. Consequently, since the proposed method successfully estimates salient regions of gastric X-ray images for which chronic gastritis and non-gastritis are unknown, visual support for inexperienced clinicians becomes feasible. Copyright © 2016 Elsevier Ltd. All rights reserved.
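The distance between a feature vector and an SVM separating hyperplane, which the method above monitors, has a simple closed form for a linear decision function. The sketch below assumes a linear SVM with known weights and bias; the function name and numbers are illustrative, and a kernel SVM would need the kernel decision value instead.

```python
import math

def signed_distance(weights, bias, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0:
        raise ValueError("weight vector must be non-zero")
    return (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm

# Hypothetical hyperplane and feature vectors, for illustration only.
weights, bias = [3.0, 4.0], -5.0
print(signed_distance(weights, bias, [3.0, 4.0]))
print(signed_distance(weights, bias, [0.0, 0.0]))
```

The sign indicates the predicted side of the hyperplane, and the magnitude indicates how far the sample lies from the decision boundary.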
Summary of Fluidic Thrust Vectoring Research Conducted at NASA Langley Research Center

NASA Technical Reports Server (NTRS)

Deere, Karen A.

2003-01-01

Interest in low-observable aircraft and in lowering an aircraft's exhaust system weight sparked decades of research for fixed geometry exhaust nozzles. The desire for such integrated exhaust nozzles was the catalyst for new fluidic control techniques; including throat area control, expansion control, and thrust-vector angle control. This paper summarizes a variety of fluidic thrust vectoring concepts that have been tested both experimentally and computationally at NASA Langley Research Center. The nozzle concepts are divided into three categories according to the method used for fluidic thrust vectoring: the shock vector control method, the throat shifting method, and the counterflow method. This paper explains the thrust vectoring mechanism for each fluidic method, provides examples of configurations tested for each method, and discusses the advantages and disadvantages of each method.
A New Method of Facial Expression Recognition Based on SPE Plus SVM

NASA Astrophysics Data System (ADS)

Ying, Zilu; Huang, Mingwei; Wang, Zhen; Wang, Zhewei

A novel method of facial expression recognition (FER) is presented, which uses stochastic proximity embedding (SPE) for data dimension reduction and support vector machine (SVM) for expression classification. The proposed algorithm is applied to the Japanese Female Facial Expression (JAFFE) database for FER, and better performance is obtained compared with some traditional algorithms, such as PCA and LDA. The results further prove the effectiveness of the proposed algorithm.
Serological Markers of Sand Fly Exposure to Evaluate Insecticidal Nets against Visceral Leishmaniasis in India and Nepal: A Cluster-Randomized Trial

PubMed Central

Gidwani, Kamlesh; Picado, Albert; Rijal, Suman; Singh, Shri Prakash; Roy, Lalita; Volfova, Vera; Andersen, Elisabeth Wreford; Uranw, Surendra; Ostyn, Bart; Sudarshan, Medhavi; Chakravarty, Jaya; Volf, Petr; Sundar, Shyam; Boelaert, Marleen; Rogers, Matthew Edward

2011-01-01

Background Visceral leishmaniasis is the world's second largest vector-borne parasitic killer and a neglected tropical disease, prevalent in poor communities. Long-lasting insecticidal nets (LNs) are a low cost proven vector intervention method for malaria control; however, their effectiveness against visceral leishmaniasis (VL) is unknown. This study quantified the effect of LNs on exposure to the sand fly vector of VL in India and Nepal during a two-year community intervention trial. Methods As part of a paired-cluster randomized controlled clinical trial in VL-endemic regions of India and Nepal we tested the effect of LNs on sand fly biting by measuring the antibody response of subjects to the saliva of Leishmania donovani vector Phlebotomus argentipes and the sympatric (non-vector) Phlebotomus papatasi. Fifteen to 20 individuals above 15 years of age from 26 VL endemic clusters were asked to provide a blood sample at baseline, 12 and 24 months post-intervention. Results A total of 305 individuals were included in the study, 68 participants provided two blood samples and 237 gave three samples. A random effect linear regression model showed that cluster-wide distribution of LNs reduced exposure to P. argentipes by 12% at 12 months (effect 0.88; 95% CI 0.83–0.94) and 9% at 24 months (effect 0.91; 95% CI 0.80–1.02) in the intervention group compared to control adjusting for baseline values and pair. Similar results were obtained for P. papatasi. Conclusions This trial provides evidence that LNs have a limited effect on sand fly exposure in VL endemic communities in India and Nepal and supports the use of sand fly saliva antibodies as a marker to evaluate vector control interventions. PMID:21931871
Vector-model-supported approach in prostate plan optimization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Eva Sau Fan; Department of Health Technology and Informatics, The Hong Kong Polytechnic University; Wu, Vincent Wing Cheung

Lengthy time consumed in traditional manual plan optimization can limit the use of step-and-shoot intensity-modulated radiotherapy/volumetric-modulated radiotherapy (S&S IMRT/VMAT). A vector model base, retrieving similar radiotherapy cases, was developed with respect to the structural and physiologic features extracted from the Digital Imaging and Communications in Medicine (DICOM) files. Planning parameters were retrieved from the selected similar reference case and applied to the test case to bypass the gradual adjustment of planning parameters. Therefore, the planning time spent on the traditional trial-and-error manual optimization approach in the beginning of optimization could be reduced. Each S&S IMRT/VMAT prostate reference database comprised 100 previously treated cases. Prostate cases were replanned with both traditional optimization and vector-model-supported optimization based on the oncologists' clinical dose prescriptions. A total of 360 plans, which consisted of 30 cases of S&S IMRT, 30 cases of 1-arc VMAT, and 30 cases of 2-arc VMAT plans including first optimization and final optimization with/without vector-model-supported optimization, were compared using the 2-sided t-test and paired Wilcoxon signed rank test, with a significance level of 0.05 and a false discovery rate of less than 0.05. For S&S IMRT, 1-arc VMAT, and 2-arc VMAT prostate plans, there was a significant reduction in the planning time and iteration with vector-model-supported optimization by almost 50%. When the first optimization plans were compared, 2-arc VMAT prostate plans had better plan quality than 1-arc VMAT plans. The volume receiving 35 Gy in the femoral head for 2-arc VMAT plans was reduced with the vector-model-supported optimization compared with the traditional manual optimization approach. Otherwise, the quality of plans from both approaches was comparable.
Vector-model-supported optimization was shown to offer much shortened planning time and iteration number without compromising the plan quality.
A helper virus-free HSV-1 vector containing the vesicular glutamate transporter-1 promoter supports expression preferentially in VGLUT1-containing glutamatergic neurons.

PubMed

Zhang, Guo-rong; Geller, Alfred I

2010-05-17

Multiple potential uses of direct gene transfer into neurons require restricting expression to specific classes of glutamatergic neurons. Thus, it is desirable to develop vectors containing glutamatergic class-specific promoters. The three vesicular glutamate transporters (VGLUTs) are expressed in distinct populations of neurons, and VGLUT1 is the predominant VGLUT in the neocortex, hippocampus, and cerebellar cortex. We previously reported a plasmid (amplicon) Herpes Simplex Virus (HSV-1) vector that placed the Lac Z gene under the regulation of the VGLUT1 promoter (pVGLUT1lac). Using helper virus-free vector stocks, we showed that this vector supported approximately 90% glutamatergic neuron-specific expression in postrhinal (POR) cortex, in rats sacrificed at either 4 days or 2 months after gene transfer. We now show that pVGLUT1lac supports expression preferentially in VGLUT1-containing glutamatergic neurons. pVGLUT1lac vector stock was injected into either POR cortex, which contains primarily VGLUT1-containing glutamatergic neurons, or into the ventral medial hypothalamus (VMH), which contains predominantly VGLUT2-containing glutamatergic neurons. Rats were sacrificed at 4 days after gene transfer, and the types of cells expressing β-galactosidase were determined by immunofluorescent costaining. Cell counts showed that pVGLUT1lac supported expression in approximately 10-fold more cells in POR cortex than in the VMH, whereas a control vector supported expression in similar numbers of cells in these two areas. Further, in POR cortex, pVGLUT1lac supported expression predominantly in VGLUT1-containing neurons, and, in the VMH, pVGLUT1lac showed an approximately 10-fold preference for the rare VGLUT1-containing neurons. VGLUT1-specific expression may benefit specific experiments on learning or specific gene therapy approaches, particularly in the neocortex. Copyright 2010 Elsevier B.V. All rights reserved.
Development of a sugar-binding residue prediction system from protein sequences using support vector machine.

PubMed

Banno, Masaki; Komiyama, Yusuke; Cao, Wei; Oku, Yuya; Ueki, Kokoro; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro

2017-02-01

Several methods have been proposed for protein-sugar binding site prediction using machine learning algorithms. However, they are not effective at learning the various properties of binding site residues caused by various interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. By using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acidic sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor which combines the results of the two predictors. We showed that when a sugar is known to be an acidic sugar, the acidic sugar binding predictor achieves the best performance, and showed that when a sugar is known to be a nonacidic sugar or is not known to be either of the two classes, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. Support vector machine was used as the machine learning algorithm, and the position-specific scoring matrix created by the position-specific iterative basic local alignment search tool was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have launched our system as an open source freeware tool on the GitHub repository (https://doi.org/10.5281/zenodo.61513). Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
MGRA: Motion Gesture Recognition via Accelerometer.

PubMed

Hong, Feng; You, Shujuan; Wei, Meiyu; Zhang, Yongtuo; Guo, Zhongwen

2016-04-13

Accelerometers have been widely embedded in most current mobile devices, enabling easy and intuitive operations. This paper proposes a Motion Gesture Recognition system (MGRA) based on accelerometer data only, which is entirely implemented on mobile devices and can provide users with real-time interactions. A robust and unique feature set is enumerated through the time domain, the frequency domain and singular value decomposition analysis using our motion gesture set containing 11,110 traces. The best feature vector for classification is selected, taking both static and mobile scenarios into consideration. MGRA exploits support vector machine as the classifier with the best feature vector. Evaluations confirm that MGRA can accommodate a broad set of gesture variations within each class, including execution time, amplitude and non-gestural movement. Extensive evaluations confirm that MGRA achieves higher accuracy under both static and mobile scenarios and costs less computation time and energy on an LG Nexus 5 than previous methods.
Integrated pest management and allocation of control efforts for vector-borne diseases

USGS Publications Warehouse

Ginsberg, H.S.

2001-01-01

Applications of various control methods were evaluated to determine how to integrate methods so as to minimize the number of human cases of vector-borne diseases. These diseases can be controlled by lowering the number of vector-human contacts (e.g., by pesticide applications or use of repellents), or by lowering the proportion of vectors infected with pathogens (e.g., by lowering or vaccinating reservoir host populations). Control methods should be combined in such a way as to most efficiently lower the probability of human encounter with an infected vector. Simulations using a simple probabilistic model of pathogen transmission suggest that the most efficient way to integrate different control methods is to combine methods that have the same effect (e.g., combine treatments that lower the vector population; or combine treatments that lower pathogen prevalence in vectors). Combining techniques that have different effects (e.g., a technique that lowers vector populations with a technique that lowers pathogen prevalence in vectors) will be less efficient than combining two techniques that both lower vector populations or combining two techniques that both lower pathogen prevalence, costs being the same. Costs of alternative control methods generally differ, so the efficiency of various combinations at lowering human contact with infected vectors should be estimated at available funding levels. Data should be collected from initial trials to improve the effects of subsequent interventions on the number of human cases.
Versatile generation of optical vector fields and vector beams using a non-interferometric approach.

PubMed

Tripathi, Santosh; Toussaint, Kimani C

2012-05-07

We present a versatile, non-interferometric method for generating vector fields and vector beams which can produce all the states of polarization represented on a higher-order Poincaré sphere. The versatility and non-interferometric nature of this method is expected to enable exploration of various exotic properties of vector fields and vector beams. To illustrate this, we study the propagation properties of some vector fields and find that, in general, propagation alters both their intensity and polarization distribution, and more interestingly, converts some vector fields into vector beams. In the article, we also suggest a modified Jones vector formalism to represent vector fields and vector beams.
The classification of the patients with pulmonary diseases using breath air samples spectral analysis

NASA Astrophysics Data System (ADS)

Kistenev, Yury V.; Borisov, Alexey V.; Kuzmin, Dmitry A.; Bulanova, Anna A.

2016-08-01

A technique for exhaled breath sampling is discussed. A procedure for wavelength auto-calibration is proposed and tested. The experimental data are compared with the model absorption spectra of 5% CO2. The classification results for three study groups, obtained using support vector machine and principal component analysis methods, are presented.
Methods for Scaling to Doubly Stochastic Form,

DTIC Science & Technology

1981-06-26

Frobenius-König Theorem (Marcus and Minc [1964], p. 97): A nonnegative n × n matrix without support contains an s × t zero submatrix where s + t = n + 1 ... that YA(k) has row sums 1. Then normalize the columns by a diagonal similarity transform defined as follows: Let x = (x_1, ..., x_n) be a left Perron vector
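The excerpt above is too fragmentary to reconstruct the report's own procedure. As a rough illustration of what scaling to doubly stochastic form means, the sketch below uses the classical Sinkhorn-Knopp alternating normalization, which is a swapped-in technique and not necessarily the method of the report. The function name, the iteration count, and the 2 × 2 example matrix are assumptions chosen for illustration, and the input is assumed to be strictly positive.

```python
def sinkhorn_knopp(matrix, iterations=200):
    """Alternately normalize rows and columns of a strictly positive square matrix.

    Illustrative sketch only: assumes every entry is > 0 so that the
    iteration converges to a doubly stochastic matrix.
    """
    a = [row[:] for row in matrix]  # work on a copy, leave the input untouched
    n = len(a)
    for _ in range(iterations):
        # Scale each row so that it sums to 1.
        for i in range(n):
            s = sum(a[i])
            a[i] = [v / s for v in a[i]]
        # Scale each column so that it sums to 1.
        for j in range(n):
            s = sum(a[i][j] for i in range(n))
            for i in range(n):
                a[i][j] /= s
    return a


# Hypothetical example matrix.
scaled = sinkhorn_knopp([[1.0, 2.0], [3.0, 4.0]])
row_sums = [sum(row) for row in scaled]
col_sums = [sum(row[j] for row in scaled) for j in range(2)]
```

After enough iterations both `row_sums` and `col_sums` should be close to 1 for a positive input; matrices with zero entries need the support conditions that the theorem above is concerned with.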
Discrimination of crop and weeds on visible and visible/near-infrared spectrums using support vector machines, artificial neural network and decision tree

USDA-ARS's Scientific Manuscript database

Weeds are regarded as farmers' natural enemy. In order to avoid excessive pesticide residues, the destruction of ecological environment, and to guarantee the quality and safety of agricultural products, it is urgent to develop highly-efficient weed management methods. Amongst, weed discrimination is...
Emotion detection from text

NASA Astrophysics Data System (ADS)

Ramalingam, V. V.; Pandian, A.; Jaiswal, Abhijeet; Bhatia, Nikhar

2018-04-01

This paper presents a novel machine-learning method for emotion detection from text that uses several Support Vector Machine algorithms; the major emotions described are linked to WordNet for enhanced accuracy. The proposed approach could help augment artificial intelligence in the near future and could be vital in optimizing human-machine interfaces.
Prediction of plant pre-microRNAs and their microRNAs in genome-scale sequences using structure-sequence features and support vector machine.

PubMed

Meng, Jun; Liu, Dong; Sun, Chao; Luan, Yushi

2014-12-30

MicroRNAs (miRNAs) are a family of non-coding RNAs approximately 21 nucleotides in length that play pivotal roles at the post-transcriptional level in animals, plants and viruses. These molecules silence their target genes by degrading transcription or suppressing translation. Studies have shown that miRNAs are involved in biological responses to a variety of biotic and abiotic stresses. Identification of these molecules and their targets can aid the understanding of regulatory processes. Recently, prediction methods based on machine learning have been widely used for miRNA prediction. However, most of these methods were designed for mammalian miRNA prediction, and few are available for predicting miRNAs in the pre-miRNAs of specific plant species. Although the complete Solanum lycopersicum genome has been published, only 77 Solanum lycopersicum miRNAs have been identified, far less than the estimated number. Therefore, it is essential to develop a prediction method based on machine learning to identify new plant miRNAs. A novel classification model based on a support vector machine (SVM) was trained to identify real and pseudo plant pre-miRNAs together with their miRNAs. An initial set of 152 novel features related to sequential structures was used to train the model. By applying feature selection, we obtained the best subset of 47 features for use with the Back Support Vector Machine-Recursive Feature Elimination (B-SVM-RFE) method for the classification of plant pre-miRNAs. Using this method, 63 features were obtained for plant miRNA classification. We then developed an integrated classification model, miPlantPreMat, which comprises MiPlantPre and MiPlantMat, to identify plant pre-miRNAs and their miRNAs. 
This model achieved approximately 90% accuracy using plant datasets from nine plant species, including Arabidopsis thaliana, Glycine max, Oryza sativa, Physcomitrella patens, Medicago truncatula, Sorghum bicolor, Arabidopsis lyrata, Zea mays and Solanum lycopersicum. Using miPlantPreMat, 522 Solanum lycopersicum miRNAs were identified in the Solanum lycopersicum genome sequence. We developed an integrated classification model, miPlantPreMat, based on structure-sequence features and SVM. MiPlantPreMat was used to identify both plant pre-miRNAs and the corresponding mature miRNAs. An improved feature selection method was proposed, resulting in high classification accuracy, sensitivity and specificity.
A hybrid approach to select features and classify diseases based on medical data

NASA Astrophysics Data System (ADS)

AbdelLatif, Hisham; Luo, Jiawei

2018-03-01

Feature selection is a common problem in the classification of diseases in clinical medicine. Here, we develop a hybrid methodology to classify diseases, based on three medical datasets: Arrhythmia, Breast cancer, and Hepatitis. This methodology, called k-means ANOVA Support Vector Machine (K-ANOVA-SVM), uses K-means clustering with ANOVA statistics to preprocess the data and select the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we chose three classification algorithms (decision tree, Naïve Bayes, and Support Vector Machines) and applied the medical datasets directly to these algorithms. Our methodology gave much better classification accuracy, 98% on the Arrhythmia dataset, 92% on the Breast cancer dataset and 88% on the Hepatitis dataset, compared to using the medical data directly with decision tree, Naïve Bayes, and Support Vector Machines. The ROC curve and precision of K-ANOVA-SVM were also better than those of the other algorithms.
Alpharetroviral Self-inactivating Vectors: Long-term Transgene Expression in Murine Hematopoietic Cells and Low Genotoxicity

PubMed Central

Suerth, Julia D; Maetzig, Tobias; Brugman, Martijn H; Heinz, Niels; Appelt, Jens-Uwe; Kaufmann, Kerstin B; Schmidt, Manfred; Grez, Manuel; Modlich, Ute; Baum, Christopher; Schambach, Axel

2012-01-01

Comparative integrome analyses have highlighted alpharetroviral vectors with a relatively neutral, and thus favorable, integration spectrum. However, previous studies used alpharetroviral vectors harboring viral coding sequences and intact long-terminal repeats (LTRs). We recently developed self-inactivating (SIN) alpharetroviral vectors with an advanced split-packaging design. In a murine bone marrow (BM) transplantation model we now compared alpharetroviral, gammaretroviral, and lentiviral SIN vectors and showed that all vectors transduced hematopoietic stem cells (HSCs), leading to comparable, sustained multilineage transgene expression in primary and secondary transplanted mice. Alpharetroviral integrations were decreased near transcription start sites, CpG islands, and potential cancer genes compared with gammaretroviral, and decreased in genes compared with lentiviral integrations. Analyzing the transcriptome and intragenic integrations in engrafting cells, we observed stronger correlations between in-gene integration targeting and transcriptional activity for gammaretroviral and lentiviral vectors than for alpharetroviral vectors. Importantly, the relatively “extragenic” alpharetroviral integration pattern still supported long-term transgene expression upon serial transplantation. Furthermore, sensitive genotoxicity studies revealed a decreased immortalization incidence compared with gammaretroviral and lentiviral SIN vectors. We conclude that alpharetroviral SIN vectors have a favorable integration pattern which lowers the risk of insertional mutagenesis while supporting long-term transgene expression in the progeny of transplanted HSCs. PMID:22334016

A targeted change-detection procedure by combining change vector analysis and post-classification approach

NASA Astrophysics Data System (ADS)

Ye, Su; Chen, Dongmei; Yu, Jie

2016-04-01

In remote sensing, conventional supervised change-detection methods usually require effective training data for multiple change types. This paper introduces a more flexible and efficient procedure that seeks to identify only the changes that users are interested in, hereafter referred to as "targeted change detection". Based on a one-class classifier, "Support Vector Domain Description (SVDD)", a novel algorithm named "Three-layer SVDD Fusion (TLSF)" is developed specifically for targeted change detection. The proposed algorithm combines one-class classification generated from change vector maps, as well as before- and after-change images, in order to get a more reliable detection result. In addition, this paper introduces a detailed workflow for implementing this algorithm. This workflow has been applied to two case studies with different practical monitoring objectives: urban expansion and forest fire assessment. The experimental results of these two case studies show that the overall accuracy of our proposed algorithm is superior (Kappa statistics are 86.3% and 87.8% for Case 1 and 2, respectively), compared to applying SVDD to change vector analysis and post-classification comparison.
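For readers unfamiliar with change vector analysis, the sketch below shows only its basic ingredient: the per-pixel magnitude of the spectral difference between two dates, followed by a simple threshold. It does not reproduce the SVDD classifier or the TLSF fusion described in the abstract. The pixel values, the band count, and the threshold are hypothetical.

```python
import math


def change_magnitude(before, after):
    """Euclidean length of the per-pixel spectral change vector."""
    return math.sqrt(sum((b2 - b1) ** 2 for b1, b2 in zip(before, after)))


# Two hypothetical pixels, each with three spectral bands at two dates.
pixels_before = [[0.10, 0.20, 0.30], [0.40, 0.40, 0.40]]
pixels_after = [[0.10, 0.20, 0.30], [0.10, 0.80, 0.40]]

threshold = 0.25  # illustrative value; in practice it would be tuned per scene

magnitudes = [change_magnitude(b, a) for b, a in zip(pixels_before, pixels_after)]
changed = [m > threshold for m in magnitudes]
```

Here the first pixel is unchanged and the second is flagged. A targeted procedure such as the one in the abstract would replace the fixed threshold with a one-class classifier trained on the change type of interest.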
Discriminant analysis for fast multiclass data classification through regularized kernel function approximation.

PubMed

Ghorai, Santanu; Mukherjee, Anirban; Dutta, Pranab K

2010-06-01

In this brief, we propose multiclass data classification by computationally inexpensive discriminant analysis through vector-valued regularized kernel function approximation (VVRKFA). VVRKFA, an extension of fast regularized kernel function approximation (FRKFA), provides the vector-valued response in a single step. VVRKFA finds a linear operator and a bias vector by using a reduced kernel that maps a pattern from feature space into the low-dimensional label space. The classification of patterns is carried out in this low-dimensional label subspace. A test pattern is classified according to its proximity to class centroids. The effectiveness of the proposed method is experimentally verified and compared with multiclass support vector machine (SVM) on several benchmark data sets as well as on gene microarray data for multi-category cancer classification. The results indicate a significant improvement in both training and testing time compared to multiclass SVM, with comparable testing accuracy, principally on large data sets. Experiments in this brief also serve as a comparison of the performance of VVRKFA with stratified random sampling and sub-sampling.
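The final step described above, assigning a test pattern to the class whose centroid is nearest, can be sketched as follows. The mapping into the label space (the kernel approximation itself) is not reproduced; the points below are assumed to be already mapped, and all names and values are hypothetical.

```python
import math


def class_centroids(points, labels):
    """Mean vector of the points belonging to each class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for k, v in enumerate(p):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def classify(point, centroids):
    """Label of the centroid closest to the point (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(point, centroids[y]))


# Hypothetical patterns, assumed to be already mapped into a 2-D label space.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["a", "a", "b", "b"]

centroids = class_centroids(points, labels)
pred_near_a = classify([1.0, 1.0], centroids)
pred_near_b = classify([5.0, 4.0], centroids)
```

Because classification reduces to one distance computation per class, the test-time cost grows with the number of classes rather than with the number of training patterns.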
Vector control for malaria and other mosquito-borne diseases. Report of a WHO study group.

PubMed

1995-01-01

Since the Ministerial Conference on Malaria in 1992, which acknowledged the urgent need for worldwide commitment to malaria control, efforts have been directed to implementation of a Global Malaria Control Strategy. Vector control, an essential component of malaria control, has become less effective in recent years, partly as a result of poor use of alternative control tools, inappropriate use of insecticides, lack of an epidemiological basis for interventions, inadequate resources and infrastructure, and weak management. Changing environmental conditions, the behavioural characteristics of certain vectors, and resistance to insecticides have added to the difficulties. This report of a WHO Study Group provides guidelines for the planning, implementation and evaluation of cost-effective and sustainable vector control in the context of the Global Malaria Control Strategy. It reviews the available methods - indoor residual spraying, personal protection, larval control and environmental management - stressing the need for selective and flexible use of interventions according to local conditions. Requirements for data collection and the appropriate use of entomological parameters and techniques are discussed and priorities identified for the development of local capacity for vector control and for operational research. Emphasis is placed both on the monitoring and evaluation of vector control to ensure cost-effectiveness and on the development of strong managerial structures, which can support community participation and intersectoral collaboration and accommodate the control of other vector-borne diseases. The report concludes with recommendations aimed at promoting the targeted and efficient use of vector control in preventing and controlling malaria, thereby reducing the threat to health and socioeconomic development in many tropical countries.
A region-based segmentation of tumour from brain CT images using nonlinear support vector machine classifier.

PubMed

Nanthagopal, A Padma; Rajamony, R Sukanesh

2012-07-01

The proposed system provides new textural information for segmenting tumours, efficiently and accurately and with less computational time, from benign and malignant tumour images, especially in smaller dimensions of tumour regions of computed tomography (CT) images. Region-based segmentation of tumour from brain CT image data is an important but time-consuming task performed manually by medical experts. The objective of this work is to segment brain tumour from CT images using combined grey and texture features with new edge features and nonlinear support vector machine (SVM) classifier. The selected optimal features are used to model and train the nonlinear SVM classifier to segment the tumour from computed tomography images and the segmentation accuracies are evaluated for each slice of the tumour image. The method is applied on real data of 80 benign, malignant tumour images. The results are compared with the radiologist labelled ground truth. Quantitative analysis between ground truth and the segmented tumour is presented in terms of segmentation accuracy and the overlap similarity measure dice metric. From the analysis and performance measures such as segmentation accuracy and dice metric, it is inferred that better segmentation accuracy and higher dice metric are achieved with the normalized cut segmentation method than with the fuzzy c-means clustering method.
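The two evaluation measures named in the abstract, segmentation accuracy and the Dice overlap metric, can be computed from a predicted mask and a ground-truth mask as sketched below. The flattened binary masks are hypothetical; real CT slices would be 2-D arrays.

```python
def dice(pred_mask, truth_mask):
    """Dice similarity: 2 * |A ∩ B| / (|A| + |B|) for two binary masks."""
    a = {i for i, v in enumerate(pred_mask) if v}
    b = {i for i, v in enumerate(truth_mask) if v}
    if not a and not b:
        return 1.0  # two empty masks are treated as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))


def pixel_accuracy(pred_mask, truth_mask):
    """Fraction of pixels on which the two masks agree."""
    agree = sum(p == t for p, t in zip(pred_mask, truth_mask))
    return agree / len(truth_mask)


# Hypothetical flattened masks: 1 = tumour, 0 = background.
truth = [0, 1, 1, 1, 0, 0]
pred = [0, 1, 1, 0, 1, 0]

dice_score = dice(pred, truth)
accuracy = pixel_accuracy(pred, truth)
```

Dice ignores the background pixels that both masks agree on, so it is usually the more informative number when the tumour occupies a small part of the image.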
[A prediction model for the activity of insecticidal crystal proteins from Bacillus thuringiensis based on support vector machine].

PubMed

Lin, Yi; Cai, Fu-Ying; Zhang, Guang-Ya

2007-01-01

A quantitative structure-property relationship (QSPR) model in terms of amino acid composition and the activity of Bacillus thuringiensis insecticidal crystal proteins was established. Support vector machine (SVM) is a novel general machine-learning tool based on the structural risk minimization principle that exhibits good generalization when fault samples are few; it is especially suitable for classification, forecasting, and estimation in cases where small amounts of samples are involved, such as fault diagnosis. However, some parameters of SVM are selected based on the experience of the operator, which has led to decreased efficiency of SVM in practical application. The uniform design (UD) method was applied to optimize the running parameters of SVM. It was found that the average accuracy rate approached 73% when the penalty factor was 0.01, the epsilon 0.2, the gamma 0.05, and the range 0.5. The results indicated that UD might be used as an effective method to optimize the parameters of SVM, and that SVM could be used as an alternative powerful modeling tool for QSPR studies of the activity of Bacillus thuringiensis (Bt) insecticidal crystal proteins. Therefore, a novel method for predicting the insecticidal activity of Bt insecticidal crystal proteins was proposed by the authors of this study.
Lithium-ion battery state of health monitoring and remaining useful life prediction based on support vector regression-particle filter

NASA Astrophysics Data System (ADS)

Dong, Hancheng; Jin, Xiaoning; Lou, Yangbing; Wang, Changhong

2014-12-01

Lithium-ion batteries are used as the main power source in many electronic and electrical devices. In particular, with the growth in battery-powered electric vehicle development, the lithium-ion battery plays a critical role in the reliability of vehicle systems. In order to provide timely maintenance and replacement of battery systems, it is necessary to develop a reliable and accurate battery health diagnostic that takes a prognostic approach. Therefore, this paper focuses on two main methods to determine a battery's health: (1) Battery State-of-Health (SOH) monitoring and (2) Remaining Useful Life (RUL) prediction. Both of these are calculated by using a filter algorithm known as the Support Vector Regression-Particle Filter (SVR-PF). Models for battery SOH monitoring based on SVR-PF are developed with novel capacity degradation parameters introduced to determine battery health in real time. Moreover, the RUL prediction model is proposed, which is able to provide the RUL value and update the RUL probability distribution to the End-of-Life cycle. Results for both methods are presented, showing that the proposed SOH monitoring and RUL prediction methods have good performance and that the SVR-PF has better monitoring and prediction capability than the standard particle filter (PF).
PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine

PubMed Central

Manavalan, Balachandran; Shin, Tae H.; Lee, Gwang

2018-01-01

Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html. PMID:29616000
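Leave-one-out cross-validation, the evaluation protocol mentioned above, holds out each sample once and trains on the rest. The sketch below illustrates the protocol only; to stay dependency-free it swaps in a 1-nearest-neighbour classifier for the SVM, and the two-cluster data set is hypothetical.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Label of the training sample closest to the query (stand-in for an SVM)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the remainder, and score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical feature vectors forming two well-separated classes.
xs = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
ys = [0, 0, 0, 1, 1, 1]

loo_accuracy = leave_one_out_accuracy(xs, ys)
```

With n samples the model is fitted n times, which is affordable for data sets of a few hundred sequences but becomes expensive for large ones.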
PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine.

PubMed

Manavalan, Balachandran; Shin, Tae H; Lee, Gwang

2018-01-01

Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.
Use seismic colored inversion and power law committee machine based on imperial competitive algorithm for improving porosity prediction in a heterogeneous reservoir

NASA Astrophysics Data System (ADS)

Ansari, Hamid Reza

2014-09-01

In this paper we propose a new method for predicting rock porosity based on a combination of several artificial intelligence systems. The method focuses on one of the Iranian carbonate fields in the Persian Gulf. Because there is strong heterogeneity in carbonate formations, estimation of rock properties is more challenging than in sandstone. For this purpose, seismic colored inversion (SCI) and a new committee machine approach are used in order to improve porosity estimation. The study comprises three major steps. First, a series of sample-based attributes is calculated from the 3D seismic volume. Acoustic impedance is an important attribute that is obtained by the SCI method in this study. Second, the porosity log is predicted from seismic attributes using common intelligent computation systems including: probabilistic neural network (PNN), radial basis function network (RBFN), multi-layer feed forward network (MLFN), ε-support vector regression (ε-SVR) and adaptive neuro-fuzzy inference system (ANFIS). Finally, a power law committee machine (PLCM) is constructed based on imperial competitive algorithm (ICA) to combine the results of all previous predictions in a single solution. This technique is called PLCM-ICA in this paper. The results show that the PLCM-ICA model improved on the results of the neural networks, support vector machine and neuro-fuzzy system.
Kernel parameter variation-based selective ensemble support vector data description for oil spill detection on the ocean via hyperspectral imaging

NASA Astrophysics Data System (ADS)

Uslu, Faruk Sukru

2017-07-01

Oil spills on the ocean surface cause serious environmental, political, and economic problems. Therefore, these catastrophic threats to marine ecosystems require detection and monitoring. Hyperspectral sensors are powerful optical sensors used for oil spill detection with the help of detailed spectral information of materials. However, huge amounts of data in hyperspectral imaging (HSI) require fast and accurate computation methods for detection problems. Support vector data description (SVDD) is one of the most suitable methods for detection, especially for large data sets. Nevertheless, the selection of kernel parameters is one of the main problems in SVDD. This paper presents a method, inspired by ensemble learning, for improving performance of SVDD without tuning its kernel parameters. Additionally, a classifier selection technique is proposed to get more gain. The proposed approach also aims to solve the small sample size problem, which is very important for processing high-dimensional data in HSI. The algorithm is applied to two HSI data sets for detection problems. In the first HSI data set, various targets are detected; in the second HSI data set, oil spill detection in situ is realized. The experimental results demonstrate the feasibility and performance improvement of the proposed algorithm for oil spill detection problems.
Using support vector machines to improve elemental ion identification in macromolecular crystal structures

DOE PAGES

Morshed, Nader; Echols, Nathaniel; Adams, Paul D.

2015-04-25

In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.
Cost-effectiveness of environmental management for vector control in resource development projects.

PubMed

Bos, R

1991-01-01

Vector control methods are traditionally divided into chemical, biological and environmental management approaches, and this distinction is also reflected in certain financial and economic aspects. This is particularly true for environmental modification, usually engineering or other structural works. It is highly capital intensive, as opposed to chemical and biological control which require recurrent expenditures, and discount rates are therefore a prominent consideration in deciding for one or the other approach. Environmental manipulation requires recurrent action, but can often be carried out with community participation, which raises the issue of opportunity costs. The incorporation of environmental management in resource projects is generally impeded by economic considerations. The Internal Rate of Return continues to be a crucial criterion for funding agencies and development banks to support new projects; at the same time Governments of debt-ridden countries in the Third World will do their best to avoid additional loans on such frills as environmental and health safeguards. Two approaches can be recommended to nevertheless ensure the incorporation of environmental management measures in resource projects in an affordable way. First, there are several examples of cases where environmental management measures either have a dual benefit (increasing agricultural production and reducing vector-borne disease transmission) or can be implemented at zero cost. Second, the additional costs involved in structural modifications can be separated from the project development costs considered in the calculations of the Internal Rate of Return, and financial support can be sought from bilateral technical cooperation agencies particularly interested in environmental and health issues. There is a dearth of information on the cost-effectiveness of alternative vector control strategies in the developing country context.
The process of integrating vector control in the general health services will make it even more difficult to gain a clear insight into the matter.
A Discriminant Distance Based Composite Vector Selection Method for Odor Classification

PubMed Central

Choi, Sang-Il; Jeong, Gu-Min

2014-01-01

We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from an electronic nose data sample is evaluated by computing the discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished, and the final composite features for odor classification are extracted using the selected composite vectors. Using only the informative composite vectors, instead of all the generated composite vectors, can also help to extract better composite features. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods. PMID:24747735
Killing-Yano tensors in spaces admitting a hypersurface orthogonal Killing vector

NASA Astrophysics Data System (ADS)

Garfinkle, David; Glass, E. N.

2013-03-01

Methods are presented for finding Killing-Yano tensors, conformal Killing-Yano tensors, and conformal Killing vectors in spacetimes with a hypersurface orthogonal Killing vector. These methods are similar to a method developed by the authors for finding Killing tensors. In all cases one decomposes both the tensor and the equation it satisfies into pieces along the Killing vector and pieces orthogonal to the Killing vector. Solving the separate equations that result from this decomposition requires less computing than integrating the original equation. In each case, examples are given to illustrate the method.
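For orientation, the standard defining equations of the objects named in this abstract are collected below. These are textbook definitions, not taken from the paper; the symbols (ξ for the vector, f for the rank-2 Killing-Yano tensor, φ for the conformal factor) are notation chosen here, and round and square brackets denote symmetrization and antisymmetrization.

```latex
% Killing vector \xi^a
\nabla_{(a}\xi_{b)} = 0

% Hypersurface orthogonality of \xi^a
\xi_{[a}\nabla_{b}\xi_{c]} = 0

% Conformal Killing vector, for some scalar function \phi
\nabla_{(a}\xi_{b)} = \phi\, g_{ab}

% Rank-2 Killing-Yano tensor f_{ab}
f_{ab} = -f_{ba}, \qquad \nabla_{(a} f_{b)c} = 0
```

The decomposition described in the abstract splits such tensor equations into components along ξ and components orthogonal to it.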
Daily sea level prediction at Chiayi coast, Taiwan using extreme learning machine and relevance vector machine

NASA Astrophysics Data System (ADS)

Imani, Moslem; Kao, Huan-Chin; Lan, Wen-Hau; Kuo, Chung-Yen

2018-02-01

The analysis and the prediction of sea level fluctuations are core requirements of marine meteorology and operational oceanography. Estimates of sea level with hours-to-days warning times are especially important for low-lying regions and coastal zone management. The primary purpose of this study is to examine the applicability and capability of extreme learning machine (ELM) and relevance vector machine (RVM) models for predicting sea level variations and compare their performances with powerful machine learning methods, namely, support vector machine (SVM) and radial basis function (RBF) models. The input dataset from the period of January 2004 to May 2011 used in the study was obtained from the Dongshi tide gauge station in Chiayi, Taiwan. Results showed that the ELM and RVM models outperformed the other methods. The performance of the RVM approach was superior in predicting the daily sea level time series, given the minimum root mean square error of 34.73 mm and the maximum determination coefficient (R²) of 0.93 during the testing periods. Furthermore, the obtained results were in close agreement with the original tide-gauge data, which indicates that the RVM approach is a promising alternative method for time series prediction and could be successfully used for daily sea level forecasts.
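The two scores quoted in this abstract, root mean square error and the determination coefficient, can be computed from observed and predicted series as sketched below. The sea level values are hypothetical, and the determination coefficient is taken here as 1 − SS_res/SS_tot, which is an assumption: some studies report the squared correlation coefficient instead.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the data."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical daily sea level anomalies in millimetres.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

error_mm = rmse(observed, predicted)
r2 = r_squared(observed, predicted)
```

RMSE is reported in the data's own units (millimetres here), whereas the determination coefficient is unitless, which is why studies such as this one usually quote both.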
Encoding the local connectivity patterns of fMRI for cognitive task and state classification.

PubMed

Onal Ertugrul, Itir; Ozay, Mete; Yarman Vural, Fatos T

2018-06-15

In this work, we propose a novel framework to encode the local connectivity patterns of the brain, using Fisher vector (FV), vector of locally aggregated descriptors (VLAD) and bag-of-words (BoW) methods. We first obtain local descriptors, called mesh arc descriptors (MADs), from fMRI data by forming local meshes around anatomical regions and estimating their relationship within a neighborhood. Then, we extract a dictionary of relationships, called the brain connectivity dictionary, by fitting a generative Gaussian mixture model (GMM) to a set of MADs and selecting codewords at the mean of each component of the mixture. Codewords represent connectivity patterns among anatomical regions. We also encode MADs by VLAD and BoW methods using k-means clustering. We classify cognitive tasks using the Human Connectome Project (HCP) task fMRI dataset and cognitive states using the Emotional Memory Retrieval (EMR) dataset. We train support vector machines (SVMs) using the encoded MADs. Results demonstrate that FV encoding of MADs can be successfully employed for classification of cognitive tasks, and that it outperforms VLAD and BoW representations. Moreover, we identify the significant Gaussians in mixture models by computing the energy of their corresponding FV parts, and analyze their effect on classification accuracy. Finally, we suggest a new method to visualize the codewords of the learned brain connectivity dictionary.
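To make the two simpler encodings concrete, the sketch below applies BoW and VLAD to a handful of descriptors given a fixed codebook. In the abstract the codebook comes from k-means on mesh arc descriptors; here the codebook and descriptors are hypothetical 2-D values, and the normalization steps commonly applied to VLAD are omitted.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda j: math.dist(descriptor, codebook[j]))


def bow_encode(descriptors, codebook):
    """Bag-of-words: normalized histogram of nearest-codeword assignments."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def vlad_encode(descriptors, codebook):
    """VLAD: per-codeword sums of residuals (descriptor minus codeword), concatenated."""
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        k = nearest_codeword(d, codebook)
        for t in range(dim):
            residuals[k][t] += d[t] - codebook[k][t]
    return [v for r in residuals for v in r]


# Hypothetical codebook (two codewords) and four local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.0, 0.2], [0.9, 1.0], [1.2, 1.1]]

bow = bow_encode(descriptors, codebook)
vlad = vlad_encode(descriptors, codebook)
```

BoW keeps only how many descriptors fall near each codeword, while VLAD also keeps where they fall relative to it, so the VLAD vector is longer by a factor of the descriptor dimension.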
Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions.

PubMed

Aksu, Yaman; Miller, David J; Kesidis, George; Yang, Qing X

2010-05-01

Feature selection for classification in high-dimensional spaces can improve generalization, reduce classifier complexity, and identify important, discriminating feature "markers." For support vector machine (SVM) classification, a widely used technique is recursive feature elimination (RFE). We demonstrate that RFE is not consistent with margin maximization, central to the SVM learning approach. We thus propose explicit margin-based feature elimination (MFE) for SVMs and demonstrate both improved margin and improved generalization, compared with RFE. Moreover, for the case of a nonlinear kernel, we show that RFE assumes that the squared weight vector 2-norm is strictly decreasing as features are eliminated. We demonstrate this is not true for the Gaussian kernel and, consequently, RFE may give poor results in this case. MFE for nonlinear kernels gives better margin and generalization. We also present an extension which achieves further margin gains, by optimizing only two degrees of freedom--the hyperplane's intercept and its squared 2-norm--with the weight vector orientation fixed. We finally introduce an extension that allows margin slackness. We compare against several alternatives, including RFE and a linear programming method that embeds feature selection within the classifier design. On high-dimensional gene microarray data sets, University of California at Irvine (UCI) repository data sets, and Alzheimer's disease brain image data, MFE methods give promising results.
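The contrast between the two elimination criteria can be sketched for a linear discriminant as follows. This is a simplified reading, not the authors' algorithm: the weights are hypothetical rather than trained, nothing is refit after elimination, and "margin" is taken to be the smallest signed distance of any training point to the hyperplane once a feature is dropped.

```python
import math

def margin_without(w, b, X, y, drop):
    """Smallest signed distance to the hyperplane after removing one feature."""
    keep = [j for j in range(len(w)) if j != drop]
    norm = math.sqrt(sum(w[j] ** 2 for j in keep))
    return min(
        yi * (sum(w[j] * xi[j] for j in keep) + b) / norm
        for xi, yi in zip(X, y)
    )

def rfe_choice(w):
    """RFE criterion: eliminate the feature with the smallest squared weight."""
    return min(range(len(w)), key=lambda j: w[j] ** 2)

def mfe_choice(w, b, X, y):
    """Margin-based criterion: eliminate the feature whose removal leaves the largest margin."""
    return max(range(len(w)), key=lambda j: margin_without(w, b, X, y, j))

# Hypothetical linear classifier and four labelled points (illustrative only).
w, b = [2.0, 1.0, 0.5], 0.0
X = [[0.1, 0.0, 4.0], [1.0, 1.0, 0.0], [-1.0, -1.0, 0.0], [-0.1, 0.0, -4.0]]
y = [1, 1, -1, -1]

print(rfe_choice(w), mfe_choice(w, b, X, y))
```

In this toy case the feature with the smallest weight carries large values for two of the points, so removing it nearly collapses the margin; the margin-based criterion removes a different feature.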
Detection of anomaly in human retina using Laplacian Eigenmaps and vectorized matched filtering

NASA Astrophysics Data System (ADS)

Yacoubou Djima, Karamatou A.; Simonelli, Lucia D.; Cunningham, Denise; Czaja, Wojciech

2015-03-01

We present a novel method for automated anomaly detection on autofluorescent data provided by the National Institutes of Health (NIH). This is motivated by the need for new tools to improve the capability of diagnosing macular degeneration in its early stages, track the progression over time, and test the effectiveness of new treatment methods. In previous work, macular anomalies have been detected automatically through multiscale analysis procedures such as wavelet analysis or dimensionality reduction algorithms followed by a classification algorithm, e.g., Support Vector Machine. The method that we propose is a Vectorized Matched Filtering (VMF) algorithm combined with Laplacian Eigenmaps (LE), a nonlinear dimensionality reduction algorithm with locality preserving properties. By applying LE, we are able to represent the data in the form of eigenimages, some of which accentuate the visibility of anomalies. We pick significant eigenimages and proceed with the VMF algorithm that classifies anomalies across all of these eigenimages simultaneously. To evaluate our performance, we compare our method to two other schemes: a matched filtering algorithm based on anomaly detection on single images and a combination of PCA and VMF. LE combined with the VMF algorithm performs best, yielding a high rate of accurate anomaly detection. This shows the advantage of using a nonlinear approach to represent the data and the effectiveness of VMF, which operates on the images as a data cube rather than individual images.
Ship Detection in Optical Satellite Image Based on RX Method and PCAnet

NASA Astrophysics Data System (ADS)

Shao, Xiu; Li, Huali; Lin, Hui; Kang, Xudong; Lu, Ting

2017-12-01

In this paper, we present a novel method for ship detection in optical satellite images based on the Reed-Xiaoli (RX) method and the principal component analysis network (PCAnet). The proposed method consists of the following three steps. First, the spatially adjacent pixels in the optical image are arranged into a vector, transforming the optical image into a 3D cube image. By taking this process, the contextual information of the spatially adjacent pixels can be integrated to magnify the discrimination between ship and background. Second, the RX anomaly detection method is adopted to preliminarily extract ship candidates from the produced 3D cube image. Finally, real ships are further confirmed among ship candidates by applying the PCAnet and the support vector machine (SVM). Specifically, the PCAnet is a simple deep learning network which is exploited to perform feature extraction, and the SVM is applied to achieve feature pooling and decision making. Experimental results demonstrate that our approach is effective in discriminating between ships and false alarms, and has a good ship detection performance.
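The RX step can be sketched as a squared Mahalanobis distance of each pixel vector from the background statistics. The version below is a minimal sketch under stated assumptions: each pixel is a hypothetical two-value vector (the paper builds longer vectors from neighbouring pixels), and the mean and covariance are estimated globally from the whole scene rather than from a local window.

```python
def rx_scores(pixels):
    """Squared Mahalanobis distance of each 2-value pixel vector from the scene mean."""
    n = len(pixels)
    mean_a = sum(p[0] for p in pixels) / n
    mean_b = sum(p[1] for p in pixels) / n
    # Entries of the 2x2 sample covariance matrix.
    saa = sum((p[0] - mean_a) ** 2 for p in pixels) / n
    sbb = sum((p[1] - mean_b) ** 2 for p in pixels) / n
    sab = sum((p[0] - mean_a) * (p[1] - mean_b) for p in pixels) / n
    det = saa * sbb - sab * sab
    scores = []
    for a, b in pixels:
        da, db = a - mean_a, b - mean_b
        # d^T C^{-1} d written out for the 2x2 case.
        scores.append((sbb * da * da - 2 * sab * da * db + saa * db * db) / det)
    return scores

# Hypothetical scene: six background pixels and one anomalous pixel (last entry).
pixels = [(1.0, 1.1), (1.1, 0.9), (0.9, 1.0), (1.0, 1.0),
          (1.1, 1.1), (0.9, 0.9), (3.0, 0.2)]

scores = rx_scores(pixels)
candidate = scores.index(max(scores))  # index of the most anomalous pixel
print(candidate)
```

Pixels whose score exceeds a chosen threshold would be kept as candidates for the later confirmation stage.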
Fault diagnosis method based on FFT-RPCA-SVM for Cascaded-Multilevel Inverter.

PubMed

Wang, Tianzhen; Qi, Jie; Xu, Hao; Wang, Yide; Liu, Lei; Gao, Diju

2016-01-01

Thanks to reduced switch stress, high quality of load wave, easy packaging and good extensibility, the cascaded H-bridge multilevel inverter is widely used in wind power systems. To guarantee stable operation of the system, a new fault diagnosis method, based on Fast Fourier Transform (FFT), Relative Principal Component Analysis (RPCA) and Support Vector Machine (SVM), is proposed for the H-bridge multilevel inverter. To avoid the influence of load variation on fault diagnosis, the output voltages of the inverter are chosen as the fault characteristic signals. To shorten the time of diagnosis and improve the diagnostic accuracy, the main features of the fault characteristic signals are extracted by FFT. To further reduce the training time of SVM, the feature vector is reduced based on RPCA, which yields a lower dimensional feature space. The fault classifier is constructed via SVM. An experimental prototype of the inverter is built to test the proposed method. Compared to other fault diagnosis methods, the experimental results demonstrate the high accuracy and efficiency of the proposed method. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
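The frequency-domain feature extraction step can be illustrated with a plain discrete Fourier transform; a production implementation would use an FFT routine, which gives the same values faster. The signal below is hypothetical (a 50 Hz fundamental with a weaker third harmonic), and the sampling rate and window length are assumptions chosen so that each bin is 50 Hz wide. The RPCA reduction and the SVM classifier are not shown.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Normalised magnitudes of DFT bins 0 .. n/2 of a real signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n // 2 + 1)
    ]

# Hypothetical output-voltage window: 32 samples at 1600 Hz (bin width 50 Hz).
sample_rate = 1600.0
n_samples = 32
signal = [
    math.sin(2 * math.pi * 50.0 * t / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 150.0 * t / sample_rate)
    for t in range(n_samples)
]

features = dft_magnitudes(signal)
print(features[1], features[3])  # bins for 50 Hz and 150 Hz
```

A fault that distorts the voltage waveform changes the relative size of these bins, which is what makes them usable as a feature vector.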

Using artificial intelligence strategies for process-related automated inspection in the production environment

NASA Astrophysics Data System (ADS)

Anding, K.; Kuritcyn, P.; Garten, D.

2016-11-01

In this paper a new method for the automatic visual inspection of metallic surfaces is proposed by using Convolutional Neural Networks (CNN). The different combinations of network parameters were developed and tested. The obtained results of CNN were analysed and compared with the results of our previous investigations with color and texture features as input parameters for a Support Vector Machine. Advantages and disadvantages of the different classifying methods are explained.
Applying integrals of motion to the numerical solution of differential equations

NASA Technical Reports Server (NTRS)

Vezewski, D. J.

1980-01-01

A method is developed for using the integrals of systems of nonlinear, ordinary, differential equations in a numerical integration process to control the local errors in these integrals and reduce the global errors of the solution. The method is general and can be applied to either scalar or vector integrals. A number of example problems, with accompanying numerical results, are used to verify the analysis and support the conjecture of global error reduction.
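One simple way to see the idea is to integrate a harmonic oscillator (x' = v, v' = -x), whose energy (x^2 + v^2)/2 is an integral of the motion, and to rescale the state after every step so that the energy stays at its initial value. This projection is only an illustrative stand-in and is not claimed to be the report's method; the step size and step count are arbitrary choices.

```python
import math

def energy(x, v):
    """Integral of the motion for x' = v, v' = -x."""
    return 0.5 * (x * x + v * v)

def euler(x, v, h, steps, use_integral):
    """Explicit Euler, optionally rescaling the state to conserve the energy."""
    e0 = energy(x, v)
    for _ in range(steps):
        x, v = x + h * v, v - h * x
        if use_integral:
            scale = math.sqrt(e0 / energy(x, v))
            x, v = x * scale, v * scale
    return x, v

def error(state, t):
    """Distance from the exact solution x = cos t, v = -sin t."""
    return math.hypot(state[0] - math.cos(t), state[1] + math.sin(t))

h, steps = 0.01, 628          # roughly one period
plain = euler(1.0, 0.0, h, steps, use_integral=False)
corrected = euler(1.0, 0.0, h, steps, use_integral=True)

error_plain = error(plain, h * steps)
error_corrected = error(corrected, h * steps)
print(error_plain, error_corrected)
```

Plain Euler lets the energy grow at every step, while the corrected run keeps it fixed; in this example the global error of the corrected run is also smaller, in line with the conjecture stated in the abstract.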
Research in computer science

NASA Technical Reports Server (NTRS)

Ortega, J. M.

1985-01-01

Synopses are given for NASA supported work in computer science at the University of Virginia. Some areas of research include: error seeding as a testing method; knowledge representation for engineering design; analysis of faults in a multi-version software experiment; implementation of a parallel programming environment; two computer graphics systems for visualization of pressure distribution and convective density particles; task decomposition for multiple robot arms; vectorized incomplete conjugate gradient; and iterative methods for solving linear equations on the Flex/32.
Applying integrals of motion to the numerical solution of differential equations

NASA Technical Reports Server (NTRS)

Jezewski, D. J.

1979-01-01

A method is developed for using the integrals of systems of nonlinear, ordinary differential equations in a numerical integration process to control the local errors in these integrals and reduce the global errors of the solution. The method is general and can be applied to either scalar or vector integrals. A number of example problems, with accompanying numerical results, are used to verify the analysis and support the conjecture of global error reduction.
Semantic classification of business images

NASA Astrophysics Data System (ADS)

Erol, Berna; Hull, Jonathan J.

2006-01-01

Digital cameras are becoming increasingly common for capturing information in business settings. In this paper, we describe a novel method for classifying images into the following semantic classes: document, whiteboard, business card, slide, and regular images. Our method is based on combining low-level image features, such as text color, layout, and handwriting features with high-level OCR output analysis. Several Support Vector Machine Classifiers are combined for multi-class classification of input images. The system yields 95% accuracy in classification.
Using distances between Top-n-gram and residue pairs for protein remote homology detection.

PubMed

Liu, Bin; Xu, Jinghao; Zou, Quan; Xu, Ruifeng; Wang, Xiaolong; Chen, Qingcai

2014-01-01

Protein remote homology detection is one of the central problems in bioinformatics, which is important for both basic research and practical application. Currently, discriminative methods based on Support Vector Machines (SVMs) achieve the state-of-the-art performance. Exploring feature vectors incorporating the position information of amino acids or other protein building blocks is a key step to improve the performance of the SVM-based methods. Two new methods for protein remote homology detection were proposed, called SVM-DR and SVM-DT. SVM-DR is a sequence-based method, in which the feature vector representation for protein is based on the distances between residue pairs. SVM-DT is a profile-based method, which considers the distances between Top-n-gram pairs. Top-n-gram can be viewed as a profile-based building block of proteins, which is calculated from the frequency profiles. These two methods are position-dependent approaches incorporating the sequence-order information of protein sequences. Various experiments were conducted on a benchmark dataset containing 54 families and 23 superfamilies. Experimental results showed that these two new methods are very promising. Compared with the position-independent methods, the performance improvement is obvious. Furthermore, the proposed methods can also provide useful insights for studying the features of protein families. The better performance of the proposed methods demonstrates that the position-dependent approaches are efficient for protein remote homology detection. Another advantage of our methods arises from the explicit feature space representation, which can be used to analyze the characteristic features of protein families. The source code of SVM-DT and SVM-DR is available at http://bioinformatics.hitsz.edu.cn/DistanceSVM/index.jsp.
A support vector machine-based method to identify mild cognitive impairment with multi-level characteristics of magnetic resonance imaging.

PubMed

Long, Zhuqing; Jing, Bin; Yan, Huagang; Dong, Jianxin; Liu, Han; Mo, Xiao; Han, Ying; Li, Haiyun

2016-09-07

Mild cognitive impairment (MCI) represents a transitional state between normal aging and Alzheimer's disease (AD). Non-invasive diagnostic methods are desirable to identify MCI for early therapeutic interventions. In this study, we proposed a support vector machine (SVM)-based method to discriminate between MCI patients and normal controls (NCs) using multi-level characteristics of magnetic resonance imaging (MRI). This method adopted a radial basis function (RBF) as the kernel function, and a grid search method to optimize the two parameters of SVM. The calculated characteristics, i.e., the Hurst exponent (HE), amplitude of low-frequency fluctuations (ALFF), regional homogeneity (ReHo) and gray matter density (GMD), were adopted as the classification features. A leave-one-out cross-validation (LOOCV) was used to evaluate the classification performance of the method. Applying the proposed method to the experimental data from 29 MCI patients and 33 healthy subjects, we achieved a classification accuracy of up to 96.77%, with a sensitivity of 93.10% and a specificity of 100%, and the area under the curve (AUC) yielded up to 0.97. Furthermore, the most discriminative features for classification were found to predominantly involve default-mode regions, such as hippocampus (HIP), parahippocampal gyrus (PHG), posterior cingulate gyrus (PCG) and middle frontal gyrus (MFG), and subcortical regions such as lentiform nucleus (LN) and amygdala (AMYG). Therefore, our method is promising in distinguishing MCI patients from NCs and may be useful for the diagnosis of MCI. Copyright © 2016 IBRO. Published by Elsevier Ltd. All rights reserved.
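The reported percentages are consistent with a simple confusion matrix, as the worked check below shows. The individual counts (27 of 29 patients and 33 of 33 controls classified correctly) are inferred from the percentages and are not stated in the abstract.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Inferred counts: 29 MCI patients (positives), 33 normal controls (negatives).
m = classification_metrics(tp=27, fn=2, tn=33, fp=0)
percent = {name: round(100 * value, 2) for name, value in m.items()}
print(percent)
```

With leave-one-out cross-validation each of the 62 subjects is tested exactly once, so these counts add up to the full sample.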
Evaluating the High Risk Groups for Suicide: A Comparison of Logistic Regression, Support Vector Machine, Decision Tree and Artificial Neural Network

PubMed Central

AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad

2016-01-01

Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predicted value, negative predicted value, accuracy and the area under curve. Cochran-Q test was applied to check differences in proportion among methods. To assess the association between the observed and predicted values, Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts in common for four methods. SVM method showed the highest accuracy, 0.68 and 0.67 for training and testing sample, respectively. However, this method resulted in the highest specificity (0.67 for training and 0.68 for testing sample) and the highest sensitivity for training sample (0.85), but the lowest sensitivity for the testing sample (0.53). Cochran-Q test resulted in differences between proportions in different methods (P<0.001). The association of SVM predictions and observed values, Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance to classify fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
An improved method for identification of small non-coding RNAs in bacteria using support vector machine

NASA Astrophysics Data System (ADS)

Barman, Ranjan Kumar; Mukhopadhyay, Anirban; Das, Santasabuj

2017-04-01

Bacterial small non-coding RNAs (sRNAs) are not translated into proteins, but act as functional RNAs. They are involved in diverse biological processes like virulence, stress response and quorum sensing. Several high-throughput techniques have enabled identification of sRNAs in bacteria, but experimental detection remains a challenge and grossly incomplete for most species. Thus, there is a need to develop computational tools to predict bacterial sRNAs. Here, we propose a computational method to identify sRNAs in bacteria using a support vector machine (SVM) classifier. The primary sequence and secondary structure features of experimentally-validated sRNAs of Salmonella Typhimurium LT2 (SLT2) were used to build the optimal SVM model. We found that a tri-nucleotide composition feature of sRNAs achieved an accuracy of 88.35% for SLT2. We also validated the SVM model on the experimentally-detected sRNAs of E. coli and Salmonella Typhi. The proposed model robustly attained an accuracy of 81.25% and 88.82% for E. coli K-12 and S. Typhi Ty2, respectively. We confirmed that this method significantly improved the identification of sRNAs in bacteria. Furthermore, we used a sliding window-based method and identified sRNAs from complete genomes of SLT2, S. Typhi Ty2 and E. coli K-12 with sensitivities of 89.09%, 83.33% and 67.39%, respectively.
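A tri-nucleotide composition feature of the kind mentioned here can be sketched as the relative frequency of each of the 64 possible 3-mers in a sequence. The sequence below is made up, overlapping windows are an assumption (the abstract does not say whether windows overlap), and the secondary structure features are not shown.

```python
from itertools import product

# All 64 tri-nucleotides over the RNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGU", repeat=3)]

def trinucleotide_composition(seq):
    """Relative frequency of each tri-nucleotide, using overlapping windows."""
    seq = seq.upper().replace("T", "U")   # accept DNA-style input as well
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:                # skip windows with ambiguous bases
            counts[kmer] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in KMERS]

# Hypothetical short sequence (illustrative only).
features = trinucleotide_composition("AUGAUG")
print(features[KMERS.index("AUG")])
```

The resulting 64-value vector has a fixed length regardless of sequence length, which is what allows sRNAs of different sizes to be fed to the same classifier.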
Support vector machines for nuclear reactor state estimation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zavaljevski, N.; Gross, K. C.

2000-02-14

Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMs) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.
Identifying N6-methyladenosine sites using multi-interval nucleotide pair position specificity and support vector machine

NASA Astrophysics Data System (ADS)

Xing, Pengwei; Su, Ran; Guo, Fei; Wei, Leyi

2017-04-01

N6-methyladenosine (m6A) refers to methylation of the adenosine nucleotide acid at the nitrogen-6 position. It plays an important role in a series of biological processes, such as splicing events, mRNA exporting, nascent mRNA synthesis, nuclear translocation and translation process. Numerous experiments have been done to successfully characterize m6A sites within sequences since high-resolution mapping of m6A sites was established. However, with the explosive growth of genomic sequences, using experimental methods to identify m6A sites is time-consuming and expensive. Thus, it is highly desirable to develop fast and accurate computational identification methods. In this study, we propose a sequence-based predictor called RAM-NPPS for identifying m6A sites within RNA sequences, in which we present a novel feature representation algorithm based on multi-interval nucleotide pair position specificity, and use a support vector machine classifier to construct the prediction model. Comparison results show that our proposed method outperforms the state-of-the-art predictors on three benchmark datasets across the three species, indicating the effectiveness and robustness of our method. Moreover, an online webserver implementing the proposed predictor has been established at http://server.malab.cn/RAM-NPPS/. It is anticipated to be a useful prediction tool to assist biologists in revealing the mechanisms of m6A site functions.
A communication-avoiding, hybrid-parallel, rank-revealing orthogonalization method.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoemmen, Mark

2010-11-01

Orthogonalization consumes much of the run time of many iterative methods for solving sparse linear systems and eigenvalue problems. Commonly used algorithms, such as variants of Gram-Schmidt or Householder QR, have performance dominated by communication. Here, 'communication' includes both data movement between the CPU and memory, and messages between processors in parallel. Our Tall Skinny QR (TSQR) family of algorithms requires asymptotically fewer messages between processors and data movement between CPU and memory than typical orthogonalization methods, yet achieves the same accuracy as Householder QR factorization. Furthermore, in block orthogonalizations, TSQR is faster and more accurate than existing approaches for orthogonalizing the vectors within each block ('normalization'). TSQR's rank-revealing capability also makes it useful for detecting deflation in block iterative methods, for which existing approaches sacrifice performance, accuracy, or both. We have implemented a version of TSQR that exploits both distributed-memory and shared-memory parallelism, and supports real and complex arithmetic. Our implementation is optimized for the case of orthogonalizing a small number (5-20) of very long vectors. The shared-memory parallel component uses Intel's Threading Building Blocks, though its modular design supports other shared-memory programming models as well, including computation on the GPU. Our implementation achieves speedups of 2 times or more over competing orthogonalizations. It is available now in the development branch of the Trilinos software package, and will be included in the 10.8 release.
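The reduction structure behind TSQR can be sketched in a few lines: factor each row block independently, stack the small R factors, and factor the stack once more. The sketch below is serial, computes only R, and uses modified Gram-Schmidt for the local factorizations purely for brevity; the implementation described in the abstract is parallel and Householder-based, so this illustrates the data flow rather than that code. The matrix and block size are hypothetical, and every block is assumed to have full column rank.

```python
import math

def r_factor(rows):
    """Upper-triangular R of a thin QR, via modified Gram-Schmidt (full rank assumed)."""
    m, n = len(rows), len(rows[0])
    cols = [[rows[i][j] for i in range(m)] for j in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = math.sqrt(sum(v * v for v in cols[j]))
        q = [v / r[j][j] for v in cols[j]]
        for k in range(j + 1, n):
            r[j][k] = sum(a * b for a, b in zip(q, cols[k]))
            cols[k] = [b - r[j][k] * a for a, b in zip(q, cols[k])]
    return r

def tsqr_r(rows, block_rows):
    """R factor of a tall matrix from one reduction over row blocks."""
    blocks = [rows[i:i + block_rows] for i in range(0, len(rows), block_rows)]
    # Each block could be factored on a different processor; only the
    # small R factors need to be communicated and combined.
    stacked = [row for block in blocks for row in r_factor(block)]
    return r_factor(stacked)

# Hypothetical 8 x 2 tall-skinny matrix, split into two 4-row blocks.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0],
     [2.0, 1.0], [0.0, 1.0], [1.0, 0.0], [3.0, 5.0]]

R_direct = r_factor(A)
R_tsqr = tsqr_r(A, block_rows=4)
print(R_direct, R_tsqr)
```

Both routes give the same R because the stacked R factors have the same Gram matrix as A; only n-by-n triangles move between blocks, which is the source of the communication savings.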
Identifying saltcedar with hyperspectral data and support vector machines

USDA-ARS's Scientific Manuscript database

Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...
A Comprehensive Optimization Strategy for Real-time Spatial Feature Sharing and Visual Analytics in Cyberinfrastructure

NASA Astrophysics Data System (ADS)

Li, W.; Shao, H.

2017-12-01

For geospatial cyberinfrastructure enabled web services, the ability of rapidly transmitting and sharing spatial data over the Internet plays a critical role to meet the demands of real-time change detection, response and decision-making. Especially for the vector datasets which serve as irreplaceable and concrete material in data-driven geospatial applications, their rich geometry and property information facilitates the development of interactive, efficient and intelligent data analysis and visualization applications. However, the big-data issues of vector datasets have hindered their wide adoption in web services. In this research, we propose a comprehensive optimization strategy to enhance the performance of vector data transmitting and processing. This strategy combines: 1) pre- and on-the-fly generalization, which automatically determines proper simplification level through the introduction of appropriate distance tolerance (ADT) to meet various visualization requirements, and at the same time speed up simplification efficiency; 2) a progressive attribute transmission method to reduce data size and therefore the service response time; 3) compressed data transmission and dynamic adoption of a compression method to maximize the service efficiency under different computing and network environments. A cyberinfrastructure web portal was developed for implementing the proposed technologies. After applying our optimization strategies, substantial performance enhancement is achieved. We expect this work to widen the use of web service providing vector data to support real-time spatial feature sharing, visual analytics and decision-making.
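Tolerance-driven simplification of a vector geometry can be illustrated with the Douglas-Peucker algorithm. The abstract does not name the simplification algorithm it uses, so this is a stand-in chosen for illustration; the polyline is hypothetical, and the tolerance is passed in by hand rather than derived automatically as the appropriate distance tolerance (ADT) would be.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices closer than tolerance to the simplified line."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = max(
        ((i, point_segment_distance(points[i], a, b))
         for i in range(1, len(points) - 1)),
        key=lambda pair: pair[1],
    )
    if dmax <= tolerance:
        return [a, b]
    left = simplify(points[:idx + 1], tolerance)
    right = simplify(points[idx:], tolerance)
    return left[:-1] + right   # avoid repeating the split vertex

# Hypothetical polyline (illustrative coordinates only).
line = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 0.1), (4.0, 0.0)]

coarse = simplify(line, tolerance=0.5)
fine = simplify(line, tolerance=0.05)
print(len(coarse), len(fine))
```

A larger tolerance yields fewer vertices and a smaller payload, which is the trade-off a tolerance chosen per zoom level is meant to manage.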
Mining protein function from text using term-based support vector machines

PubMed Central

Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J

2005-01-01

Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835
Light bullets in coupled nonlinear Schrödinger equations with variable coefficients and a trapping potential.

PubMed

Xu, Si-Liu; Zhao, Guo-Peng; Belić, Milivoj R; He, Jun-Rong; Xue, Li

2017-04-17

We analyze three-dimensional (3D) vector solitary waves in a system of coupled nonlinear Schrödinger equations with spatially modulated diffraction and nonlinearity, under action of a composite self-consistent trapping potential. Exact vector solitary waves, or light bullets (LBs), are found using the self-similarity method. The stability of vortex 3D LB pairs is examined by direct numerical simulations; the results show that only low-order vortex soliton pairs with the mode parameter values n ≤ 1, l ≤ 1 and m = 0 can be supported by the spatially modulated interaction in the composite trap. Higher-order LBs are found unstable over prolonged distances.
Complexity of Kronecker Operations on Sparse Matrices with Applications to the Solution of Markov Models

NASA Technical Reports Server (NTRS)

Buchholz, Peter; Ciardo, Gianfranco; Donatelli, Susanna; Kemper, Peter

1997-01-01

We present a systematic discussion of algorithms to multiply a vector by a matrix expressed as the Kronecker product of sparse matrices, extending previous work in a unified notational framework. Then, we use our results to define new algorithms for the solution of large structured Markov models. In addition to a comprehensive overview of existing approaches, we give new results with respect to: (1) managing certain types of state-dependent behavior without incurring extra cost; (2) supporting both Jacobi-style and Gauss-Seidel-style methods by appropriate multiplication algorithms; (3) speeding up algorithms that consider probability vectors of size equal to the "actual" state space instead of the "potential" state space.
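The basic operation discussed here, multiplying a row vector by A ⊗ B without forming the Kronecker product, can be sketched as two small multiplications: reshape x into a matrix X, then compute A^T (X B). The matrices below are hypothetical dense lists with some zero entries that are skipped; the paper's algorithms work on sparse storage and on products of more than two factors, which this sketch does not cover.

```python
def kron(a, b):
    """Explicit Kronecker product, used here only to check the result."""
    return [[a_val * b_val for a_val in a_row for b_val in b_row]
            for a_row in a for b_row in b]

def vec_times_kron(x, a, b):
    """Compute x (A kron B) without building the Kronecker product."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # Step 1: T = X B, where X[i1][i2] = x[i1 * rows_b + i2].
    t = [[0] * cols_b for _ in range(rows_a)]
    for i1 in range(rows_a):
        for i2 in range(rows_b):
            xv = x[i1 * rows_b + i2]
            if xv == 0:
                continue
            for j2 in range(cols_b):
                if b[i2][j2] != 0:
                    t[i1][j2] += xv * b[i2][j2]
    # Step 2: Y = A^T T, flattened row by row.
    y = [0] * (cols_a * cols_b)
    for i1 in range(rows_a):
        for j1 in range(cols_a):
            if a[i1][j1] == 0:
                continue
            for j2 in range(cols_b):
                y[j1 * cols_b + j2] += a[i1][j1] * t[i1][j2]
    return y

# Hypothetical small matrices and vector (illustrative only).
A = [[1, 0], [2, 3]]
B = [[0, 1, 0], [4, 0, 5], [0, 6, 0]]
x = [1, 2, 3, 4, 5, 6]

full = kron(A, B)
expected = [sum(x[i] * full[i][j] for i in range(len(x))) for j in range(len(full[0]))]
result = vec_times_kron(x, A, B)
print(result)
```

The explicit product has (rows_a * rows_b) x (cols_a * cols_b) entries, whereas the two-step form only ever touches the factors, which is why it scales to large structured Markov models.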
Mechanism of interlayer exchange in magnetic multilayers

NASA Astrophysics Data System (ADS)

Slonczewski, J. C.

1993-09-01

The spin-current method is used to calculate the oscillatory exchange energy that couples two semi-infinite ferromagnets with exchange-split parabolic bands which are joined by a nonmagnetic metallic spacer. A closed asymptotic formula extends the previous RKKY-type formula to the case in which the ferromagnets and spacer have different Fermi vectors. The predicted amplitude of oscillatory coupling increases steeply with Fermi vector or electron density in the spacer, as do the experimental trends reported by Parkin. Numerical computations relevant to iron support this closed formula and show that the amplitude of the biquadratic (J2 cos²θ) and higher-order corrections to the conventional -J1 cos θ form of energy is less than 2%.
Large-scale production of lentiviral vector in a closed system hollow fiber bioreactor

PubMed Central

Sheu, Jonathan; Beltzer, Jim; Fury, Brian; Wilczek, Katarzyna; Tobin, Steve; Falconer, Danny; Nolta, Jan; Bauer, Gerhard

2015-01-01

Lentiviral vectors are widely used in the field of gene therapy as an effective method for permanent gene delivery. While current methods of producing small scale vector batches for research purposes depend largely on culture flasks, the emergence and popularity of lentiviral vectors in translational, preclinical and clinical research has demanded their production on a much larger scale, a task that can be difficult to manage with the numbers of producer cell culture flasks required for large volumes of vector. To generate a large scale, partially closed system method for the manufacturing of clinical grade lentiviral vector suitable for the generation of induced pluripotent stem cells (iPSCs), we developed a method employing a hollow fiber bioreactor traditionally used for cell expansion. We have demonstrated the growth, transfection, and vector-producing capability of 293T producer cells in this system. Vector particle RNA titers after subsequent vector concentration yielded values comparable to lentiviral iPSC induction vector batches produced using traditional culture methods in 225 cm² flasks (T225s) and in 10-layer cell factories (CF10s), while yielding a volume nearly 145 times larger than the yield from a T225 flask and nearly three times larger than the yield from a CF10. Employing a closed system hollow fiber bioreactor for vector production offers the possibility of manufacturing large quantities of gene therapy vector while minimizing reagent usage, equipment footprint, and open system manipulation. PMID:26151065
The application of continuous wavelet transform and least squares support vector machine for the simultaneous quantitative spectrophotometric determination of Myricetin, Kaempferol and Quercetin as flavonoids in pharmaceutical plants

NASA Astrophysics Data System (ADS)

Sohrabi, Mahmoud Reza; Darabi, Golnaz

2016-01-01

Flavonoids are γ-benzopyrone derivatives, which are highly regarded by researchers for their antioxidant properties. In this study, two new signal processing methods have been coupled with UV spectroscopy for spectral resolution and simultaneous quantitative determination of Myricetin, Kaempferol and Quercetin as flavonoids in Laurel, St. John's Wort and Green Tea without the need for any previous separation procedure. The developed methods are continuous wavelet transform (CWT) and least squares support vector machine (LS-SVM) methods, each integrated with UV spectroscopy individually. Different wavelet families were tested by the CWT method, and finally the Daubechies wavelet family (Db4) for Myricetin and the Gaussian wavelet families for Kaempferol (Gaus3) and Quercetin (Gaus7) were selected and applied for simultaneous analysis under the optimal conditions. The LS-SVM was applied to build the flavonoids prediction model based on absorption spectra. The root mean square errors for prediction (RMSEP) of Myricetin, Kaempferol and Quercetin were 0.0552, 0.0275 and 0.0374, respectively. The developed methods were validated by the analysis of various synthetic mixtures with well-known flavonoid contents. Mean recovery values of Myricetin, Kaempferol and Quercetin in the CWT method were 100.123, 100.253 and 100.439, and in the LS-SVM method were 99.94, 99.81 and 99.682, respectively. The results achieved by analyzing the real samples with the CWT and LS-SVM methods were compared to the HPLC reference method and were very close to it. Meanwhile, the results of the one-way ANOVA (analysis of variance) test revealed that there was no significant difference between the suggested methods.

A new automated assessment method for contrast-detail images by applying support vector machine and its robustness to nonlinear image processing.

PubMed

Takei, Takaaki; Ikeda, Mitsuru; Imai, Kuniharu; Yamauchi-Kawaura, Chiyo; Kato, Katsuhiko; Isoda, Haruo

2013-09-01

The automated contrast-detail (C-D) analysis methods developed so far cannot be expected to work well on images processed with nonlinear methods, such as noise reduction methods. Therefore, we have devised a new automated C-D analysis method by applying a support vector machine (SVM), and tested its robustness to nonlinear image processing. We acquired CDRAD (a commercially available C-D test object) images at a tube voltage of 120 kV and a milliampere-second product (mAs) of 0.5-5.0. A partial diffusion equation based technique was used as the noise reduction method. Three radiologists and three university students participated in the observer performance study. The training data for our SVM method were the classification data scored by one radiologist for the CDRAD images acquired at 1.6 and 3.2 mAs and their noise-reduced images. We also compared the performance of our SVM method with the CDRAD Analyser algorithm. The mean C-D diagrams (that is, plots of the mean of the smallest visible hole diameter vs. hole depth) obtained from our SVM method agreed well with the ones averaged across the six human observers for both original and noise-reduced CDRAD images, whereas the mean C-D diagrams from the CDRAD Analyser algorithm disagreed with the ones from the human observers for both original and noise-reduced CDRAD images. In conclusion, our proposed SVM method for C-D analysis will work well for images processed with the nonlinear noise reduction method as well as for the original radiographic images.
The application of continuous wavelet transform and least squares support vector machine for the simultaneous quantitative spectrophotometric determination of Myricetin, Kaempferol and Quercetin as flavonoids in pharmaceutical plants.

PubMed

Sohrabi, Mahmoud Reza; Darabi, Golnaz

2016-01-05

Flavonoids are γ-benzopyrone derivatives that are highly regarded by researchers for their antioxidant properties. In this study, two new signal processing methods have been coupled with UV spectroscopy for spectral resolution and simultaneous quantitative determination of Myricetin, Kaempferol and Quercetin as flavonoids in Laurel, St. John's Wort and Green Tea without the need for any previous separation procedure. The developed methods are continuous wavelet transform (CWT) and least squares support vector machine (LS-SVM), each integrated with UV spectroscopy individually. Different wavelet families were tested with the CWT method, and finally the Daubechies wavelet family (Db4) for Myricetin and the Gaussian wavelet families for Kaempferol (Gaus3) and Quercetin (Gaus7) were selected and applied for simultaneous analysis under the optimal conditions. The LS-SVM was applied to build the flavonoid prediction model based on absorption spectra. The root mean square errors of prediction (RMSEP) of Myricetin, Kaempferol and Quercetin were 0.0552, 0.0275 and 0.0374, respectively. The developed methods were validated by analyzing various synthetic mixtures with known flavonoid contents. Mean recovery values of Myricetin, Kaempferol and Quercetin were 100.123, 100.253 and 100.439 in the CWT method and 99.94, 99.81 and 99.682 in the LS-SVM method, respectively. The results obtained by analyzing the real samples with the CWT and LS-SVM methods were compared to the HPLC reference method and were very close to it. Meanwhile, the results of the one-way ANOVA (analysis of variance) test revealed that there was no significant difference between the suggested methods. Copyright © 2015 Elsevier B.V. All rights reserved.
Balancing aggregation and smoothing errors in inverse models

DOE PAGES

Turner, A. J.; Jacob, D. J.

2015-06-30

Inverse models use observations of a system (observation vector) to quantify the variables driving that system (state vector) by statistical optimization. When the observation vector is large, such as with satellite data, selecting a suitable dimension for the state vector is a challenge. A state vector that is too large cannot be effectively constrained by the observations, leading to smoothing error. However, reducing the dimension of the state vector leads to aggregation error as prior relationships between state vector elements are imposed rather than optimized. Here we present a method for quantifying aggregation and smoothing errors as a function of state vector dimension, so that a suitable dimension can be selected by minimizing the combined error. Reducing the state vector within the aggregation error constraints can have the added advantage of enabling analytical solution to the inverse problem with full error characterization. We compare three methods for reducing the dimension of the state vector from its native resolution: (1) merging adjacent elements (grid coarsening), (2) clustering with principal component analysis (PCA), and (3) applying a Gaussian mixture model (GMM) with Gaussian pdfs as state vector elements on which the native-resolution state vector elements are projected using radial basis functions (RBFs). The GMM method leads to somewhat lower aggregation error than the other methods, but more importantly it retains resolution of major local features in the state vector while smoothing weak and broad features.
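The first reduction method named in this abstract, merging adjacent elements (grid coarsening), can be illustrated with a one-dimensional toy. This is a loose sketch and not the authors' procedure: the function name `coarsen`, the use of a plain mean as the merge rule, and the sample values are all assumptions made for illustration.

```python
def coarsen(state, factor):
    """Merge each run of `factor` adjacent elements into a single element.

    Assumption: merged elements are represented by their arithmetic mean.
    A trailing run shorter than `factor` is averaged over its own length.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    merged = []
    for start in range(0, len(state), factor):
        block = state[start:start + factor]
        merged.append(sum(block) / len(block))
    return merged


# Hypothetical native-resolution state vector with one sharp local feature.
native = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0]
print(coarsen(native, 2))
print(coarsen(native, 4))
```

Coarser merging shrinks the state vector but smears the local peak into its neighbours, which is the kind of information loss that the abstract describes as aggregation error.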
Balancing aggregation and smoothing errors in inverse models

NASA Astrophysics Data System (ADS)

Turner, A. J.; Jacob, D. J.

2015-01-01

Inverse models use observations of a system (observation vector) to quantify the variables driving that system (state vector) by statistical optimization. When the observation vector is large, such as with satellite data, selecting a suitable dimension for the state vector is a challenge. A state vector that is too large cannot be effectively constrained by the observations, leading to smoothing error. However, reducing the dimension of the state vector leads to aggregation error as prior relationships between state vector elements are imposed rather than optimized. Here we present a method for quantifying aggregation and smoothing errors as a function of state vector dimension, so that a suitable dimension can be selected by minimizing the combined error. Reducing the state vector within the aggregation error constraints can have the added advantage of enabling analytical solution to the inverse problem with full error characterization. We compare three methods for reducing the dimension of the state vector from its native resolution: (1) merging adjacent elements (grid coarsening), (2) clustering with principal component analysis (PCA), and (3) applying a Gaussian mixture model (GMM) with Gaussian pdfs as state vector elements on which the native-resolution state vector elements are projected using radial basis functions (RBFs). The GMM method leads to somewhat lower aggregation error than the other methods, but more importantly it retains resolution of major local features in the state vector while smoothing weak and broad features.
Balancing aggregation and smoothing errors in inverse models

NASA Astrophysics Data System (ADS)

Turner, A. J.; Jacob, D. J.

2015-06-01

Inverse models use observations of a system (observation vector) to quantify the variables driving that system (state vector) by statistical optimization. When the observation vector is large, such as with satellite data, selecting a suitable dimension for the state vector is a challenge. A state vector that is too large cannot be effectively constrained by the observations, leading to smoothing error. However, reducing the dimension of the state vector leads to aggregation error as prior relationships between state vector elements are imposed rather than optimized. Here we present a method for quantifying aggregation and smoothing errors as a function of state vector dimension, so that a suitable dimension can be selected by minimizing the combined error. Reducing the state vector within the aggregation error constraints can have the added advantage of enabling analytical solution to the inverse problem with full error characterization. We compare three methods for reducing the dimension of the state vector from its native resolution: (1) merging adjacent elements (grid coarsening), (2) clustering with principal component analysis (PCA), and (3) applying a Gaussian mixture model (GMM) with Gaussian pdfs as state vector elements on which the native-resolution state vector elements are projected using radial basis functions (RBFs). The GMM method leads to somewhat lower aggregation error than the other methods, but more importantly it retains resolution of major local features in the state vector while smoothing weak and broad features.
Applying spectral unmixing and support vector machine to airborne hyperspectral imagery for detecting giant reed

USDA-ARS's Scientific Manuscript database

This study evaluated linear spectral unmixing (LSU), mixture tuned matched filtering (MTMF) and support vector machine (SVM) techniques for detecting and mapping giant reed (Arundo donax L.), an invasive weed that presents a severe threat to agroecosystems and riparian areas throughout the southern ...
Support vector machines classifiers of physical activities in preschoolers

USDA-ARS's Scientific Manuscript database

The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...
Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

EPA Science Inventory

Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...
Instruction-based clinical eye-tracking study on the visual interpretation of divergence: How do students look at vector field plots?

NASA Astrophysics Data System (ADS)

Klein, P.; Viiri, J.; Mozaffari, S.; Dengel, A.; Kuhn, J.

2018-06-01

Relating mathematical concepts to graphical representations is a challenging task for students. In this paper, we introduce two visual strategies to qualitatively interpret the divergence of graphical vector field representations. One strategy is based on the graphical interpretation of partial derivatives, while the other is based on the flux concept. We test the effectiveness of both strategies in an instruction-based eye-tracking study with N =41 physics majors. We found that students' performance improved when both strategies were introduced (74% correct) instead of only one strategy (64% correct), and students performed best when they were free to choose between the two strategies (88% correct). This finding supports the idea of introducing multiple representations of a physical concept to foster student understanding. Relevant eye-tracking measures demonstrate that both strategies imply different visual processing of the vector field plots, therefore reflecting conceptual differences between the strategies. Advanced analysis methods further reveal significant differences in eye movements between the best and worst performing students. For instance, the best students performed predominantly horizontal and vertical saccades, indicating correct interpretation of partial derivatives. They also focused on smaller regions when they balanced positive and negative flux. This mixed-method research leads to new insights into student visual processing of vector field representations, highlights the advantages and limitations of eye-tracking methodologies in this context, and discusses implications for teaching and for future research. The introduction of saccadic direction analysis expands traditional methods, and shows the potential to discover new insights into student understanding and learning difficulties.
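The partial-derivative strategy mentioned in this abstract rests on the definition div F = ∂Fx/∂x + ∂Fy/∂y for a two-dimensional field. A minimal numerical sketch, assuming central finite differences and hypothetical example fields (none of this comes from the study's materials):

```python
def divergence(fx, fy, x, y, h=1e-5):
    """Approximate div F = dFx/dx + dFy/dy at (x, y) by central differences.

    fx and fy are callables returning the x and y components of the field.
    """
    dfx_dx = (fx(x + h, y) - fx(x - h, y)) / (2 * h)
    dfy_dy = (fy(x, y + h) - fy(x, y - h)) / (2 * h)
    return dfx_dx + dfy_dy


# Radial field F = (x, y): arrows grow along their own direction.
print(divergence(lambda x, y: x, lambda x, y: y, 0.5, -1.0))

# Rotational field F = (-y, x): the components do not change along
# their own directions, so the two partial derivatives vanish.
print(divergence(lambda x, y: -y, lambda x, y: x, 0.5, -1.0))
```

Reading a plot horizontally for the change in the x component and vertically for the change in the y component mirrors the two terms of the sum.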
Thin-layer chromatographic identification of Chinese propolis using chemometric fingerprinting.

PubMed

Tang, Tie-xin; Guo, Wei-yan; Xu, Ye; Zhang, Si-ming; Xu, Xin-jun; Wang, Dong-mei; Zhao, Zhi-min; Zhu, Long-ping; Yang, De-po

2014-01-01

Poplar tree gum has a similar chemical composition and appearance to Chinese propolis (bee glue) and has been widely used as a counterfeit propolis because Chinese propolis is typically the poplar-type propolis, the chemical composition of which is determined mainly by the resin of poplar trees. The discrimination of Chinese propolis from poplar tree gum is a challenging task. The objective was to develop a rapid thin-layer chromatographic (TLC) identification method using chemometric fingerprinting to discriminate Chinese propolis from poplar tree gum. A new TLC method using a combination of ammonia and hydrogen peroxide vapours as the visualisation reagent was developed to characterise the chemical profile of Chinese propolis. Three separate people performed TLC on eight Chinese propolis samples and three poplar tree gum samples of varying origins. Five chemometric methods, including similarity analysis, hierarchical clustering, k-means clustering, neural network and support vector machine, were compared for use in classifying the samples based on their densitograms obtained from the TLC chromatograms via image analysis. Hierarchical clustering, neural network and support vector machine analyses achieved a correct classification rate of 100% in classifying the samples. A strategy for TLC identification of Chinese propolis using chemometric fingerprinting was proposed and it provided accurate sample classification. The study has shown that the TLC identification method using chemometric fingerprinting is a rapid, low-cost method for the discrimination of Chinese propolis from poplar tree gum and may be used for the quality control of Chinese propolis. Copyright © 2014 John Wiley & Sons, Ltd.
A data-driven approach of load monitoring on laminated composite plates using support vector machine

NASA Astrophysics Data System (ADS)

Gwon, Y. S.; Fekrmandi, H.

2018-03-01

In this study, the surface response to excitation (SuRE) method is investigated using a data-driven method for load monitoring on a laminated composite plate structure. The SuRE method is an emerging approach in the ultrasonic wave-based structural health monitoring (SHM) field. In this method, a range of high-frequency, surface-guided waves are excited on the structure using piezoceramic elements. The waves propagate on the structure and interact with internal or surface damage. Initially, baseline data for the intact structure are created by measuring the frequency transfer function between the excitation and sensing points. The integrity of the structure is evaluated by monitoring changes in the frequency spectra. The SuRE method has effectively been used for a variety of SHM applications including the detection of loose bolts, delamination in composite structures, internal corrosion in pipelines, and load and impact monitoring. Data obtained using the SuRE method were used for identifying the location of the applied load on a laminated composite plate using a Support Vector Machine (SVM). A set of two piezoelectric elements was attached to the surface of the plate. A sweep excitation (150-250 kHz) generated surface-guided waves, and the transmitted waves were monitored at the sensory positions. The reference data set was measured simultaneously from the sensors. The plate was subjected to static loads while health monitoring data were being captured using the SuRE method. The confusion matrix indicated that the model classified correctly with up to 99.8% accuracy.
Waterbodies Extraction from LANDSAT8-OLI Imagery Using a Water Index-Guided Stochastic Fully-Connected Conditional Random Field Model and the Support Vector Machine

NASA Astrophysics Data System (ADS)

Wang, X.; Xu, L.

2018-04-01

One of the most important applications of remote sensing classification is water extraction. The water index (WI) based on Landsat images is one of the most common ways to distinguish water bodies from other land surface features. But conventional WI methods take into account spectral information from only a limited number of bands, and therefore the accuracy of those WI methods may be constrained in some areas which are covered with snow/ice, clouds, etc. An accurate and robust water extraction method is therefore a key research need at present. The support vector machine (SVM), which uses spectral information from all bands, can reduce these classification errors to some extent. Nevertheless, SVM, which barely considers spatial information, is relatively sensitive to noise in local regions. Conditional random field (CRF), which considers both spatial information and spectral information, has proven to be able to compensate for these limitations. Hence, in this paper, we develop a systematic water extraction method by taking advantage of the complementarity between the SVM and a water index-guided stochastic fully-connected conditional random field (SVM-WIGSFCRF) to address the above issues. In addition, we comprehensively evaluate the reliability and accuracy of the proposed method using Landsat-8 operational land imager (OLI) images of one test site. We assess the method's performance by calculating the following accuracy metrics: Omission Errors (OE) and Commission Errors (CE); Kappa coefficient (KP) and Total Error (TE). Experimental results show that the new method can improve target detection accuracy under complex and changeable environments.
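The four accuracy metrics listed in this abstract can be computed from a binary water/non-water confusion matrix. The sketch below uses the common remote-sensing definitions, which are an assumption here because the abstract does not spell out its formulas; the function name and the pixel counts are hypothetical.

```python
def water_accuracy(tp, fp, fn, tn):
    """Accuracy metrics for a binary water / non-water classification.

    tp: water pixels labelled water      fp: non-water labelled water
    fn: water pixels labelled non-water  tn: non-water labelled non-water
    """
    n = tp + fp + fn + tn
    omission = fn / (tp + fn)        # share of true water that was missed
    commission = fp / (tp + fp)      # share of predicted water that is wrong
    total_error = (fp + fn) / n
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return {"OE": omission, "CE": commission, "KP": kappa, "TE": total_error}


# Hypothetical pixel counts (illustrative only).
print(water_accuracy(tp=900, fp=50, fn=100, tn=8950))
```

The sketch does not guard against degenerate inputs such as an image with no water pixels, where the omission error is undefined.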
Vector-model-supported optimization in volumetric-modulated arc stereotactic radiotherapy planning for brain metastasis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Eva Sau Fan; Department of Health Technology and Informatics, The Hong Kong Polytechnic University; Wu, Vincent Wing Cheung

Long planning time in volumetric-modulated arc stereotactic radiotherapy (VMA-SRT) cases can limit its clinical efficiency and use. A vector model could retrieve previously successful radiotherapy cases that share various common anatomic features with the current case. The present study aimed to develop a vector model that could reduce planning time by applying the optimization parameters from those retrieved reference cases. Thirty-six VMA-SRT cases of brain metastasis (gender, male [n = 23], female [n = 13]; age range, 32 to 81 years old) were collected and used as a reference database. Another 10 VMA-SRT cases were planned with both conventional optimization and vector-model-supported optimization, following the oncologists' clinical dose prescriptions. Planning time and plan quality measures were compared using the 2-sided paired Wilcoxon signed rank test with a significance level of 0.05, with positive false discovery rate (pFDR) of less than 0.05. With vector-model-supported optimization, there was a significant reduction in the median planning time, a 40% reduction from 3.7 to 2.2 hours (p = 0.002, pFDR = 0.032), and for the number of iterations, a 30% reduction from 8.5 to 6.0 (p = 0.006, pFDR = 0.047). The quality of plans from both approaches was comparable. From these preliminary results, vector-model-supported optimization can expedite the optimization of VMA-SRT for brain metastasis while maintaining plan quality.
A native promoter and inclusion of an intron is necessary for efficient expression of GFP or mRFP in Armillaria mellea

PubMed Central

Ford, Kathryn L.; Baumgartner, Kendra; Henricot, Béatrice; Bailey, Andy M.; Foster, Gary D.

2016-01-01

Armillaria mellea is a significant pathogen that causes Armillaria root disease on numerous hosts in forests, gardens and agricultural environments worldwide. Using a yeast-adapted pCAMBIA0380 Agrobacterium vector, we have constructed a series of vectors for transformation of A. mellea, assembled using yeast-based recombination methods. These have been designed to allow easy exchange of promoters and inclusion of introns. The vectors were first tested by transformation into basidiomycete Clitopilus passeckerianus to ascertain vector functionality then used to transform A. mellea. We show that heterologous promoters from the basidiomycetes Agaricus bisporus and Phanerochaete chrysosporium that were used successfully to control the hygromycin resistance cassette were not able to support expression of mRFP or GFP in A. mellea. The endogenous A. mellea gpd promoter delivered efficient expression, and we show that inclusion of an intron was also required for transgene expression. GFP and mRFP expression was stable in mycelia and fluorescence was visible in transgenic fruiting bodies and GFP was detectable in planta. Use of these vectors has been successful in giving expression of the fluorescent proteins GFP and mRFP in A. mellea, providing an additional molecular tool for this pathogen. PMID:27384974
Evaluation of laser cutting process with auxiliary gas pressure by soft computing approach

NASA Astrophysics Data System (ADS)

Lazov, Lyubomir; Nikolić, Vlastimir; Jovic, Srdjan; Milovančević, Miloš; Deneva, Heristina; Teirumenieka, Erika; Arsic, Nebojsa

2018-06-01

Evaluation of the optimal laser cutting parameters is very important for high cut quality. Laser cutting is a highly nonlinear process with many parameters, which is the main challenge in its optimization. Data mining is one of the most versatile methodologies that can be used for laser cutting process optimization. The support vector regression (SVR) procedure was implemented since it is a versatile and robust technique for regression of highly nonlinear data. The goal of this study was to determine the optimal laser cutting parameters to ensure robust conditions for minimization of average surface roughness. Three cutting parameters, the cutting speed, the laser power, and the assist gas pressure, were used in the investigation. A TruLaser 1030 technological system was used as the laser, and nitrogen was used as the assist gas in the laser cutting process. Data mining prediction accuracy was very high according to the coefficient of determination (R2) and the root mean square error (RMSE): R2 = 0.9975 and RMSE = 0.0337. Therefore, the data mining approach could be used effectively for determination of the optimal conditions of the laser cutting process.
A Semisupervised Support Vector Machines Algorithm for BCI Systems

PubMed Central

Qin, Jianzhao; Li, Yuanqing; Sun, Wei

2007-01-01

As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141
Micro-Raman spectroscopic identification of bacterial cells of the genus Staphylococcus and dependence on their cultivation conditions.

PubMed

Harz, M; Rösch, P; Peschke, K-D; Ronneberger, O; Burkhardt, H; Popp, J

2005-11-01

Microbial contamination is not only a medical problem, but also plays a large role in pharmaceutical clean room production and food processing technology. Therefore many techniques were developed to achieve differentiation and identification of microorganisms. Among these methods vibrational spectroscopic techniques (IR, Raman and SERS) are useful tools because of their rapidity and sensitivity. Recently we have shown that micro-Raman spectroscopy in combination with a support vector machine is an extremely capable approach for a fast and reliable, non-destructive online identification of single bacteria belonging to different genera. In order to simulate different environmental conditions we analyzed in this contribution different Staphylococcus strains with varying cultivation conditions in order to evaluate our method with a reliable dataset. First, micro-Raman spectra of the bulk material and single bacterial cells that were grown under the same conditions were recorded and used separately for a distinct chemotaxonomic classification of the strains. Furthermore Raman spectra were recorded from single bacterial cells that were cultured under various conditions to study the influence of cultivation on the discrimination ability. This dataset was analyzed both with a hierarchical cluster analysis (HCA) and a support vector machine (SVM).
Structural features that predict real-value fluctuations of globular proteins

PubMed Central

Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

2012-01-01

It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics trajectories of non-homologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real-value of residue fluctuations using support vector regression. It was found that some structural features have higher correlation than crystallographic B-factors with fluctuations observed in molecular dynamics trajectories. Moreover, support vector regression that uses combinations of static structural features showed accurate prediction of fluctuations with an average Pearson’s correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed for the prediction by the Gaussian network model. An advantage of the developed method over the Gaussian network models is that the former predicts the real-value of fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. PMID:22328193
Support vector machine and principal component analysis for microarray data classification

NASA Astrophysics Data System (ADS)

Astuti, Widi; Adiwijaya

2018-03-01

Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, microarray technology has taken an important role in the diagnosis of cancer. By using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by a small number of samples but a very high dimension. Consequently, it is a challenge for researchers to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme can obtain 100% accuracy for the Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.
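The 5-fold cross validation mentioned in this abstract partitions the samples into five disjoint folds and uses each fold once as the test set. A minimal sketch of the index bookkeeping only, with a hypothetical function name, a fixed shuffle seed chosen for repeatability, and no stratification by class (whether the study stratified its folds is not stated):

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.

    Samples are shuffled once, dealt into k folds of near-equal size,
    and each fold serves as the test set exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With few samples, as is typical for microarray data, every fold is small.
for train, test in k_fold_indices(23, k=5):
    print(len(train), len(test))
```

In a full pipeline, any dimension reduction such as PCA would normally be fitted on the training indices of each split only, so that the test fold stays unseen.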
A Novel RSSI Prediction Using Imperialist Competition Algorithm (ICA), Radial Basis Function (RBF) and Firefly Algorithm (FFA) in Wireless Networks

PubMed Central

Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M.

2016-01-01

This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF–FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model’s performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF–FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF–FFA model can be applied as an efficient technique for the accurate prediction of vertical handover. PMID:27438600
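The four performance measures named in this abstract can be sketched in a few lines. The definitions below are the common ones and are an assumption, since the abstract does not give formulas (in particular, R2 is taken here as one minus the ratio of residual to total sum of squares rather than as the square of r). The function name and the RSSI values are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, Pearson r, RMSE and MAPE (in percent) for paired values.

    Assumes non-constant inputs and no zero in `actual`.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "r": cov / math.sqrt(ss_tot * ss_pred),
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted)),
    }


# Hypothetical RSSI readings in dBm (illustrative only).
measured = [-60.0, -65.0, -72.0, -80.0]
forecast = [-61.0, -64.0, -73.5, -79.0]
print(regression_metrics(measured, forecast))
```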

A Novel RSSI Prediction Using Imperialist Competition Algorithm (ICA), Radial Basis Function (RBF) and Firefly Algorithm (FFA) in Wireless Networks.

PubMed

Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M

2016-01-01

This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF-FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model's performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF-FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF-FFA model can be applied as an efficient technique for the accurate prediction of vertical handover.
Extracting Information from Electronic Medical Records to Identify the Obesity Status of a Patient Based on Comorbidities and Bodyweight Measures.

PubMed

Figueroa, Rosa L; Flores, Christopher A

2016-08-01

Obesity is a chronic disease with an increasing impact on the world's population. In this work, we present a method of identifying obesity automatically using text mining techniques and information related to body weight measures and obesity comorbidities. We used a dataset of 3015 de-identified medical records that contain labels for two classification problems. The first classification problem distinguishes between obesity, overweight, normal weight, and underweight. The second classification problem differentiates between obesity types: super obesity, morbid obesity, severe obesity and moderate obesity. We used a Bag of Words approach to represent the records together with unigram and bigram representations of the features. We implemented two approaches: a hierarchical method and a nonhierarchical one. We used Support Vector Machine and Naïve Bayes together with ten-fold cross validation to evaluate and compare performances. Our results indicate that the hierarchical approach does not work as well as the nonhierarchical one. In general, our results show that Support Vector Machine obtains better performances than Naïve Bayes for both classification problems. We also observed that bigram representation improves performance compared with unigram representation.
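The Bag of Words representation with unigram and bigram features mentioned above can be sketched as follows. The tokenizer (lowercasing and whitespace splitting) and the example sentence are assumptions made for illustration; the abstract does not describe the preprocessing actually used.

```python
from collections import Counter

def bag_of_ngrams(text, use_bigrams=True):
    """Count unigrams, and optionally bigrams, in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    if use_bigrams:
        # A bigram is a pair of adjacent tokens, stored as one space-joined string.
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return counts

# Hypothetical record text (not taken from the dataset).
record = "patient has morbid obesity and sleep apnea"
features = bag_of_ngrams(record)
unigram_only = bag_of_ngrams(record, use_bigrams=False)
```

The bigram "morbid obesity" appears as its own feature only in the bigram representation, which is one plausible reason bigrams help distinguish obesity types.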
Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation

PubMed Central

2018-01-01

Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site. PMID:29370230
Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation.

PubMed

Illias, Hazlee Azil; Zhao Liang, Wee

2018-01-01

Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site.
Biomarkers of Eating Disorders Using Support Vector Machine Analysis of Structural Neuroimaging Data: Preliminary Results

PubMed Central

Cerasa, Antonio; Castiglioni, Isabella; Salvatore, Christian; Funaro, Angela; Martino, Iolanda; Alfano, Stefania; Donzuso, Giulia; Perrotta, Paolo; Gioia, Maria Cecilia; Gilardi, Maria Carla; Quattrone, Aldo

2015-01-01

Presently, there are no valid biomarkers to identify individuals with eating disorders (ED). The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM) technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa) were compared against 17 body mass index-matched healthy controls (HC). Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice. PMID:26648660
Full-motion video analysis for improved gender classification

NASA Astrophysics Data System (ADS)

Flora, Jeffrey B.; Lochtefeld, Darrell F.; Iftekharuddin, Khan M.

2014-06-01

The ability of computer systems to perform gender classification using the dynamic motion of the human subject has important applications in medicine, human factors, and human-computer interface systems. Previous works in motion analysis have used data from sensors (including gyroscopes, accelerometers, and force plates), radar signatures, and video. However, full-motion video, motion capture, and range data provide a dataset with higher temporal and spatial resolution for the analysis of dynamic motion. Works using motion capture data have been limited by small datasets in a controlled environment. In this paper, we apply machine learning techniques to a new dataset that has a larger number of subjects. Additionally, these subjects move unrestricted through a capture volume, representing a more realistic, less controlled environment. We conclude that existing linear classification methods are insufficient for gender classification on a larger dataset captured in a relatively uncontrolled environment. A method based on a nonlinear support vector machine classifier is proposed to obtain gender classification for the larger dataset. In experimental testing with a dataset consisting of 98 trials (49 subjects, 2 trials per subject), classification rates using leave-one-out cross-validation are improved from 73% using linear discriminant analysis to 88% using the nonlinear support vector machine classifier.
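Leave-one-out cross-validation, the evaluation protocol named above, holds out each sample once and trains on the rest. The sketch below shows the loop; the nearest-centroid classifier is a deliberately simple stand-in (not the nonlinear SVM or the linear discriminant analysis of the paper), and the data points are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Toy stand-in classifier: return the label of the closest class mean."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - q) ** 2 for s, q in zip(sums[label], query))

    return min(sums, key=sq_dist)

def leave_one_out_accuracy(xs, ys, predict):
    """Hold out each sample in turn, train on the others, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Hypothetical two-feature samples in two well-separated classes.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
ys = ["a", "a", "a", "b", "b", "b"]
accuracy = leave_one_out_accuracy(xs, ys, nearest_centroid_predict)
```

Any classifier with the same `predict(train_x, train_y, query)` shape can be passed in, so the same loop could compare a linear and a nonlinear model on identical folds.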
Classifying Physical Morphology of Cocoa Beans Digital Images using Multiclass Ensemble Least-Squares Support Vector Machine

NASA Astrophysics Data System (ADS)

Lawi, Armin; Adhitya, Yudhi

2018-03-01

The objective of this research is to determine the quality of cocoa beans through the morphology of their digital images. Samples of cocoa beans were scattered on bright white paper under controlled lighting conditions. A compact digital camera was used to capture the images. The images were then processed to extract their morphological parameters. The classification process begins with an analysis of the cocoa bean images based on morphological feature extraction. The extracted morphological (physical) feature parameters are Area, Perimeter, Major Axis Length, Minor Axis Length, Aspect Ratio, Circularity, Roundness, and Feret Diameter. The cocoa beans are classified into four groups: Normal Beans, Broken Beans, Fractured Beans, and Skin Damaged Beans. The classification model used in this paper is the Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM), a proposed improvement of SVM using an ensemble method in which the separating hyperplanes are obtained by a least-squares approach and the multiclass procedure uses the One-Against-All method. The results show that classification with the morphological feature input parameters reached an accuracy of 99.705% for the four classes.
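Three of the listed features are simple ratios of the measured quantities. The sketch below uses definitions that are common in image-analysis software; whether the paper uses exactly these formulas is an assumption, and the function name is hypothetical.

```python
import math

def shape_descriptors(area, perimeter, major_axis, minor_axis):
    """Derived shape features from basic measurements (commonly used definitions)."""
    return {
        # Elongation of the fitted ellipse; 1.0 for a circle.
        "aspect_ratio": major_axis / minor_axis,
        # 4*pi*A / P^2; 1.0 for a perfect circle, smaller for ragged outlines.
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        # 4*A / (pi * major^2); 1.0 for a circle, smaller for elongated shapes.
        "roundness": 4.0 * area / (math.pi * major_axis ** 2),
    }

# Sanity check on a unit-radius circle: all three descriptors should equal 1.
circle = shape_descriptors(math.pi, 2.0 * math.pi, 2.0, 2.0)
```

Because these descriptors are dimensionless, they are unaffected by image scale, which makes them convenient inputs alongside the raw area and perimeter.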
PlasmoGEM, a database supporting a community resource for large-scale experimental genetics in malaria parasites.

PubMed

Schwach, Frank; Bushell, Ellen; Gomes, Ana Rita; Anar, Burcu; Girling, Gareth; Herd, Colin; Rayner, Julian C; Billker, Oliver

2015-01-01

The Plasmodium Genetic Modification (PlasmoGEM) database (http://plasmogem.sanger.ac.uk) provides access to a resource of modular, versatile and adaptable vectors for genome modification of Plasmodium spp. parasites. PlasmoGEM currently consists of >2000 plasmids designed to modify the genome of Plasmodium berghei, a malaria parasite of rodents, which can be requested by non-profit research organisations free of charge. PlasmoGEM vectors are designed with long homology arms for efficient genome integration and carry gene specific barcodes to identify individual mutants. They can be used for a wide array of applications, including protein localisation, gene interaction studies and high-throughput genetic screens. The vector production pipeline is supported by a custom software suite that automates both the vector design process and quality control by full-length sequencing of the finished vectors. The PlasmoGEM web interface allows users to search a database of finished knock-out and gene tagging vectors, view details of their designs, download vector sequence in different formats and view available quality control data as well as suggested genotyping strategies. We also make gDNA library clones and intermediate vectors available for researchers to produce vectors for themselves. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Method and apparatus for enhanced detection of toxic agents

DOEpatents

Greenbaum, Elias; Rodriguez, Jr., Miguel; Wu, Jie Jayne; Qi, Hairong

2013-10-01

A biosensor based detection of toxins includes enhancing a fluorescence signal by concentrating a plurality of photosynthetic organisms in a fluid into a concentrated region using biased AC electro-osmosis. A measured photosynthetic activity of the photosynthetic organisms is obtained in the concentrated region, where chemical, biological or radiological agents reduce a nominal photosynthetic activity of the photosynthetic organisms. A presence of the chemical, biological and/or radiological agents or precursors thereof, is determined in the fluid based on the measured photosynthetic activity of the concentrated plurality of photosynthetic organisms. A lab-on-a-chip system is used for the concentrating step. The presence of agents is determined from feature vectors, obtained from processing a time dependent signal using amplitude statistics and/or time-frequency analysis, relative to a control signal. A linear discriminant method including support vector machine classification (SVM) is used to identify the agents.
Vectorized Jiles-Atherton hysteresis model

NASA Astrophysics Data System (ADS)

Szymański, Grzegorz; Waszak, Michał

2004-01-01

This paper deals with vector hysteresis modeling. A vector model consisting of individual Jiles-Atherton components placed along principal axes is proposed. The cross-axis coupling ensures general vector model properties. Minor loops are obtained using scaling method. The model is intended for efficient finite element method computations defined in terms of magnetic vector potential. Numerical efficiency is ensured by differential susceptibility approach.
Effective data compaction algorithm for vector scan EB writing system

NASA Astrophysics Data System (ADS)

Ueki, Shinichi; Ashida, Isao; Kawahira, Hiroichi

2001-01-01

We have developed a new mask data compaction algorithm dedicated to vector scan electron beam (EB) writing systems for 0.13 μm device generation. Large mask data size has become a significant problem at mask data processing for which data compaction is an important technique. In our new mask data compaction, 'array' representation and 'cell' representation are used. The mask data format for the EB writing system with vector scan supports these representations. The array representation has a pitch and a number of repetitions in both X and Y direction. The cell representation has a definition of figure group and its reference. The new data compaction method has the following three steps. (1) Search arrays of figures by selecting pitches of array so that a number of figures are included. (2) Find out same arrays that have same repetitive pitch and number of figures. (3) Search cells of figures, where the figures in each cell take identical positional relationship. By this new method for the mask data of a 4M-DRAM block gate layer with peripheral circuits, 202 Mbytes without compaction was highly compacted to 6.7 Mbytes in 20 minutes on a 500 MHz PC.
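The 'array' representation described above stores a start position, a pitch, and a repetition count instead of every figure. The following is a one-dimensional sketch of that idea under simplifying assumptions (identical figures, integer coordinates, greedy grouping); it is not the authors' algorithm, which works in both X and Y and also builds 'cell' representations.

```python
def compact_arrays(positions):
    """Greedily group sorted 1-D positions into (start, pitch, count) runs."""
    runs = []
    i = 0
    while i < len(positions):
        start = positions[i]
        if i + 1 == len(positions):
            # A single leftover figure is stored as a run of length one.
            runs.append((start, 0, 1))
            break
        pitch = positions[i + 1] - start
        count = 2
        # Extend the run while the spacing stays equal to the first pitch.
        while (i + count < len(positions)
               and positions[i + count] - positions[i + count - 1] == pitch):
            count += 1
        runs.append((start, pitch, count))
        i += count
    return runs

def expand_arrays(runs):
    """Inverse operation: regenerate every position from the compact runs."""
    return [start + k * pitch for start, pitch, count in runs for k in range(count)]

# Hypothetical figure origins along one axis, in arbitrary grid units.
positions = [0, 10, 20, 30, 100, 105, 110, 500]
runs = compact_arrays(positions)
```

Eight positions collapse to three runs here, and `expand_arrays` recovers the original list, so the representation is lossless.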
Naive Bayes Bearing Fault Diagnosis Based on Enhanced Independence of Data

PubMed Central

Zhang, Nannan; Wu, Lifeng; Yang, Jing; Guan, Yong

2018-01-01

The bearing is the key component of rotating machinery, and its performance directly determines the reliability and safety of the system. Data-based bearing fault diagnosis has become a research hotspot. Naive Bayes (NB), which is based on an independence assumption, is widely used in fault diagnosis. However, the bearing data are not completely independent, which reduces the performance of NB algorithms. In order to solve this problem, we propose a NB bearing fault diagnosis method based on enhanced independence of data. The method deals with data vectors from two aspects: the attribute feature and the sample dimension. After processing, the limitation that the independence hypothesis places on NB classification is reduced. First, we extract the statistical characteristics of the original signal of the bearings effectively. Then, the Decision Tree algorithm is used to select the important features of the time domain signal, and the low correlation features are selected. Next, the Selective Support Vector Machine (SSVM) is used to prune the dimension data and remove redundant vectors. Finally, we use NB to diagnose the fault with the low correlation data. The experimental results show that the independent enhancement of data is effective for bearing fault diagnosis. PMID:29401730
Remote sensing of suspended sediment water research: principles, methods, and progress

NASA Astrophysics Data System (ADS)

Shen, Ping; Zhang, Jing

2011-12-01

In this paper, we review the principles, data, methods, and steps of suspended sediment research using remote sensing, summarize some representative models and methods, and analyze the deficiencies of existing methods. Drawing on recent progress in remote sensing theory and its application to suspended sediment research in water, we introduce data processing methods such as atmospheric correction and adjacency effect correction, together with intelligent algorithms such as neural networks, genetic algorithms, and support vector machines, into suspended sediment inversion research. Combining these with other geographic information and Bayesian theory, we improve the precision of suspended sediment inversion and aim to provide a reference for researchers in the field.
ℓ(p)-Norm multikernel learning approach for stock market price forecasting.

PubMed

Shao, Xigao; Wu, Kun; Liao, Bifeng

2012-01-01

Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ(1)-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ(p)-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ(1)-norm multiple support vector regression model.
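In multiple kernel learning the model works with a weighted sum of base kernel matrices, and the ℓ(p)-norm variant constrains the weight vector to a fixed ℓ(p) norm. The sketch below shows only that combination step; the helper names and the tiny matrices are hypothetical, and the interleaved optimization that actually learns the weights is not reproduced.

```python
def lp_normalize(weights, p):
    """Rescale non-negative kernel weights so that their l_p norm equals 1."""
    norm = sum(abs(w) ** p for w in weights) ** (1.0 / p)
    return [w / norm for w in weights]

def combine_kernels(kernels, weights):
    """Entry-wise weighted sum of equally sized kernel matrices."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Two hypothetical 2x2 base kernel matrices.
k_identity = [[1.0, 0.0], [0.0, 1.0]]
k_ones = [[1.0, 1.0], [1.0, 1.0]]

beta = lp_normalize([3.0, 4.0], p=2)          # unit l_2 norm
combined = combine_kernels([k_identity, k_ones], beta)
```

With p = 1 the constraint tends to drive many weights to zero (a sparse mixture), while larger p spreads weight across kernels, which is the motivation the abstract gives for moving beyond the ℓ(1)-norm model.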
Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

PubMed Central

HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

2017-01-01

Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. PMID:29275361
Instructional Videos for Unsupervised Harvesting and Learning of Action Examples

DTIC Science & Technology

2014-11-03

collection of image or video annotations has been tackled in different ways, but most existing methods still require a human in the loop. The...the views of ARO and NSF. 7. REFERENCES [1] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. In ACM Transactions on...feature encoding methods. In BMVC, 2011. [3] J. Chen, Y. Cui, G. Ye, D. Liu, and S.-F. Chang. Event-driven semantic concept discovery by exploiting
Anytime query-tuned kernel machine classifiers via Cholesky factorization

NASA Technical Reports Server (NTRS)

DeCoste, D.

2002-01-01

We recently demonstrated 2- to 64-fold query-time speedups of Support Vector Machine and Kernel Fisher classifiers via a new computational geometry method for anytime output bounds (DeCoste, 2002). This new paper refines our approach in two key ways. First, we introduce a simple linear algebra formulation based on Cholesky factorization, yielding simpler equations and lower computational overhead. Second, this new formulation suggests new methods for achieving additional speedups, including tuning on query samples. We demonstrate effectiveness on benchmark datasets.
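For readers unfamiliar with the factorization named above, the sketch below computes the Cholesky factor of a small symmetric positive-definite matrix, such as a kernel matrix. It illustrates only the factorization itself; how the paper uses it to bound classifier outputs is not shown, and the example matrix is hypothetical.

```python
import math

def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already computed parts of rows i and j.
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

# Hypothetical 2x2 kernel matrix.
kernel = [[4.0, 2.0], [2.0, 3.0]]
factor = cholesky(kernel)
```

The factor is built one row at a time from earlier rows only, a property that suits incremental or "anytime" computation.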
Identification of Migratory Insects from their Physical Features using a Decision-Tree Support Vector Machine and its Application to Radar Entomology.

PubMed

Hu, Cheng; Kong, Shaoyang; Wang, Rui; Long, Teng; Fu, Xiaowei

2018-04-03

Migration is a key process in the population dynamics of numerous insect species, including many that are pests or vectors of disease. Identification of insect migrants is critically important to studies of insect migration. Radar is an effective means of monitoring nocturnal insect migrants. However, species identification of migrating insects is often unachievable with current radar technology. Special-purpose entomological radar can measure radar cross-sections (RCSs) from which the insect mass, wingbeat frequency and body length-to-width ratio (a measure of morphological form) can be estimated. These features may be valuable for species identification. This paper explores the identification of insect migrants based on the mass, wingbeat frequency and length-to-width ratio, and body length is also introduced to assess the benefit of adding another variable. A total of 23 species of migratory insects captured by a searchlight trap are used to develop a classification model based on decision-tree support vector machine method. The results reveal that the identification accuracy exceeds 80% for all species if the mass, wingbeat frequency and length-to-width ratio are utilized, and the addition of body length is shown to further increase accuracy. It is also shown that improving the precision of the measurements leads to increased identification accuracy.
Support vector machine incremental learning triggered by wrongly predicted samples

NASA Astrophysics Data System (ADS)

Tang, Ting-long; Guan, Qiu; Wu, Yi-rong

2018-05-01

According to the classic Karush-Kuhn-Tucker (KKT) theorem, at every step of incremental support vector machine (SVM) learning, a newly added sample that violates the KKT conditions becomes a new support vector (SV) and may cause old samples to migrate between the SV set and the non-support vector (NSV) set; at the same time, the learning model should be updated based on the SVs. However, it is not clear in advance which of the old samples will change between SVs and NSVs. Additionally, the learning model may be updated unnecessarily, which does not greatly increase its accuracy but decreases the training speed. Therefore, how to choose the new SVs from the old sets during the incremental stages and when to perform incremental steps greatly influence the accuracy and efficiency of incremental SVM learning. In this work, a new algorithm is proposed that simultaneously selects candidate SVs and uses wrongly predicted samples to trigger the incremental processing. Experimental results show that the proposed algorithm can achieve good performance with high efficiency, high speed and good accuracy.
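The distinction the abstract draws, between a new sample that merely violates the KKT conditions and one that is actually predicted wrongly, can be made concrete for a linear decision function f(x) = w·x + b with labels in {-1, +1}. The sketch below is an illustration of that distinction only; the function names are hypothetical and the authors' candidate-SV selection is not reproduced.

```python
def decision_value(w, b, x):
    """Linear decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify_new_sample(w, b, x, y):
    """Categorize a new labelled sample (y in {-1, +1}) against the current model."""
    margin = y * decision_value(w, b, x)
    if margin <= 0:
        # Wrong side of the boundary: the prediction itself is wrong.
        return "misclassified"
    if margin < 1:
        # Correctly predicted but inside the margin: a KKT violation
        # for a sample that is not yet a support vector.
        return "kkt_violation"
    return "satisfied"

# Hypothetical current model.
w, b = [1.0, 0.0], 0.0
```

Updating only on "misclassified" samples, rather than on every "kkt_violation", means fewer retraining steps, which is the trade-off between speed and accuracy that the abstract discusses.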
Quantum Support Vector Machine for Big Data Classification

NASA Astrophysics Data System (ADS)

Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth

2014-09-01

Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.
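The abstract reduces training to inverting a matrix built from the kernel matrix. A classical counterpart is the least-squares SVM, in which the bias b and coefficients α solve one linear system; treating that as the relevant formulation is an assumption here, since the abstract only mentions kernel-matrix inversion. The sketch below solves the system classically with a linear kernel and says nothing about the quantum routine itself; all names and data are hypothetical.

```python
def solve_linear_system(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_ls_svm(xs, ys, gamma=10.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b, alpha] = [0, y] for a linear kernel."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [dot(xs[i], xs[j]) + (1.0 / gamma if i == j else 0.0) for j in range(n)]
        system.append([1.0] + row)
    solution = solve_linear_system(system, [0.0] + list(ys))
    return solution[0], solution[1:]  # bias b, coefficients alpha

def predict(xs, alphas, b, query):
    score = b + sum(a * dot(x, query) for a, x in zip(alphas, xs))
    return 1 if score >= 0 else -1

# Hypothetical one-dimensional training set with labels -1 and +1.
xs = [[-2.0], [-1.0], [1.0], [2.0]]
ys = [-1.0, -1.0, 1.0, 1.0]
b, alphas = train_ls_svm(xs, ys)
```

Classically, solving this system by elimination scales cubically with the number of training examples, which is the cost the quantum algorithm is claimed to reduce.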

Automated innovative diagnostic, data management and communication tool, for improving malaria vector control in endemic settings.

PubMed

Vontas, John; Mitsakakis, Konstantinos; Zengerle, Roland; Yewhalaw, Delenasaw; Sikaala, Chadwick Haadezu; Etang, Josiane; Fallani, Matteo; Carman, Bill; Müller, Pie; Chouaïbou, Mouhamadou; Coleman, Marlize; Coleman, Michael

2016-01-01

Malaria is a life-threatening disease that caused more than 400,000 deaths in sub-Saharan Africa in 2015. Mass prevention of the disease is best achieved by vector control which heavily relies on the use of insecticides. Monitoring mosquito vector populations is an integral component of control programs and a prerequisite for effective interventions. Several individual methods are used for this task; however, there are obstacles to their uptake, as well as challenges in organizing, interpreting and communicating vector population data. The Horizon 2020 project "DMC-MALVEC" consortium will develop a fully integrated and automated multiplex vector-diagnostic platform (LabDisk) for characterizing mosquito populations in terms of species composition, Plasmodium infections and biochemical insecticide resistance markers. The LabDisk will be interfaced with a Disease Data Management System (DDMS), a custom-made data management software which will collate and manage data from routine entomological monitoring activities, providing information in a timely fashion based on user needs and in a standardized way. ResistanceSim, a serious game and modern ICT platform that uses interactive ways of communicating guidelines and exemplifying good practices for the optimal use of interventions in the health sector, will also be a key element. The use of the tool will teach operational end users the value of quality data (relevant, timely and accurate) to make informed decisions. The integrated system (LabDisk, DDMS & ResistanceSim) will be evaluated in four malaria endemic countries (Cameroon, Ivory Coast, Ethiopia and Zambia), representative of malaria settings in sub-Saharan Africa with different levels of endemicity and vector control challenges, to support informed decision-making in vector control and disease management.
[New strategy for RNA vectorization in mammalian cells. Use of a peptide vector].

PubMed

Vidal, P; Morris, M C; Chaloin, L; Heitz, F; Divita, G

1997-04-01

A major barrier for gene delivery is the low permeability of nucleic acids to cellular membranes. The development of antisenses and gene therapy has focused mainly on improving methods of oligonucleotide or gene delivery to the cell. In this report we described a new strategy for RNA cell delivery, based on a short single peptide. This peptide vector is derived from both the fusion domain of the gp41 protein of HIV and the nuclear localization sequence of the SV40 large T antigen. This peptide vector localizes rapidly to the cytoplasm then to the nucleus of human fibroblasts (HS-68) within a few minutes and exhibits a high affinity for a single-stranded mRNA encoding the p66 subunit of the HIV-1 reverse transcriptase (in a 100 nM range). The peptide/RNA complex formation involves mainly electrostatic interactions between the basic residues of the peptide and the charges on the phosphate group of the RNA. In the presence of the peptide-vector fluorescently-labelled mRNA is delivered into the cytoplasm of mammalian cells (HS68 human fibroblasts) in less than 1 h with a relatively high efficiency (80%). This new concept based on a peptide-derived vector offers several advantages compared to other compounds commonly used in gene delivery. This vector is highly soluble and exhibits no cytotoxicity at the concentrations used for optimal gene delivery. This result clearly supports the fact that this peptide vector is a powerful tool and that it can be used widely, as much for laboratory research as for new applications and development in gene and/or antisense therapy.
Predicting residue-wise contact orders in proteins by support vector regression.

PubMed

Song, Jiangning; Burrage, Kevin

2006-10-03

The residue-wise contact order (RWCO) describes the sequence separations between the residues of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable important information to reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis.

PubMed

Liu, Bin; Wang, Xiaolong; Lin, Lei; Dong, Qiwen; Wang, Xuan

2008-12-01

Protein remote homology detection and fold recognition are central problems in bioinformatics. Currently, discriminative methods based on support vector machine (SVM) are the most effective and accurate methods for solving these problems. A key step to improve the performance of the SVM-based methods is to find a suitable representation of protein sequences. In this paper, a novel building block of proteins called Top-n-grams is presented, which contains the evolutionary information extracted from the protein sequence frequency profiles. The protein sequence frequency profiles are calculated from the multiple sequence alignments outputted by PSI-BLAST and converted into Top-n-grams. The protein sequences are transformed into fixed-dimension feature vectors by the occurrence times of each Top-n-gram. The training vectors are evaluated by SVM to train classifiers which are then used to classify the test protein sequences. We demonstrate that the prediction performance of remote homology detection and fold recognition can be improved by combining Top-n-grams and latent semantic analysis (LSA), which is an efficient feature extraction technique from natural language processing. When tested on superfamily and fold benchmarks, the method combining Top-n-grams and LSA gives significantly better results compared to related methods. The method based on Top-n-grams significantly outperforms the methods based on many other building blocks including N-grams, patterns, motifs and binary profiles. Therefore, Top-n-gram is a good building block of the protein sequences and can be widely used in many tasks of the computational biology, such as the sequence alignment, the prediction of domain boundary, the designation of knowledge-based potentials and the prediction of protein binding sites.
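The conversion from frequency profiles to fixed-dimension count vectors can be sketched as follows. The reading used here, that each profile column contributes a "word" made of its n most frequent amino acids in descending order of frequency, is an assumption based on the abstract's description; the profile values are hypothetical rather than PSI-BLAST output.

```python
from collections import Counter

def top_n_grams(profile, n):
    """Turn each profile column into a word of its n most frequent amino acids."""
    words = []
    for column in profile:
        # Sort by descending frequency; ties are broken alphabetically.
        ranked = sorted(column, key=lambda aa: (-column[aa], aa))
        words.append("".join(ranked[:n]))
    return words

# Hypothetical frequency profile: one dict of amino acid frequencies per position.
profile = [
    {"A": 0.6, "G": 0.3, "S": 0.1},
    {"L": 0.5, "I": 0.4, "V": 0.1},
    {"A": 0.7, "G": 0.2, "S": 0.1},
]
words = top_n_grams(profile, n=2)
counts = Counter(words)
```

Counting each word over a fixed vocabulary (20^n possible words for n letters) gives a vector of the same length for every protein, which is the fixed-dimension input an SVM needs.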
Population of Aedes sp in Highland of Wonosobo District and Its Competence as A Dengue Vector

NASA Astrophysics Data System (ADS)

Martini, Martini; Widjanarko, Bagoes; Hestiningsih, Retno; Purwantisari, Susiana; Yuliawati, Sri

2017-02-01

Dengue fever cases have increased in the highland of Wonosobo District, and the epidemic that took place in 2009 reached 59.3 cases per 100,000 population. This study aimed to describe the vector competence of mosquitoes as dengue vectors in the highland of Wonosobo District, Central Java Province. A series of laboratory experiments was carried out to measure vector competence, complemented by a vector bionomic study. The sample comprised 20 villages located in Wonosobo sub-district, and about 15-20 houses were observed in each village. The observed variables were vector competence, bionomics, transovarial infection level, and virus titer in the mosquitoes after injection. Immunohistochemistry (IHC) methods were used to identify transovarial infection status. The numbers of Ae. aegypti and Ae. albopictus were almost equal, and both species were found indoors and outdoors. Based on the HI and OI indices, larval density in the highland was higher than the program standard. Transovarial infection was found in Ae. aegypti and Ae. albopictus. Environmental parameters such as temperature and relative humidity fulfilled the optimum requirements to support the vectors’ life cycle. Transovarial infection has been demonstrated, which indicates that local transmission has occurred in this area. The virus titer also increased day by day, indicating that the mosquitoes are able to act as vectors. As is done in other areas, it is important to eliminate breeding places (PSN) indoors as well as outdoors through active community participation in the highland area.
TU-H-CAMPUS-JeP2-03: Machine-Learning-Based Delineation Framework of GTV Regions of Solid and Ground Glass Opacity Lung Tumors at Datasets of Planning CT and PET/CT Images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ikushima, K; Arimura, H; Jin, Z

Purpose: In radiation treatment planning, delineation of gross tumor volume (GTV) is very important, because the GTVs affect the accuracy of the radiation therapy procedure. To assist radiation oncologists in the delineation of GTV regions during treatment planning for lung cancer, we have proposed a machine-learning-based delineation framework of GTV regions of solid and ground glass opacity (GGO) lung tumors followed by an optimum contour selection (OCS) method. Methods: Our basic idea was to feed voxel-based image features around GTV contours determined by radiation oncologists into a machine learning classifier in the training step, after which the classifier produced the degree of GTV for each voxel in the testing step. Ten data sets of planning CT and PET/CT images were selected for this study. The support vector machine (SVM), which learned voxel-based features including the voxel value and the magnitudes of the image gradient vector obtained from each voxel in the planning CT and PET/CT images, extracted initial GTV regions. The final GTV regions were determined using the OCS method, which was able to select a global optimum object contour based on multiple active delineations with a level set method around the GTV. To evaluate the results of the proposed framework for ten cases (solid: 6, GGO: 4), we used the three-dimensional Dice similarity coefficient (DSC), which denoted the degree of region similarity between the GTVs delineated by radiation oncologists and the proposed framework. Results: The proposed method achieved an average three-dimensional DSC of 0.81 for ten lung cancer patients, while a standardized uptake value-based method segmented GTV regions with a DSC of 0.43. The average DSCs for solid and GGO were 0.84 and 0.76, respectively, obtained by the proposed framework. Conclusion: The proposed framework with the support vector machine may be useful for assisting radiation oncologists in delineating solid and GGO lung tumors.
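The Dice similarity coefficient used for evaluation in this abstract has a simple closed form, DSC = 2|A ∩ B| / (|A| + |B|). The sketch below applies it to two regions given as sets of voxel coordinates. The function name `dice_coefficient` and the toy regions are illustrative assumptions and are unrelated to the study's data.

```python
def dice_coefficient(region_a, region_b):
    """Dice similarity coefficient between two voxel sets.

    Each region is an iterable of (x, y, z) voxel coordinates. Two empty
    regions are treated as identical (DSC = 1.0), a convention chosen here.
    """
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Toy example: two 4 x 4 x 2 blocks shifted by one voxel along x.
manual = {(x, y, z) for x in range(0, 4) for y in range(4) for z in range(2)}
automatic = {(x, y, z) for x in range(1, 5) for y in range(4) for z in range(2)}
overlap_score = dice_coefficient(manual, automatic)  # 2 * 24 / (32 + 32) = 0.75
```

A DSC of 1 means the two delineations coincide exactly, and 0 means they share no voxel.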
Molecular taxonomy of the two Leishmania vectors Lutzomyia umbratilis and Lutzomyia anduzei (Diptera: Psychodidae) from the Brazilian Amazon

PubMed Central

2013-01-01

Background: Lutzomyia umbratilis (a probable species complex) is the main vector of Leishmania guyanensis in the northern region of Brazil. Lutzomyia anduzei has been implicated as a secondary vector of this parasite. These species are closely related and exhibit high morphological similarity in the adult stage; therefore, they have been wrongly identified, both in the past and in the present. This shows the need for employing integrated taxonomy. Methods: With the aim of gathering information on the molecular taxonomy and evolutionary relationships of these two vectors, 118 sequences of 663 base pairs (barcode region of the mitochondrial DNA cytochrome oxidase I – COI) were generated from 72 L. umbratilis and 46 L. anduzei individuals captured, respectively, in six and five localities of the Brazilian Amazon. The efficiency of the barcode region to differentiate the L. umbratilis lineages I and II was also evaluated. The data were analyzed using the pairwise genetic distances matrix and the Neighbor-Joining (NJ) tree, both based on the Kimura Two Parameter (K2P) evolutionary model. Results: The analyses resulted in 67 haplotypes: 32 for L. umbratilis and 35 for L. anduzei. The mean intra-specific genetic distance was 0.008 (0.002 to 0.010 for L. umbratilis; 0.008 to 0.014 for L. anduzei), whereas the mean interspecific genetic distance was 0.044 (0.041 to 0.046), supporting the barcoding gap. Between the L. umbratilis lineages I and II, it was 0.009 to 0.010. The NJ tree analysis strongly supported monophyletic clades for both L. umbratilis and L. anduzei, whereas the L. umbratilis lineages I and II formed two poorly supported monophyletic subclades. Conclusions: The barcode region clearly separated the two species and may therefore constitute a valuable tool in the identification of the sand fly vectors of Leishmania in endemic leishmaniasis areas. However, the barcode region did not have enough power to separate the two lineages of L. umbratilis, likely reflecting incipient species that have not yet reached the status of distinct species. PMID:24021095
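The pairwise distances in this abstract are based on the Kimura two-parameter model, whose distance is d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of sites showing transitions and transversions. The sketch below computes it for two aligned sequences. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made for illustration, and the study's software may treat such sites differently.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch). Very divergent pairs make the logarithm
    undefined, in which case math.log raises ValueError.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Toy pair: one transition among ten sites.
toy_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

Applying this to every pair of sequences gives the distance matrix from which a Neighbor-Joining tree can be built.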
Enhancing spatial resolution of (18)F positron imaging with the Timepix detector by classification of primary fired pixels using support vector machine.

PubMed

Wang, Qian; Liu, Zhen; Ziegler, Sibylle I; Shi, Kuangyu

2015-07-07

Position-sensitive positron cameras using silicon pixel detectors have been applied for some preclinical and intraoperative clinical applications. However, the spatial resolution of a positron camera is limited by positron multiple scattering in the detector. An incident positron may fire a number of successive pixels on the imaging plane. It is still impossible to capture the primary fired pixel along a particle trajectory by hardware or to perceive the pixel firing sequence by direct observation. Here, we propose a novel data-driven method to improve the spatial resolution by classifying the primary pixels within the detector using support vector machine. A classification model is constructed by learning the features of positron trajectories based on Monte-Carlo simulations using Geant4. Topological and energy features of pixels fired by (18)F positrons were considered for the training and classification. After applying the classification model on measurements, the primary fired pixels of the positron tracks in the silicon detector were estimated. The method was tested and assessed for [(18)F]FDG imaging of an absorbing edge protocol and a leaf sample. The proposed method improved the spatial resolution from 154.6 ± 4.2 µm (energy weighted centroid approximation) to 132.3 ± 3.5 µm in the absorbing edge measurements. For the positron imaging of a leaf sample, the proposed method achieved lower root mean square error relative to phosphor plate imaging, and higher similarity with the reference optical image. The improvements of the preliminary results support further investigation of the proposed algorithm for the enhancement of positron imaging in clinical and preclinical applications.
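The baseline mentioned in this abstract, the energy-weighted centroid approximation, estimates the entry position of a positron as the energy-weighted mean of all pixels it fired. A minimal sketch follows. The function name and the toy cluster are hypothetical, and coordinates are in pixel units. The SVM step that picks the primary fired pixel is not reproduced here.

```python
def energy_weighted_centroid(pixels):
    """Energy-weighted mean position of a cluster of fired pixels.

    `pixels` is an iterable of (x, y, energy) tuples, with x and y in
    pixel units and energy in any consistent unit.
    """
    pixels = list(pixels)
    total_energy = sum(energy for _, _, energy in pixels)
    if total_energy <= 0:
        raise ValueError("cluster has no deposited energy")
    cx = sum(x * energy for x, _, energy in pixels) / total_energy
    cy = sum(y * energy for _, y, energy in pixels) / total_energy
    return cx, cy


# Toy cluster: two neighbouring pixels, the first with three times the energy.
centroid = energy_weighted_centroid([(10, 20, 30.0), (11, 20, 10.0)])
# centroid == (10.25, 20.0)
```

The centroid is pulled toward wherever the track deposits most energy, which need not be where the positron entered. That is the motivation for classifying the primary pixel instead.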
Enhancing spatial resolution of 18F positron imaging with the Timepix detector by classification of primary fired pixels using support vector machine

NASA Astrophysics Data System (ADS)

Wang, Qian; Liu, Zhen; Ziegler, Sibylle I.; Shi, Kuangyu

2015-07-01

Position-sensitive positron cameras using silicon pixel detectors have been applied for some preclinical and intraoperative clinical applications. However, the spatial resolution of a positron camera is limited by positron multiple scattering in the detector. An incident positron may fire a number of successive pixels on the imaging plane. It is still impossible to capture the primary fired pixel along a particle trajectory by hardware or to perceive the pixel firing sequence by direct observation. Here, we propose a novel data-driven method to improve the spatial resolution by classifying the primary pixels within the detector using support vector machine. A classification model is constructed by learning the features of positron trajectories based on Monte-Carlo simulations using Geant4. Topological and energy features of pixels fired by 18F positrons were considered for the training and classification. After applying the classification model on measurements, the primary fired pixels of the positron tracks in the silicon detector were estimated. The method was tested and assessed for [18F]FDG imaging of an absorbing edge protocol and a leaf sample. The proposed method improved the spatial resolution from 154.6 ± 4.2 µm (energy weighted centroid approximation) to 132.3 ± 3.5 µm in the absorbing edge measurements. For the positron imaging of a leaf sample, the proposed method achieved lower root mean square error relative to phosphor plate imaging, and higher similarity with the reference optical image. The improvements of the preliminary results support further investigation of the proposed algorithm for the enhancement of positron imaging in clinical and preclinical applications.
A support vector machine approach for classification of welding defects from ultrasonic signals

NASA Astrophysics Data System (ADS)

Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming

2014-07-01

Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energies of the wavelet coefficients in the different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
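The feature extraction step in this abstract (wavelet packet decomposition followed by per-channel energies) can be illustrated with a small pure-Python sketch. It assumes the Haar wavelet for simplicity, since the abstract does not name the wavelet used, and it returns the terminal nodes in natural (not frequency) order. The function names are hypothetical. The bees algorithm and the SVM stages are not shown.

```python
import math


def haar_split(signal):
    """One orthonormal Haar step: return (approximation, detail) halves."""
    root2 = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Energies of the 2**levels terminal nodes of a Haar wavelet packet tree.

    Unlike the plain wavelet transform, both the approximation and the
    detail branch are split again at every level.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_split(node))
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]


# Toy echo signal with eight samples, decomposed to two levels (four channels).
energy_features = wavelet_packet_energies([1, 2, 3, 4, 5, 6, 7, 8], levels=2)
```

Because the Haar step is orthonormal, the channel energies sum to the energy of the original signal, so the feature vector describes how that energy is distributed across channels.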
Support vector machine based decision for mechanical fault condition monitoring in induction motor using an advanced Hilbert-Park transform.

PubMed

Ben Salem, Samira; Bacha, Khmais; Chaari, Abdelkader

2012-09-01

In this work we suggest an original fault signature based on an improved combination of Hilbert and Park transforms. Starting from this combination we can create two fault signatures: Hilbert modulus current space vector (HMCSV) and Hilbert phase current space vector (HPCSV). These two fault signatures are subsequently analysed using the classical fast Fourier transform (FFT). The effects of mechanical faults on the HMCSV and HPCSV spectrums are described, and the related frequencies are determined. The magnitudes of spectral components, relative to the studied faults (air-gap eccentricity and outer raceway ball bearing defect), are extracted in order to develop the input vector necessary for learning and testing the support vector machine with an aim of classifying automatically the various states of the induction motor. Copyright © 2012 ISA. Published by Elsevier Ltd. All rights reserved.
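One ingredient of the signatures described in this abstract is the Park transform, which maps the three stator phase currents to a two-component current space vector. The sketch below shows only this classical Park vector and its modulus, using the common power-invariant scaling. The Hilbert transform step and the authors' specific combination (HMCSV and HPCSV) are not reproduced, and the function names are hypothetical.

```python
import math


def park_vector(ia, ib, ic):
    """Park (d, q) components of three instantaneous phase currents."""
    i_d = math.sqrt(2.0 / 3.0) * ia - ib / math.sqrt(6.0) - ic / math.sqrt(6.0)
    i_q = (ib - ic) / math.sqrt(2.0)
    return i_d, i_q


def park_modulus(ia, ib, ic):
    """Modulus of the current space vector at one time instant."""
    i_d, i_q = park_vector(ia, ib, ic)
    return math.hypot(i_d, i_q)


# Balanced three-phase currents of unit amplitude sampled over one period.
samples = []
for k in range(12):
    theta = 2.0 * math.pi * k / 12
    ia = math.cos(theta)
    ib = math.cos(theta - 2.0 * math.pi / 3.0)
    ic = math.cos(theta + 2.0 * math.pi / 3.0)
    samples.append(park_modulus(ia, ib, ic))
```

For a healthy, balanced machine the modulus stays constant at sqrt(3/2) times the phase amplitude. Faults modulate it, and those modulations appear as extra components when the modulus is analysed with an FFT.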
Evaluation of different classification methods for the diagnosis of schizophrenia based on functional near-infrared spectroscopy.

PubMed

Li, Zhaohua; Wang, Yuduo; Quan, Wenxiang; Wu, Tongning; Lv, Bin

2015-02-15

Based on near-infrared spectroscopy (NIRS), recent converging evidence has been observed that patients with schizophrenia exhibit abnormal functional activities in the prefrontal cortex during a verbal fluency task (VFT). Therefore, some studies have attempted to employ NIRS measurements to differentiate schizophrenia patients from healthy controls with different classification methods. However, no systematic evaluation was conducted to compare their respective classification performances on the same study population. In this study, we evaluated the classification performance of four classification methods (including linear discriminant analysis, k-nearest neighbors, Gaussian process classifier, and support vector machines) on an NIRS-aided schizophrenia diagnosis. We recruited a large sample of 120 schizophrenia patients and 120 healthy controls and measured the hemoglobin response in the prefrontal cortex during the VFT using a multichannel NIRS system. Features for classification were extracted from three types of NIRS data in each channel. We subsequently performed a principal component analysis (PCA) for feature selection prior to comparison of the different classification methods. We achieved a maximum accuracy of 85.83% and an overall mean accuracy of 83.37% using a PCA-based feature selection on oxygenated hemoglobin signals and support vector machine classifier. This is the first comprehensive evaluation of different classification methods for the diagnosis of schizophrenia based on different types of NIRS signals. Our results suggested that, using the appropriate classification method, NIRS has the potential capacity to be an effective objective biomarker for the diagnosis of schizophrenia. Copyright © 2014 Elsevier B.V. All rights reserved.
BEST: Improved Prediction of B-Cell Epitopes from Antigen Sequences

PubMed Central

Gao, Jianzhao; Faraggi, Eshel; Zhou, Yaoqi; Ruan, Jishou; Kurgan, Lukasz

2012-01-01

Accurate identification of immunogenic regions in a given antigen chain is a difficult and actively pursued problem. Although accurate predictors for T-cell epitopes are already in place, the prediction of the B-cell epitopes requires further research. We overview the available approaches for the prediction of B-cell epitopes and propose a novel and accurate sequence-based solution. Our BEST (B-cell Epitope prediction using Support vector machine Tool) method predicts epitopes from antigen sequences, in contrast to some methods that predict only from short sequence fragments, using a new architecture based on averaging selected scores generated from sliding 20-mers by a Support Vector Machine (SVM). The SVM predictor utilizes a comprehensive and custom designed set of inputs generated by combining information derived from the chain, sequence conservation, similarity to known (training) epitopes, and predicted secondary structure and relative solvent accessibility. Empirical evaluation on benchmark datasets demonstrates that BEST outperforms several modern sequence-based B-cell epitope predictors including ABCPred, the method by Chen et al. (2007), BCPred, COBEpro, BayesB, and CBTOPE, when considering the predictions from antigen chains and from the chain fragments. Our method obtains a cross-validated area under the receiver operating characteristic curve (AUC) for the fragment-based prediction at 0.81 and 0.85, depending on the dataset. The AUCs of BEST on the benchmark sets of full antigen chains equal 0.57 and 0.6, which is significantly and slightly better than the next best method we tested. We also present case studies to contrast the propensity profiles generated by BEST and several other methods. PMID:22761950
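The sliding-window architecture in this abstract can be sketched as follows: score every window of the chain, then give each residue the mean score of the windows that cover it. This is only an illustration of the averaging idea. The abstract says that selected scores are averaged without giving the selection rule, so this sketch averages all covering windows, and the `score_window` callback stands in for the trained SVM. Both are assumptions.

```python
def per_residue_scores(sequence, score_window, width=20):
    """Average window scores into one propensity value per residue.

    `score_window` is any callable mapping a window string to a float.
    Here it is a placeholder for an SVM decision value.
    """
    n = len(sequence)
    if n < width:
        raise ValueError("sequence is shorter than the window width")
    sums = [0.0] * n
    counts = [0] * n
    for start in range(n - width + 1):
        score = score_window(sequence[start:start + width])
        for i in range(start, start + width):
            sums[i] += score
            counts[i] += 1
    return [total / count for total, count in zip(sums, counts)]


# Toy scorer (not an SVM): the number of 'A' residues in the window.
profile = per_residue_scores("ABCDE", lambda w: float(w.count("A")), width=3)
```

Residues near the ends of the chain are covered by fewer windows, so their averages rest on fewer scores than those in the middle.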
Support vector machine-based facial-expression recognition method combining shape and appearance

NASA Astrophysics Data System (ADS)

Han, Eun Jung; Kang, Byung Jun; Park, Kang Ryoung; Lee, Sangyoun

2010-11-01

Facial expression recognition can be widely used for various applications, such as emotion-based human-machine interaction, intelligent robot interfaces, face recognition robust to expression variation, etc. Previous studies have been classified as either shape- or appearance-based recognition. The shape-based method has the disadvantage that the positions of facial feature points vary between individuals even for similar expressions, which can reduce the recognition accuracy. The appearance-based method has a limitation in that the textural information of the face is very sensitive to variations in illumination. To overcome these problems, a new facial-expression recognition method is proposed, which combines both shape and appearance information, based on the support vector machine (SVM). This research is novel in the following three ways as compared to previous works. First, the facial feature points are automatically detected by using an active appearance model. From these, the shape-based recognition is performed by using the ratios between the facial feature points based on the facial-action coding system. Second, the SVM, which is trained to recognize the same and different expression classes, is proposed to combine two matching scores obtained from the shape- and appearance-based recognitions. Finally, a single SVM is trained to discriminate four different expressions, such as neutral, a smile, anger, and a scream. By determining the expression of the input facial image whose SVM output is at a minimum, the accuracy of the expression recognition is much enhanced. The experimental results showed that the recognition accuracy of the proposed method was better than that of previous research and other fusion methods.
A hybrid least squares support vector machines and GMDH approach for river flow forecasting

NASA Astrophysics Data System (ADS)

Samsudin, R.; Saad, P.; Shabri, A.

2010-06-01

This paper proposes a novel hybrid forecasting model, known as GLSSVM, which combines the group method of data handling (GMDH) and the least squares support vector machine (LSSVM). The GMDH is used to determine the useful input variables for the LSSVM model, and the LSSVM model then performs the time series forecasting. In this study, the application of GLSSVM to monthly river flow forecasting for the Selangor and Bernam Rivers is investigated. The results of the proposed GLSSVM approach are compared with those of conventional artificial neural network (ANN) models, the Autoregressive Integrated Moving Average (ARIMA) model, and the GMDH and LSSVM models using long-term observations of monthly river flow discharge. Standard statistical measures, the root mean square error (RMSE) and the coefficient of correlation (R), are employed to evaluate the performance of the various models developed. The experimental results indicate that the hybrid model is a powerful tool for modeling discharge time series and can be applied successfully in complex hydrological modeling.
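The two evaluation measures named in this abstract, RMSE and the correlation coefficient R, are standard and can be computed as in the sketch below. The function names and the toy flow values are illustrative and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def correlation(observed, predicted):
    """Pearson correlation coefficient R between two equal-length series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Toy monthly flows (made-up numbers): the forecast is biased high by 1 unit.
observed_flow = [1.0, 2.0, 3.0, 4.0]
forecast_flow = [2.0, 3.0, 4.0, 5.0]
error = rmse(observed_flow, forecast_flow)        # 1.0
r = correlation(observed_flow, forecast_flow)     # 1.0
```

The toy case shows why both measures are reported together: a forecast with a constant bias still has R = 1, and only the RMSE reveals the offset.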
A Feature Fusion Based Forecasting Model for Financial Time Series

PubMed Central

Guo, Zhiqiang; Wang, Huaiqing; Liu, Quan; Yang, Jie

2014-01-01

Predicting the stock market has become an increasingly interesting research area for both researchers and investors, and many prediction models have been proposed. In these models, feature selection techniques are used to pre-process the raw data and remove noise. In this paper, a prediction model is constructed to forecast stock market behavior with the aid of independent component analysis, canonical correlation analysis, and a support vector machine. First, two types of features are extracted from the historical closing prices and 39 technical variables obtained by independent component analysis. Second, a canonical correlation analysis method is utilized to combine the two types of features and extract intrinsic features to improve the performance of the prediction model. Finally, a support vector machine is applied to forecast the next day's closing price. The proposed model is applied to the Shanghai stock market index and the Dow Jones index, and experimental results show that the proposed model predicts better than two other similar models. PMID:24971455
Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine.

PubMed

Yan, Jian-Jun; Guo, Rui; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing

2014-01-01

This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification of the patterns of syndromes in Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of the model was 79.49%; when male and female auscultation signals were trained separately to decide the patterns, the overall recognition rate of the model reached 86.05%. The results showed that the methods proposed in this paper were effective for analyzing auscultation signals, and that the performance of the model can be greatly improved when gender is taken into account.
Active relearning for robust supervised classification of pulmonary emphysema

NASA Astrophysics Data System (ADS)

Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.

2012-03-01

Radiologists are adept at recognizing the appearance of lung parenchymal abnormalities in CT scans. However, the inconsistent differential diagnosis, due to subjective aggregation, mandates supervised classification. Towards optimizing emphysema classification, we introduce a physician-in-the-loop feedback approach in order to minimize uncertainty in the selected training samples. Using multi-view inductive learning with the training samples, an ensemble of Support Vector Machine (SVM) models, each based on a specific pair-wise dissimilarity metric, was constructed in less than six seconds. In the active relearning phase, the ensemble-expert label conflicts were resolved by an expert. This just-in-time feedback with unoptimized SVMs yielded a 15% increase in classification accuracy and a 25% reduction in the number of support vectors. The generality of relearning was assessed in the optimized parameter space of six different classifiers across seven dissimilarity metrics. The resultant average accuracy improved to 21%. The co-operative feedback method proposed here could enhance both diagnostic and staging throughput efficiency in chest radiology practice.
Predicting pork loin intramuscular fat using computer vision system.

PubMed

Liu, J-H; Sun, X; Young, J M; Bachmeier, L A; Newman, D J

2018-09-01

The objective of this study was to investigate the ability of a computer vision system to predict pork intramuscular fat percentage (IMF%). Center-cut loin samples (n = 85) were trimmed of subcutaneous fat and connective tissue. Images were acquired and pixels were segregated to estimate image IMF% and 18 image color features for each image. Subjective IMF% was determined by a trained grader. Ether extract IMF% was calculated using the ether extract method. Image color features and image IMF% were used as predictors for stepwise regression and support vector machine models. Results showed that subjective IMF% had a correlation of 0.81 with ether extract IMF% while the image IMF% had a 0.66 correlation with ether extract IMF%. Accuracy rates for regression models were 0.63 for stepwise and 0.75 for support vector machine. Although subjective IMF% was shown to predict better, the results from the computer vision system demonstrate its potential as a tool for predicting pork IMF% in the future. Copyright © 2018 Elsevier Ltd. All rights reserved.
Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine

PubMed Central

Yan, Jian-Jun; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing

2014-01-01

This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification of the patterns of syndromes in Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of the model was 79.49%; when male and female auscultation signals were trained separately to decide the patterns, the overall recognition rate of the model reached 86.05%. The results showed that the methods proposed in this paper were effective for analyzing auscultation signals, and that the performance of the model can be greatly improved when gender is taken into account. PMID:24883068
