Rough sets and Laplacian score based cost-sensitive feature selection
Yu, Shenglong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884
Rough sets and Laplacian score based cost-sensitive feature selection.
Yu, Shenglong; Zhao, Hong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.
Zhao, Yu-Xiang; Chou, Chien-Hsing
2016-01-01
In this study, a new feature selection algorithm, the neighborhood-relationship feature selection (NRFS) algorithm, is proposed for identifying rat electroencephalogram signals and recognizing Chinese characters. In these two applications, dependent relationships exist among the feature vectors and their neighboring feature vectors. Therefore, the proposed NRFS algorithm was designed for solving this problem. By applying the NRFS algorithm, unselected feature vectors have a high priority of being added into the feature subset if the neighboring feature vectors have been selected. In addition, selected feature vectors have a high priority of being eliminated if the neighboring feature vectors are not selected. In the experiments conducted in this study, the NRFS algorithm was compared with two feature algorithms. The experimental results indicated that the NRFS algorithm can extract the crucial frequency bands for identifying rat vigilance states and identifying crucial character regions for recognizing Chinese characters. PMID:27314346
Kim, Hyungjin; Park, Chang Min; Lee, Myunghee; Park, Sang Joon; Song, Yong Sub; Lee, Jong Hyuk; Hwang, Eui Jin; Goo, Jin Mo
2016-01-01
To identify the impact of reconstruction algorithms on CT radiomic features of pulmonary tumors and to reveal and compare the intra- and inter-reader and inter-reconstruction algorithm variability of each feature. Forty-two patients (M:F = 19:23; mean age, 60.43±10.56 years) with 42 pulmonary tumors (22.56±8.51mm) underwent contrast-enhanced CT scans, which were reconstructed with filtered back projection and commercial iterative reconstruction algorithm (level 3 and 5). Two readers independently segmented the whole tumor volume. Fifteen radiomic features were extracted and compared among reconstruction algorithms. Intra- and inter-reader variability and inter-reconstruction algorithm variability were calculated using coefficients of variation (CVs) and then compared. Among the 15 features, 5 first-order tumor intensity features and 4 gray level co-occurrence matrix (GLCM)-based features showed significant differences (p<0.05) among reconstruction algorithms. As for the variability, effective diameter, sphericity, entropy, and GLCM entropy were the most robust features (CV≤5%). Inter-reader variability was larger than intra-reader or inter-reconstruction algorithm variability in 9 features. However, for entropy, homogeneity, and 4 GLCM-based features, inter-reconstruction algorithm variability was significantly greater than inter-reader variability (p<0.013). Most of the radiomic features were significantly affected by the reconstruction algorithms. Inter-reconstruction algorithm variability was greater than inter-reader variability for entropy, homogeneity, and GLCM-based features.
A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.
Ni, Qianwu; Chen, Lei
2017-01-01
Correct prediction of protein structural class is beneficial to investigation on protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, based on various features, it is still a great challenge to select proper classification algorithm and extract essential features to participate in classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. The amino acid compositions and physiochemical features were adopted to represent features and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features in the list one by one. For each feature set, thirtyeight algorithms were executed on a dataset, in which proteins were represented by features in the set. The predicted classes yielded by these algorithms and true class of each protein were collected to construct a dataset, which were analyzed by mRMR method, yielding an algorithm list. From the algorithm list, the algorithm was taken one by one to build an ensemble prediction model. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using single algorithm and other models that only adopt feature selection procedure or algorithm selection procedure. The feature selection procedure or algorithm selection procedure are really helpful for building an ensemble prediction model that can yield a better performance. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
A study of metaheuristic algorithms for high dimensional feature selection on microarray data
NASA Astrophysics Data System (ADS)
Dankolo, Muhammad Nasiru; Radzi, Nor Haizan Mohamed; Sallehuddin, Roselina; Mustaffa, Noorfa Haszlinna
2017-11-01
Microarray systems enable experts to examine gene profile at molecular level using machine learning algorithms. It increases the potentials of classification and diagnosis of many diseases at gene expression level. Though, numerous difficulties may affect the efficiency of machine learning algorithms which includes vast number of genes features comprised in the original data. Many of these features may be unrelated to the intended analysis. Therefore, feature selection is necessary to be performed in the data pre-processing. Many feature selection algorithms are developed and applied on microarray which including the metaheuristic optimization algorithms. This paper discusses the application of the metaheuristics algorithms for feature selection in microarray dataset. This study reveals that, the algorithms have yield an interesting result with limited resources thereby saving computational expenses of machine learning algorithms.
Automated Recognition of 3D Features in GPIR Images
NASA Technical Reports Server (NTRS)
Park, Han; Stough, Timothy; Fijany, Amir
2007-01-01
A method of automated recognition of three-dimensional (3D) features in images generated by ground-penetrating imaging radar (GPIR) is undergoing development. GPIR 3D images can be analyzed to detect and identify such subsurface features as pipes and other utility conduits. Until now, much of the analysis of GPIR images has been performed manually by expert operators who must visually identify and track each feature. The present method is intended to satisfy a need for more efficient and accurate analysis by means of algorithms that can automatically identify and track subsurface features, with minimal supervision by human operators. In this method, data from multiple sources (for example, data on different features extracted by different algorithms) are fused together for identifying subsurface objects. The algorithms of this method can be classified in several different ways. In one classification, the algorithms fall into three classes: (1) image-processing algorithms, (2) feature- extraction algorithms, and (3) a multiaxis data-fusion/pattern-recognition algorithm that includes a combination of machine-learning, pattern-recognition, and object-linking algorithms. The image-processing class includes preprocessing algorithms for reducing noise and enhancing target features for pattern recognition. The feature-extraction algorithms operate on preprocessed data to extract such specific features in images as two-dimensional (2D) slices of a pipe. Then the multiaxis data-fusion/ pattern-recognition algorithm identifies, classifies, and reconstructs 3D objects from the extracted features. In this process, multiple 2D features extracted by use of different algorithms and representing views along different directions are used to identify and reconstruct 3D objects. In object linking, which is an essential part of this process, features identified in successive 2D slices and located within a threshold radius of identical features in adjacent slices are linked in a directed-graph data structure. Relative to past approaches, this multiaxis approach offers the advantages of more reliable detections, better discrimination of objects, and provision of redundant information, which can be helpful in filling gaps in feature recognition by one of the component algorithms. The image-processing class also includes postprocessing algorithms that enhance identified features to prepare them for further scrutiny by human analysts (see figure). Enhancement of images as a postprocessing step is a significant departure from traditional practice, in which enhancement of images is a preprocessing step.
NASA Astrophysics Data System (ADS)
Hayana Hasibuan, Eka; Mawengkang, Herman; Efendi, Syahril
2017-12-01
The use of Partical Swarm Optimization Algorithm in this research is to optimize the feature weights on the Voting Feature Interval 5 algorithm so that we can find the model of using PSO algorithm with VFI 5. Optimization of feature weight on Diabetes or Dyspesia data is considered important because it is very closely related to the livelihood of many people, so if there is any inaccuracy in determining the most dominant feature weight in the data will cause death. Increased accuracy by using PSO Algorithm ie fold 1 from 92.31% to 96.15% increase accuracy of 3.8%, accuracy of fold 2 on Algorithm VFI5 of 92.52% as well as generated on PSO Algorithm means accuracy fixed, then in fold 3 increase accuracy of 85.19% Increased to 96.29% Accuracy increased by 11%. The total accuracy of all three trials increased by 14%. In general the Partical Swarm Optimization algorithm has succeeded in increasing the accuracy to several fold, therefore it can be concluded the PSO algorithm is well used in optimizing the VFI5 Classification Algorithm.
Improved classification accuracy by feature extraction using genetic algorithms
NASA Astrophysics Data System (ADS)
Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.
2003-05-01
A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.
McTwo: a two-step feature selection algorithm based on maximal information coefficient.
Ge, Ruiquan; Zhou, Manli; Luo, Youxi; Meng, Qinghan; Mai, Guoqin; Ma, Dongli; Wang, Guoqing; Zhou, Fengfeng
2016-03-23
High-throughput bio-OMIC technologies are producing high-dimension data from bio-samples at an ever increasing rate, whereas the training sample number in a traditional experiment remains small due to various difficulties. This "large p, small n" paradigm in the area of biomedical "big data" may be at least partly solved by feature selection algorithms, which select only features significantly associated with phenotypes. Feature selection is an NP-hard problem. Due to the exponentially increased time requirement for finding the globally optimal solution, all the existing feature selection algorithms employ heuristic rules to find locally optimal solutions, and their solutions achieve different performances on different datasets. This work describes a feature selection algorithm based on a recently published correlation measurement, Maximal Information Coefficient (MIC). The proposed algorithm, McTwo, aims to select features associated with phenotypes, independently of each other, and achieving high classification performance of the nearest neighbor algorithm. Based on the comparative study of 17 datasets, McTwo performs about as well as or better than existing algorithms, with significantly reduced numbers of selected features. The features selected by McTwo also appear to have particular biomedical relevance to the phenotypes from the literature. McTwo selects a feature subset with very good classification performance, as well as a small feature number. So McTwo may represent a complementary feature selection algorithm for the high-dimensional biomedical datasets.
Hyperspectral feature mapping classification based on mathematical morphology
NASA Astrophysics Data System (ADS)
Liu, Chang; Li, Junwei; Wang, Guangping; Wu, Jingli
2016-03-01
This paper proposed a hyperspectral feature mapping classification algorithm based on mathematical morphology. Without the priori information such as spectral library etc., the spectral and spatial information can be used to realize the hyperspectral feature mapping classification. The mathematical morphological erosion and dilation operations are performed respectively to extract endmembers. The spectral feature mapping algorithm is used to carry on hyperspectral image classification. The hyperspectral image collected by AVIRIS is applied to evaluate the proposed algorithm. The proposed algorithm is compared with minimum Euclidean distance mapping algorithm, minimum Mahalanobis distance mapping algorithm, SAM algorithm and binary encoding mapping algorithm. From the results of the experiments, it is illuminated that the proposed algorithm's performance is better than that of the other algorithms under the same condition and has higher classification accuracy.
Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis
NASA Astrophysics Data System (ADS)
Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song
2018-01-01
To resolve the problem of slow computation speed and low matching accuracy in image registration, a new image registration algorithm based on parallax constraint and clustering analysis is proposed. Firstly, Harris corner detection algorithm is used to extract the feature points of two images. Secondly, use Normalized Cross Correlation (NCC) function to perform the approximate matching of feature points, and the initial feature pair is obtained. Then, according to the parallax constraint condition, the initial feature pair is preprocessed by K-means clustering algorithm, which is used to remove the feature point pairs with obvious errors in the approximate matching process. Finally, adopt Random Sample Consensus (RANSAC) algorithm to optimize the feature points to obtain the final feature point matching result, and the fast and accurate image registration is realized. The experimental results show that the image registration algorithm proposed in this paper can improve the accuracy of the image matching while ensuring the real-time performance of the algorithm.
Lin, Fan; Xiao, Bin
2017-01-01
Based on the traditional Fast Retina Keypoint (FREAK) feature description algorithm, this paper proposed a Gravity-FREAK feature description algorithm based on Micro-electromechanical Systems (MEMS) sensor to overcome the limited computing performance and memory resources of mobile devices and further improve the reality interaction experience of clients through digital information added to the real world by augmented reality technology. The algorithm takes the gravity projection vector corresponding to the feature point as its feature orientation, which saved the time of calculating the neighborhood gray gradient of each feature point, reduced the cost of calculation and improved the accuracy of feature extraction. In the case of registration method of matching and tracking natural features, the adaptive and generic corner detection based on the Gravity-FREAK matching purification algorithm was used to eliminate abnormal matches, and Gravity Kaneda-Lucas Tracking (KLT) algorithm based on MEMS sensor can be used for the tracking registration of the targets and robustness improvement of tracking registration algorithm under mobile environment. PMID:29088228
Hong, Zhiling; Lin, Fan; Xiao, Bin
2017-01-01
Based on the traditional Fast Retina Keypoint (FREAK) feature description algorithm, this paper proposed a Gravity-FREAK feature description algorithm based on Micro-electromechanical Systems (MEMS) sensor to overcome the limited computing performance and memory resources of mobile devices and further improve the reality interaction experience of clients through digital information added to the real world by augmented reality technology. The algorithm takes the gravity projection vector corresponding to the feature point as its feature orientation, which saved the time of calculating the neighborhood gray gradient of each feature point, reduced the cost of calculation and improved the accuracy of feature extraction. In the case of registration method of matching and tracking natural features, the adaptive and generic corner detection based on the Gravity-FREAK matching purification algorithm was used to eliminate abnormal matches, and Gravity Kaneda-Lucas Tracking (KLT) algorithm based on MEMS sensor can be used for the tracking registration of the targets and robustness improvement of tracking registration algorithm under mobile environment.
Effective traffic features selection algorithm for cyber-attacks samples
NASA Astrophysics Data System (ADS)
Li, Yihong; Liu, Fangzheng; Du, Zhenyu
2018-05-01
By studying the defense scheme of Network attacks, this paper propose an effective traffic features selection algorithm based on k-means++ clustering to deal with the problem of high dimensionality of traffic features which extracted from cyber-attacks samples. Firstly, this algorithm divide the original feature set into attack traffic feature set and background traffic feature set by the clustering. Then, we calculates the variation of clustering performance after removing a certain feature. Finally, evaluating the degree of distinctiveness of the feature vector according to the result. Among them, the effective feature vector is whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select out the effective features from the extracted original feature set. In this way, it can reduce the dimensionality of the features so as to reduce the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and it has some advantages over other selection algorithms.
Jeyasingh, Suganthi; Veluchamy, Malathi
2017-05-01
Early diagnosis of breast cancer is essential to save lives of patients. Usually, medical datasets include a large variety of data that can lead to confusion during diagnosis. The Knowledge Discovery on Database (KDD) process helps to improve efficiency. It requires elimination of inappropriate and repeated data from the dataset before final diagnosis. This can be done using any of the feature selection algorithms available in data mining. Feature selection is considered as a vital step to increase the classification accuracy. This paper proposes a Modified Bat Algorithm (MBA) for feature selection to eliminate irrelevant features from an original dataset. The Bat algorithm was modified using simple random sampling to select the random instances from the dataset. Ranking was with the global best features to recognize the predominant features available in the dataset. The selected features are used to train a Random Forest (RF) classification algorithm. The MBA feature selection algorithm enhanced the classification accuracy of RF in identifying the occurrence of breast cancer. The Wisconsin Diagnosis Breast Cancer Dataset (WDBC) was used for estimating the performance analysis of the proposed MBA feature selection algorithm. The proposed algorithm achieved better performance in terms of Kappa statistic, Mathew’s Correlation Coefficient, Precision, F-measure, Recall, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE). Creative Commons Attribution License
Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis
NASA Astrophysics Data System (ADS)
Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan
2017-10-01
This research paper aims to propose a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as feature selections for selecting and choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper will also discuss the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. This study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using the parametric statistical significance tests. The evaluation process has statistically proven that this ACO-KNN algorithm has been significantly improved compared to the baseline algorithms. The evaluation process has statistically proven that this ACO-KNN algorithm has been significantly improved compared to the baseline algorithms. In addition, the experimental results have proven that the ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain quality, optimal feature subset that can represent the actual data in customer review data.
Chen, Qiang; Chen, Yunhao; Jiang, Weiguo
2016-07-30
In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm.
Online Feature Transformation Learning for Cross-Domain Object Category Recognition.
Zhang, Xuesong; Zhuang, Yan; Wang, Wei; Pedrycz, Witold
2017-06-09
In this paper, we introduce a new research problem termed online feature transformation learning in the context of multiclass object category recognition. The learning of a feature transformation is viewed as learning a global similarity metric function in an online manner. We first consider the problem of online learning a feature transformation matrix expressed in the original feature space and propose an online passive aggressive feature transformation algorithm. Then these original features are mapped to kernel space and an online single kernel feature transformation (OSKFT) algorithm is developed to learn a nonlinear feature transformation. Based on the OSKFT and the existing Hedge algorithm, a novel online multiple kernel feature transformation algorithm is also proposed, which can further improve the performance of online feature transformation learning in large-scale application. The classifier is trained with k nearest neighbor algorithm together with the learned similarity metric function. Finally, we experimentally examined the effect of setting different parameter values in the proposed algorithms and evaluate the model performance on several multiclass object recognition data sets. The experimental results demonstrate the validity and good performance of our methods on cross-domain and multiclass object recognition application.
Automatic Image Registration of Multimodal Remotely Sensed Data with Global Shearlet Features
NASA Technical Reports Server (NTRS)
Murphy, James M.; Le Moigne, Jacqueline; Harding, David J.
2015-01-01
Automatic image registration is the process of aligning two or more images of approximately the same scene with minimal human assistance. Wavelet-based automatic registration methods are standard, but sometimes are not robust to the choice of initial conditions. That is, if the images to be registered are too far apart relative to the initial guess of the algorithm, the registration algorithm does not converge or has poor accuracy, and is thus not robust. These problems occur because wavelet techniques primarily identify isotropic textural features and are less effective at identifying linear and curvilinear edge features. We integrate the recently developed mathematical construction of shearlets, which is more effective at identifying sparse anisotropic edges, with an existing automatic wavelet-based registration algorithm. Our shearlet features algorithm produces more distinct features than wavelet features algorithms; the separation of edges from textures is even stronger than with wavelets. Our algorithm computes shearlet and wavelet features for the images to be registered, then performs least squares minimization on these features to compute a registration transformation. Our algorithm is two-staged and multiresolution in nature. First, a cascade of shearlet features is used to provide a robust, though approximate, registration. This is then refined by registering with a cascade of wavelet features. Experiments across a variety of image classes show an improved robustness to initial conditions, when compared to wavelet features alone.
Automatic Image Registration of Multi-Modal Remotely Sensed Data with Global Shearlet Features
Murphy, James M.; Le Moigne, Jacqueline; Harding, David J.
2017-01-01
Automatic image registration is the process of aligning two or more images of approximately the same scene with minimal human assistance. Wavelet-based automatic registration methods are standard, but sometimes are not robust to the choice of initial conditions. That is, if the images to be registered are too far apart relative to the initial guess of the algorithm, the registration algorithm does not converge or has poor accuracy, and is thus not robust. These problems occur because wavelet techniques primarily identify isotropic textural features and are less effective at identifying linear and curvilinear edge features. We integrate the recently developed mathematical construction of shearlets, which is more effective at identifying sparse anisotropic edges, with an existing automatic wavelet-based registration algorithm. Our shearlet features algorithm produces more distinct features than wavelet features algorithms; the separation of edges from textures is even stronger than with wavelets. Our algorithm computes shearlet and wavelet features for the images to be registered, then performs least squares minimization on these features to compute a registration transformation. Our algorithm is two-staged and multiresolution in nature. First, a cascade of shearlet features is used to provide a robust, though approximate, registration. This is then refined by registering with a cascade of wavelet features. Experiments across a variety of image classes show an improved robustness to initial conditions, when compared to wavelet features alone. PMID:29123329
The Speech multi features fusion perceptual hash algorithm based on tensor decomposition
NASA Astrophysics Data System (ADS)
Huang, Y. B.; Fan, M. H.; Zhang, Q. Y.
2018-03-01
With constant progress in modern speech communication technologies, the speech data is prone to be attacked by the noise or maliciously tampered. In order to make the speech perception hash algorithm has strong robustness and high efficiency, this paper put forward a speech perception hash algorithm based on the tensor decomposition and multi features is proposed. This algorithm analyses the speech perception feature acquires each speech component wavelet packet decomposition. LPCC, LSP and ISP feature of each speech component are extracted to constitute the speech feature tensor. Speech authentication is done by generating the hash values through feature matrix quantification which use mid-value. Experimental results showing that the proposed algorithm is robust for content to maintain operations compared with similar algorithms. It is able to resist the attack of the common background noise. Also, the algorithm is highly efficiency in terms of arithmetic, and is able to meet the real-time requirements of speech communication and complete the speech authentication quickly.
Low complexity feature extraction for classification of harmonic signals
NASA Astrophysics Data System (ADS)
William, Peter E.
In this dissertation, feature extraction algorithms have been developed for extraction of characteristic features from harmonic signals. The common theme for all developed algorithms is the simplicity in generating a significant set of features directly from the time domain harmonic signal. The features are a time domain representation of the composite, yet sparse, harmonic signature in the spectral domain. The algorithms are adequate for low-power unattended sensors which perform sensing, feature extraction, and classification in a standalone scenario. The first algorithm generates the characteristic features using only the duration between successive zero-crossing intervals. The second algorithm estimates the harmonics' amplitudes of the harmonic structure employing a simplified least squares method without the need to estimate the true harmonic parameters of the source signal. The third algorithm, resulting from a collaborative effort with Daniel White at the DSP Lab, University of Nebraska-Lincoln, presents an analog front end approach that utilizes a multichannel analog projection and integration to extract the sparse spectral features from the analog time domain signal. Classification is performed using a multilayer feedforward neural network. Evaluation of the proposed feature extraction algorithms for classification through the processing of several acoustic and vibration data sets (including military vehicles and rotating electric machines) with comparison to spectral features shows that, for harmonic signals, time domain features are simpler to extract and provide equivalent or improved reliability over the spectral features in both the detection probabilities and false alarm rate.
Chen, Qiang; Chen, Yunhao; Jiang, Weiguo
2016-01-01
In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm. PMID:27483285
Multi-task feature selection in microarray data by binary integer programming.
Lan, Liang; Vucetic, Slobodan
2013-12-20
A major challenge in microarray classification is that the number of features is typically orders of magnitude larger than the number of examples. In this paper, we propose a novel feature filter algorithm to select the feature subset with maximal discriminative power and minimal redundancy by solving a quadratic objective function with binary integer constraints. To improve the computational efficiency, the binary integer constraints are relaxed and a low-rank approximation to the quadratic term is applied. The proposed feature selection algorithm was extended to solve multi-task microarray classification problems. We compared the single-task version of the proposed feature selection algorithm with 9 existing feature selection methods on 4 benchmark microarray data sets. The empirical results show that the proposed method achieved the most accurate predictions overall. We also evaluated the multi-task version of the proposed algorithm on 8 multi-task microarray datasets. The multi-task feature selection algorithm resulted in significantly higher accuracy than when using the single-task feature selection methods.
MotieGhader, Habib; Gharaghani, Sajjad; Masoudi-Sobhanzadeh, Yosef; Masoudi-Nejad, Ali
2017-01-01
Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been solved using some meta-heuristic algorithms such as GA, PSO, ACO and so on. In this work two novel hybrid meta-heuristic algorithms i.e. Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on Genetic algorithm and learning automata for QSAR feature selection are proposed. SGALA algorithm uses advantages of Genetic algorithm and Learning Automata sequentially and the MGALA algorithm uses advantages of Genetic Algorithm and Learning Automata simultaneously. We applied our proposed algorithms to select the minimum possible number of features from three different datasets and also we observed that the MGALA and SGALA algorithms had the best outcome independently and in average compared to other feature selection algorithms. Through comparison of our proposed algorithms, we deduced that the rate of convergence to optimal result in MGALA and SGALA algorithms were better than the rate of GA, ACO, PSO and LA algorithms. In the end, the results of GA, ACO, PSO, LA, SGALA, and MGALA algorithms were applied as the input of LS-SVR model and the results from LS-SVR models showed that the LS-SVR model had more predictive ability with the input from SGALA and MGALA algorithms than the input from all other mentioned algorithms. Therefore, the results have corroborated that not only is the predictive efficiency of proposed algorithms better, but their rate of convergence is also superior to the all other mentioned algorithms. PMID:28979308
MotieGhader, Habib; Gharaghani, Sajjad; Masoudi-Sobhanzadeh, Yosef; Masoudi-Nejad, Ali
2017-01-01
Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been solved using some meta-heuristic algorithms such as GA, PSO, ACO and so on. In this work two novel hybrid meta-heuristic algorithms i.e. Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on Genetic algorithm and learning automata for QSAR feature selection are proposed. SGALA algorithm uses advantages of Genetic algorithm and Learning Automata sequentially and the MGALA algorithm uses advantages of Genetic Algorithm and Learning Automata simultaneously. We applied our proposed algorithms to select the minimum possible number of features from three different datasets and also we observed that the MGALA and SGALA algorithms had the best outcome independently and in average compared to other feature selection algorithms. Through comparison of our proposed algorithms, we deduced that the rate of convergence to optimal result in MGALA and SGALA algorithms were better than the rate of GA, ACO, PSO and LA algorithms. In the end, the results of GA, ACO, PSO, LA, SGALA, and MGALA algorithms were applied as the input of LS-SVR model and the results from LS-SVR models showed that the LS-SVR model had more predictive ability with the input from SGALA and MGALA algorithms than the input from all other mentioned algorithms. Therefore, the results have corroborated that not only is the predictive efficiency of proposed algorithms better, but their rate of convergence is also superior to the all other mentioned algorithms.
Game theory-based visual tracking approach focusing on color and texture features.
Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Chen, Chuanhua; Wang, Xin
2017-07-20
It is difficult for a single-feature tracking algorithm to achieve strong robustness under a complex environment. To solve this problem, we proposed a multifeature fusion tracking algorithm that is based on game theory. By focusing on color and texture features as two gamers, this algorithm accomplishes tracking by using a mean shift iterative formula to search for the Nash equilibrium of the game. The contribution of different features is always keeping the state of optical balance, so that the algorithm can fully take advantage of feature fusion. According to the experiment results, this algorithm proves to possess good performance, especially under the condition of scene variation, target occlusion, and similar interference.
Face recognition algorithm using extended vector quantization histogram features.
Yan, Yan; Lee, Feifei; Wu, Xueqian; Chen, Qiu
2018-01-01
In this paper, we propose a face recognition algorithm based on a combination of vector quantization (VQ) and Markov stationary features (MSF). The VQ algorithm has been shown to be an effective method for generating features; it extracts a codevector histogram as a facial feature representation for face recognition. Still, the VQ histogram features are unable to convey spatial structural information, which to some extent limits their usefulness in discrimination. To alleviate this limitation of VQ histograms, we utilize Markov stationary features (MSF) to extend the VQ histogram-based features so as to add spatial structural information. We demonstrate the effectiveness of our proposed algorithm by achieving recognition results superior to those of several state-of-the-art methods on publicly available face databases.
Multiple feature fusion via covariance matrix for visual tracking
NASA Astrophysics Data System (ADS)
Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Wang, Xin; Sun, Hui
2018-04-01
Aiming at the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on covariance matrix is proposed to improve the robustness of the tracking algorithm. In the frame-work of quantum genetic algorithm, this paper uses the region covariance descriptor to fuse the color, edge and texture features. It also uses a fast covariance intersection algorithm to update the model. The low dimension of region covariance descriptor, the fast convergence speed and strong global optimization ability of quantum genetic algorithm, and the fast computation of fast covariance intersection algorithm are used to improve the computational efficiency of fusion, matching, and updating process, so that the algorithm achieves a fast and effective multi-feature fusion tracking. The experiments prove that the proposed algorithm can not only achieve fast and robust tracking but also effectively handle interference of occlusion, rotation, deformation, motion blur and so on.
An Extended Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Astrophysics Data System (ADS)
Akbari, D.
2017-11-01
In this paper an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction including decision boundary feature extraction (DBFE), discriminate analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and watershed segmentation algorithm. To evaluate the proposed approach, the Pavia University hyperspectral data is tested. Experimental results show that the proposed approach using GA achieves an approximately 8 % overall accuracy higher than the original MSF-based algorithm.
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has been proved to be an effective way in recent years. However, current researches just care about the feature extraction and classifier design, and do not consider the instance selection. Former research by authors showed that the instance selection can lead to improvement on classification accuracy. However, no attention is paid on the relationship between speech sample and feature until now. Therefore, a new diagnosis algorithm of PD is proposed in this paper by simultaneously selecting speech sample and feature based on relevant feature weighting algorithm and multiple kernel method, so as to find their synergy effects, thereby improving classification accuracy. Experimental results showed that this proposed algorithm obtained apparent improvement on classification accuracy. It can obtain mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. Besides, the proposed algorithm detected the synergy effects of speech sample and feature, which is valuable for speech marker extraction.
Lin, Kuan-Cheng; Hsieh, Yi-Hsiu
2015-10-01
The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a problem of feature subset selection, such as combination optimization problems. Evolutionary algorithms using random search methods have proven highly effective in obtaining solutions to problems of optimization in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms in conjunction with a support vector machine (SVM) for the selection of optimal feature subsets for the classification of datasets. The results of experiments using specific UCI medical datasets demonstrate that the accuracy of the proposed hybrid evolutionary algorithm is superior to that of basic PSO, EPSO and ABC algorithms, with regard to classification accuracy using subsets with a reduced number of features.
Online feature selection with streaming features.
Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan
2013-05-01
We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.
AdaBoost-based algorithm for network intrusion detection.
Hu, Weiming; Hu, Wei; Maybank, Steve
2008-04-01
Network intrusion detection aims at distinguishing the attacks on the Internet from normal use of the Internet. It is an indispensable part of the information security system. Due to the variety of network behaviors and the rapid development of attack fashions, it is necessary to develop fast machine-learning-based intrusion detection algorithms with high detection rates and low false-alarm rates. In this correspondence, we propose an intrusion detection algorithm based on the AdaBoost algorithm. In the algorithm, decision stumps are used as weak classifiers. The decision rules are provided for both categorical and continuous features. By combining the weak classifiers for continuous features and the weak classifiers for categorical features into a strong classifier, the relations between these two different types of features are handled naturally, without any forced conversions between continuous and categorical features. Adaptable initial weights and a simple strategy for avoiding overfitting are adopted to improve the performance of the algorithm. Experimental results show that our algorithm has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.
SKL algorithm based fabric image matching and retrieval
NASA Astrophysics Data System (ADS)
Cao, Yichen; Zhang, Xueqin; Ma, Guojian; Sun, Rongqing; Dong, Deping
2017-07-01
Intelligent computer image processing technology provides convenience and possibility for designers to carry out designs. Shape analysis can be achieved by extracting SURF feature. However, high dimension of SURF feature causes to lower matching speed. To solve this problem, this paper proposed a fast fabric image matching algorithm based on SURF K-means and LSH algorithm. By constructing the bag of visual words on K-Means algorithm, and forming feature histogram of each image, the dimension of SURF feature is reduced at the first step. Then with the help of LSH algorithm, the features are encoded and the dimension is further reduced. In addition, the indexes of each image and each class of image are created, and the number of matching images is decreased by LSH hash bucket. Experiments on fabric image database show that this algorithm can speed up the matching and retrieval process, the result can satisfy the requirement of dress designers with accuracy and speed.
Linear feature detection algorithm for astronomical surveys - I. Algorithm description
NASA Astrophysics Data System (ADS)
Bektešević, Dino; Vinković, Dejan
2017-11-01
Computer vision algorithms are powerful tools in astronomical image analyses, especially when automation of object detection and extraction is required. Modern object detection algorithms in astronomy are oriented towards detection of stars and galaxies, ignoring completely the detection of existing linear features. With the emergence of wide-field sky surveys, linear features attract scientific interest as possible trails of fast flybys of near-Earth asteroids and meteors. In this work, we describe a new linear feature detection algorithm designed specifically for implementation in big data astronomy. The algorithm combines a series of algorithmic steps that first remove other objects (stars and galaxies) from the image and then enhance the line to enable more efficient line detection with the Hough algorithm. The rate of false positives is greatly reduced thanks to a step that replaces possible line segments with rectangles and then compares lines fitted to the rectangles with the lines obtained directly from the image. The speed of the algorithm and its applicability in astronomical surveys are also discussed.
Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin
2016-01-01
In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation‐based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers. PACS numbers: 87.55.km, 87.56.Fc PMID:26894358
Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin
2016-01-08
In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation-based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers.
NASA Astrophysics Data System (ADS)
Zhang, Chen; Ni, Zhiwei; Ni, Liping; Tang, Na
2016-10-01
Feature selection is an important method of data preprocessing in data mining. In this paper, a novel feature selection method based on multi-fractal dimension and harmony search algorithm is proposed. Multi-fractal dimension is adopted as the evaluation criterion of feature subset, which can determine the number of selected features. An improved harmony search algorithm is used as the search strategy to improve the efficiency of feature selection. The performance of the proposed method is compared with that of other feature selection algorithms on UCI data-sets. Besides, the proposed method is also used to predict the daily average concentration of PM2.5 in China. Experimental results show that the proposed method can obtain competitive results in terms of both prediction accuracy and the number of selected features.
Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm.
Martinez, Emmanuel; Alvarez, Mario Moises; Trevino, Victor
2010-08-01
Biomarker discovery is a typical application from functional genomics. Due to the large number of genes studied simultaneously in microarray data, feature selection is a key step. Swarm intelligence has emerged as a solution for the feature selection problem. However, swarm intelligence settings for feature selection fail to select small features subsets. We have proposed a swarm intelligence feature selection algorithm based on the initialization and update of only a subset of particles in the swarm. In this study, we tested our algorithm in 11 microarray datasets for brain, leukemia, lung, prostate, and others. We show that the proposed swarm intelligence algorithm successfully increase the classification accuracy and decrease the number of selected features compared to other swarm intelligence methods. Copyright © 2010 Elsevier Ltd. All rights reserved.
Research on sparse feature matching of improved RANSAC algorithm
NASA Astrophysics Data System (ADS)
Kong, Xiangsi; Zhao, Xian
2018-04-01
In this paper, a sparse feature matching method based on modified RANSAC algorithm is proposed to improve the precision and speed. Firstly, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is matched roughly by generating SIFT feature descriptor. At last, the precision of image matching is optimized by the modified RANSAC algorithm,. The RANSAC algorithm is improved from three aspects: instead of the homography matrix, this paper uses the fundamental matrix generated by the 8 point algorithm as the model; the sample is selected by a random block selecting method, which ensures the uniform distribution and the accuracy; adds sequential probability ratio test(SPRT) on the basis of standard RANSAC, which cut down the overall running time of the algorithm. The experimental results show that this method can not only get higher matching accuracy, but also greatly reduce the computation and improve the matching speed.
An Improved Iris Recognition Algorithm Based on Hybrid Feature and ELM
NASA Astrophysics Data System (ADS)
Wang, Juan
2018-03-01
The iris image is easily polluted by noise and uneven light. This paper proposed an improved extreme learning machine (ELM) based iris recognition algorithm with hybrid feature. 2D-Gabor filters and GLCM is employed to generate a multi-granularity hybrid feature vector. 2D-Gabor filter and GLCM feature work for capturing low-intermediate frequency and high frequency texture information, respectively. Finally, we utilize extreme learning machine for iris recognition. Experimental results reveal our proposed ELM based multi-granularity iris recognition algorithm (ELM-MGIR) has higher accuracy of 99.86%, and lower EER of 0.12% under the premise of real-time performance. The proposed ELM-MGIR algorithm outperforms other mainstream iris recognition algorithms.
A novel feature extraction approach for microarray data based on multi-algorithm fusion
Jiang, Zhu; Xu, Rong
2015-01-01
Feature extraction is one of the most important and effective method to reduce dimension in data mining, with emerging of high dimensional data such as microarray gene expression data. Feature extraction for gene selection, mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms including ranking-based feature extraction and set-based feature extraction are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, without considering inter-relationship between features in general, while set-based feature extraction evaluates features based on their role in a feature set by taking into account dependency between features. Just as learning methods, feature extraction has a problem in its generalization ability, which is robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select the feature from the samples set, the proposed approach is able to improve feature extraction performance. The new approach is tested against gene expression dataset including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The testing results show that the performance of this algorithm is better than existing solutions. PMID:25780277
A novel feature extraction approach for microarray data based on multi-algorithm fusion.
Jiang, Zhu; Xu, Rong
2015-01-01
Feature extraction is one of the most important and effective method to reduce dimension in data mining, with emerging of high dimensional data such as microarray gene expression data. Feature extraction for gene selection, mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms including ranking-based feature extraction and set-based feature extraction are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, without considering inter-relationship between features in general, while set-based feature extraction evaluates features based on their role in a feature set by taking into account dependency between features. Just as learning methods, feature extraction has a problem in its generalization ability, which is robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select the feature from the samples set, the proposed approach is able to improve feature extraction performance. The new approach is tested against gene expression dataset including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The testing results show that the performance of this algorithm is better than existing solutions.
Score-Level Fusion of Phase-Based and Feature-Based Fingerprint Matching Algorithms
NASA Astrophysics Data System (ADS)
Ito, Koichi; Morita, Ayumi; Aoki, Takafumi; Nakajima, Hiroshi; Kobayashi, Koji; Higuchi, Tatsuo
This paper proposes an efficient fingerprint recognition algorithm combining phase-based image matching and feature-based matching. In our previous work, we have already proposed an efficient fingerprint recognition algorithm using Phase-Only Correlation (POC), and developed commercial fingerprint verification units for access control applications. The use of Fourier phase information of fingerprint images makes it possible to achieve robust recognition for weakly impressed, low-quality fingerprint images. This paper presents an idea of improving the performance of POC-based fingerprint matching by combining it with feature-based matching, where feature-based matching is introduced in order to improve recognition efficiency for images with nonlinear distortion. Experimental evaluation using two different types of fingerprint image databases demonstrates efficient recognition performance of the combination of the POC-based algorithm and the feature-based algorithm.
Speech Emotion Feature Selection Method Based on Contribution Analysis Algorithm of Neural Network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang Xiaojia; Mao Qirong; Zhan Yongzhao
There are many emotion features. If all these features are employed to recognize emotions, redundant features may be existed. Furthermore, recognition result is unsatisfying and the cost of feature extraction is high. In this paper, a method to select speech emotion features based on contribution analysis algorithm of NN is presented. The emotion features are selected by using contribution analysis algorithm of NN from the 95 extracted features. Cluster analysis is applied to analyze the effectiveness for the features selected, and the time of feature extraction is evaluated. Finally, 24 emotion features selected are used to recognize six speech emotions.more » The experiments show that this method can improve the recognition rate and the time of feature extraction.« less
Detection of Coronal Mass Ejections Using Multiple Features and Space-Time Continuity
NASA Astrophysics Data System (ADS)
Zhang, Ling; Yin, Jian-qin; Lin, Jia-ben; Feng, Zhi-quan; Zhou, Jin
2017-07-01
Coronal Mass Ejections (CMEs) release tremendous amounts of energy in the solar system, which has an impact on satellites, power facilities and wireless transmission. To effectively detect a CME in Large Angle Spectrometric Coronagraph (LASCO) C2 images, we propose a novel algorithm to locate the suspected CME regions, using the Extreme Learning Machine (ELM) method and taking into account the features of the grayscale and the texture. Furthermore, space-time continuity is used in the detection algorithm to exclude the false CME regions. The algorithm includes three steps: i) define the feature vector which contains textural and grayscale features of a running difference image; ii) design the detection algorithm based on the ELM method according to the feature vector; iii) improve the detection accuracy rate by using the decision rule of the space-time continuum. Experimental results show the efficiency and the superiority of the proposed algorithm in the detection of CMEs compared with other traditional methods. In addition, our algorithm is insensitive to most noise.
Automated target classification in high resolution dual frequency sonar imagery
NASA Astrophysics Data System (ADS)
Aridgides, Tom; Fernández, Manuel
2007-04-01
An improved computer-aided-detection / computer-aided-classification (CAD/CAC) processing string has been developed. The classified objects of 2 distinct strings are fused using the classification confidence values and their expansions as features, and using "summing" or log-likelihood-ratio-test (LLRT) based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new high-resolution dual frequency sonar imagery. Three significant fusion algorithm improvements were made. First, a nonlinear 2nd order (Volterra) feature LLRT fusion algorithm was developed. Second, a Box-Cox nonlinear feature LLRT fusion algorithm was developed. The Box-Cox transformation consists of raising the features to a to-be-determined power. Third, a repeated application of a subset feature selection / feature orthogonalization / Volterra feature LLRT fusion block was utilized. It was shown that cascaded Volterra feature LLRT fusion of the CAD/CAC processing strings outperforms summing, baseline single-stage Volterra and Box-Cox feature LLRT algorithms, yielding significant improvements over the best single CAD/CAC processing string results, and providing the capability to correctly call the majority of targets while maintaining a very low false alarm rate. Additionally, the robustness of cascaded Volterra feature fusion was demonstrated, by showing that the algorithm yields similar performance with the training and test sets.
NASA Astrophysics Data System (ADS)
Paino, A.; Keller, J.; Popescu, M.; Stone, K.
2014-06-01
In this paper we present an approach that uses Genetic Programming (GP) to evolve novel feature extraction algorithms for greyscale images. Our motivation is to create an automated method of building new feature extraction algorithms for images that are competitive with commonly used human-engineered features, such as Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG). The evolved feature extraction algorithms are functions defined over the image space, and each produces a real-valued feature vector of variable length. Each evolved feature extractor breaks up the given image into a set of cells centered on every pixel, performs evolved operations on each cell, and then combines the results of those operations for every cell using an evolved operator. Using this method, the algorithm is flexible enough to reproduce both LBP and HOG features. The dataset we use to train and test our approach consists of a large number of pre-segmented image "chips" taken from a Forward Looking Infrared Imagery (FLIR) camera mounted on the hood of a moving vehicle. The goal is to classify each image chip as either containing or not containing a buried object. To this end, we define the fitness of a candidate solution as the cross-fold validation accuracy of the features generated by said candidate solution when used in conjunction with a Support Vector Machine (SVM) classifier. In order to validate our approach, we compare the classification accuracy of an SVM trained using our evolved features with the accuracy of an SVM trained using mainstream feature extraction algorithms, including LBP and HOG.
An improved feature extraction algorithm based on KAZE for multi-spectral image
NASA Astrophysics Data System (ADS)
Yang, Jianping; Li, Jun
2018-02-01
Multi-spectral image contains abundant spectral information, which is widely used in all fields like resource exploration, meteorological observation and modern military. Image preprocessing, such as image feature extraction and matching, is indispensable while dealing with multi-spectral remote sensing image. Although the feature matching algorithm based on linear scale such as SIFT and SURF performs strong on robustness, the local accuracy cannot be guaranteed. Therefore, this paper proposes an improved KAZE algorithm, which is based on nonlinear scale, to raise the number of feature and to enhance the matching rate by using the adjusted-cosine vector. The experiment result shows that the number of feature and the matching rate of the improved KAZE are remarkably than the original KAZE algorithm.
He, Ying; Liang, Bin; Yang, Jun; Li, Shunzhi; He, Jin
2017-08-11
The Iterative Closest Points (ICP) algorithm is the mainstream algorithm used in the process of accurate registration of 3D point cloud data. The algorithm requires a proper initial value and the approximate registration of two point clouds to prevent the algorithm from falling into local extremes, but in the actual point cloud matching process, it is difficult to ensure compliance with this requirement. In this paper, we proposed the ICP algorithm based on point cloud features (GF-ICP). This method uses the geometrical features of the point cloud to be registered, such as curvature, surface normal and point cloud density, to search for the correspondence relationships between two point clouds and introduces the geometric features into the error function to realize the accurate registration of two point clouds. The experimental results showed that the algorithm can improve the convergence speed and the interval of convergence without setting a proper initial value.
Liang, Bin; Yang, Jun; Li, Shunzhi; He, Jin
2017-01-01
The Iterative Closest Points (ICP) algorithm is the mainstream algorithm used in the process of accurate registration of 3D point cloud data. The algorithm requires a proper initial value and the approximate registration of two point clouds to prevent the algorithm from falling into local extremes, but in the actual point cloud matching process, it is difficult to ensure compliance with this requirement. In this paper, we proposed the ICP algorithm based on point cloud features (GF-ICP). This method uses the geometrical features of the point cloud to be registered, such as curvature, surface normal and point cloud density, to search for the correspondence relationships between two point clouds and introduces the geometric features into the error function to realize the accurate registration of two point clouds. The experimental results showed that the algorithm can improve the convergence speed and the interval of convergence without setting a proper initial value. PMID:28800096
A triangle voting algorithm based on double feature constraints for star sensors
NASA Astrophysics Data System (ADS)
Fan, Qiaoyun; Zhong, Xuyang
2018-02-01
A novel autonomous star identification algorithm is presented in this study. In the proposed algorithm, each sensor star constructs multi-triangle with its bright neighbor stars and obtains its candidates by triangle voting process, in which the triangle is considered as the basic voting element. In order to accelerate the speed of this algorithm and reduce the required memory for star database, feature extraction is carried out to reduce the dimension of triangles and each triangle is described by its base and height. During the identification period, the voting scheme based on double feature constraints is proposed to implement triangle voting. This scheme guarantees that only the catalog star satisfying two features can vote for the sensor star, which improves the robustness towards false stars. The simulation and real star image test demonstrate that compared with the other two algorithms, the proposed algorithm is more robust towards position noise, magnitude noise and false stars.
A curvature-based weighted fuzzy c-means algorithm for point clouds de-noising
NASA Astrophysics Data System (ADS)
Cui, Xin; Li, Shipeng; Yan, Xiutian; He, Xinhua
2018-04-01
In order to remove the noise of three-dimensional scattered point cloud and smooth the data without damnify the sharp geometric feature simultaneity, a novel algorithm is proposed in this paper. The feature-preserving weight is added to fuzzy c-means algorithm which invented a curvature weighted fuzzy c-means clustering algorithm. Firstly, the large-scale outliers are removed by the statistics of r radius neighboring points. Then, the algorithm estimates the curvature of the point cloud data by using conicoid parabolic fitting method and calculates the curvature feature value. Finally, the proposed clustering algorithm is adapted to calculate the weighted cluster centers. The cluster centers are regarded as the new points. The experimental results show that this approach is efficient to different scale and intensities of noise in point cloud with a high precision, and perform a feature-preserving nature at the same time. Also it is robust enough to different noise model.
NASA Technical Reports Server (NTRS)
Eigen, D. J.; Fromm, F. R.; Northouse, R. A.
1974-01-01
A new clustering algorithm is presented that is based on dimensional information. The algorithm includes an inherent feature selection criterion, which is discussed. Further, a heuristic method for choosing the proper number of intervals for a frequency distribution histogram, a feature necessary for the algorithm, is presented. The algorithm, although usable as a stand-alone clustering technique, is then utilized as a global approximator. Local clustering techniques and configuration of a global-local scheme are discussed, and finally the complete global-local and feature selector configuration is shown in application to a real-time adaptive classification scheme for the analysis of remote sensed multispectral scanner data.
Joshuva, A; Sugumaran, V
2017-03-01
Wind energy is one of the important renewable energy resources available in nature. It is one of the major resources for production of energy because of its dependability due to the development of the technology and relatively low cost. Wind energy is converted into electrical energy using rotating blades. Due to environmental conditions and large structure, the blades are subjected to various vibration forces that may cause damage to the blades. This leads to a liability in energy production and turbine shutdown. The downtime can be reduced when the blades are diagnosed continuously using structural health condition monitoring. These are considered as a pattern recognition problem which consists of three phases namely, feature extraction, feature selection, and feature classification. In this study, statistical features were extracted from vibration signals, feature selection was carried out using a J48 decision tree algorithm and feature classification was performed using best-first tree algorithm and functional trees algorithm. The better algorithm is suggested for fault diagnosis of wind turbine blade. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Target recognition of ladar range images using slice image: comparison of four improved algorithms
NASA Astrophysics Data System (ADS)
Xia, Wenze; Han, Shaokun; Cao, Jingya; Wang, Liang; Zhai, Yu; Cheng, Yang
2017-07-01
Compared with traditional 3-D shape data, ladar range images possess properties of strong noise, shape degeneracy, and sparsity, which make feature extraction and representation difficult. The slice image is an effective feature descriptor to resolve this problem. We propose four improved algorithms on target recognition of ladar range images using slice image. In order to improve resolution invariance of the slice image, mean value detection instead of maximum value detection is applied in these four improved algorithms. In order to improve rotation invariance of the slice image, three new improved feature descriptors-which are feature slice image, slice-Zernike moments, and slice-Fourier moments-are applied to the last three improved algorithms, respectively. Backpropagation neural networks are used as feature classifiers in the last two improved algorithms. The performance of these four improved recognition systems is analyzed comprehensively in the aspects of the three invariances, recognition rate, and execution time. The final experiment results show that the improvements for these four algorithms reach the desired effect, the three invariances of feature descriptors are not directly related to the final recognition performance of recognition systems, and these four improved recognition systems have different performances under different conditions.
Tele-Autonomous control involving contact. Final Report Thesis; [object localization
NASA Technical Reports Server (NTRS)
Shao, Lejun; Volz, Richard A.; Conway, Lynn; Walker, Michael W.
1990-01-01
Object localization and its application in tele-autonomous systems are studied. Two object localization algorithms are presented together with the methods of extracting several important types of object features. The first algorithm is based on line-segment to line-segment matching. Line range sensors are used to extract line-segment features from an object. The extracted features are matched to corresponding model features to compute the location of the object. The inputs of the second algorithm are not limited only to the line features. Featured points (point to point matching) and featured unit direction vectors (vector to vector matching) can also be used as the inputs of the algorithm, and there is no upper limit on the number of the features inputed. The algorithm will allow the use of redundant features to find a better solution. The algorithm uses dual number quaternions to represent the position and orientation of an object and uses the least squares optimization method to find an optimal solution for the object's location. The advantage of using this representation is that the method solves for the location estimation by minimizing a single cost function associated with the sum of the orientation and position errors and thus has a better performance on the estimation, both in accuracy and speed, than that of other similar algorithms. The difficulties when the operator is controlling a remote robot to perform manipulation tasks are also discussed. The main problems facing the operator are time delays on the signal transmission and the uncertainties of the remote environment. How object localization techniques can be used together with other techniques such as predictor display and time desynchronization to help to overcome these difficulties are then discussed.
Al-Rajab, Murad; Lu, Joan; Xu, Qiang
2017-07-01
This paper examines the accuracy and efficiency (time complexity) of high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide, hence it is vitally important for the cancer tissues to be expertly identified and classified in a rapid and timely manner, to assure both a fast detection of the disease and to expedite the drug discovery process. In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these. It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best with the colon dataset as a feature selection (29 genes selected) and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed other classifications, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed other algorithms in accuracy and performance, and was faster in terms of time analysis (94%). It is concluded that applying feature selection algorithms prior to classification algorithms results in better accuracy than when the latter are applied alone. This conclusion is important and significant to industry and society. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Lei; Yang, Fengbao; Ji, Linna; Lv, Sheng
2018-01-01
Diverse image fusion methods perform differently. Each method has advantages and disadvantages compared with others. One notion is that the advantages of different image methods can be effectively combined. A multiple-algorithm parallel fusion method based on algorithmic complementarity and synergy is proposed. First, in view of the characteristics of the different algorithms and difference-features among images, an index vector-based feature-similarity is proposed to define the degree of complementarity and synergy. This proposed index vector is a reliable evidence indicator for algorithm selection. Second, the algorithms with a high degree of complementarity and synergy are selected. Then, the different degrees of various features and infrared intensity images are used as the initial weights for the nonnegative matrix factorization (NMF). This avoids randomness of the NMF initialization parameter. Finally, the fused images of different algorithms are integrated using the NMF because of its excellent data fusing performance on independent features. Experimental results demonstrate that the visual effect and objective evaluation index of the fused images obtained using the proposed method are better than those obtained using traditional methods. The proposed method retains all the advantages that individual fusion algorithms have.
NASA Astrophysics Data System (ADS)
Khehra, Baljit Singh; Pharwaha, Amar Partap Singh
2017-04-01
Ductal carcinoma in situ (DCIS) is one type of breast cancer. Clusters of microcalcifications (MCCs) are symptoms of DCIS that are recognized by mammography. Selection of robust features vector is the process of selecting an optimal subset of features from a large number of available features in a given problem domain after the feature extraction and before any classification scheme. Feature selection reduces the feature space that improves the performance of classifier and decreases the computational burden imposed by using many features on classifier. Selection of an optimal subset of features from a large number of available features in a given problem domain is a difficult search problem. For n features, the total numbers of possible subsets of features are 2n. Thus, selection of an optimal subset of features problem belongs to the category of NP-hard problems. In this paper, an attempt is made to find the optimal subset of MCCs features from all possible subsets of features using genetic algorithm (GA), particle swarm optimization (PSO) and biogeography-based optimization (BBO). For simulation, a total of 380 benign and malignant MCCs samples have been selected from mammogram images of DDSM database. A total of 50 features extracted from benign and malignant MCCs samples are used in this study. In these algorithms, fitness function is correct classification rate of classifier. Support vector machine is used as a classifier. From experimental results, it is also observed that the performance of PSO-based and BBO-based algorithms to select an optimal subset of features for classifying MCCs as benign or malignant is better as compared to GA-based algorithm.
NASA Astrophysics Data System (ADS)
Emaminejad, Nastaran; Wahi-Anwar, Muhammad; Hoffman, John; Kim, Grace H.; Brown, Matthew S.; McNitt-Gray, Michael
2018-02-01
Translation of radiomics into clinical practice requires confidence in its interpretations. This may be obtained via understanding and overcoming the limitations in current radiomic approaches. Currently there is a lack of standardization in radiomic feature extraction. In this study we examined a few factors that are potential sources of inconsistency in characterizing lung nodules, such as 1)different choices of parameters and algorithms in feature calculation, 2)two CT image dose levels, 3)different CT reconstruction algorithms (WFBP, denoised WFBP, and Iterative). We investigated the effect of variation of these factors on entropy textural feature of lung nodules. CT images of 19 lung nodules identified from our lung cancer screening program were identified by a CAD tool and contours provided. The radiomics features were extracted by calculating 36 GLCM based and 4 histogram based entropy features in addition to 2 intensity based features. A robustness index was calculated across different image acquisition parameters to illustrate the reproducibility of features. Most GLCM based and all histogram based entropy features were robust across two CT image dose levels. Denoising of images slightly improved robustness of some entropy features at WFBP. Iterative reconstruction resulted in improvement of robustness in a fewer times and caused more variation in entropy feature values and their robustness. Within different choices of parameters and algorithms texture features showed a wide range of variation, as much as 75% for individual nodules. Results indicate the need for harmonization of feature calculations and identification of optimum parameters and algorithms in a radiomics study.
New development of the image matching algorithm
NASA Astrophysics Data System (ADS)
Zhang, Xiaoqiang; Feng, Zhao
2018-04-01
To study the image matching algorithm, algorithm four elements are described, i.e., similarity measurement, feature space, search space and search strategy. Four common indexes for evaluating the image matching algorithm are described, i.e., matching accuracy, matching efficiency, robustness and universality. Meanwhile, this paper describes the principle of image matching algorithm based on the gray value, image matching algorithm based on the feature, image matching algorithm based on the frequency domain analysis, image matching algorithm based on the neural network and image matching algorithm based on the semantic recognition, and analyzes their characteristics and latest research achievements. Finally, the development trend of image matching algorithm is discussed. This study is significant for the algorithm improvement, new algorithm design and algorithm selection in practice.
Ma, Li; Fan, Suohai
2017-03-14
The random forests algorithm is a type of classifier with prominent universality, a wide application range, and robustness for avoiding overfitting. But there are still some drawbacks to random forests. Therefore, to improve the performance of random forests, this paper seeks to improve imbalanced data processing, feature selection and parameter optimization. We propose the CURE-SMOTE algorithm for the imbalanced data classification problem. Experiments on imbalanced UCI data reveal that the combination of Clustering Using Representatives (CURE) enhances the original synthetic minority oversampling technique (SMOTE) algorithms effectively compared with the classification results on the original data using random sampling, Borderline-SMOTE1, safe-level SMOTE, C-SMOTE, and k-means-SMOTE. Additionally, the hybrid RF (random forests) algorithm has been proposed for feature selection and parameter optimization, which uses the minimum out of bag (OOB) data error as its objective function. Simulation results on binary and higher-dimensional data indicate that the proposed hybrid RF algorithms, hybrid genetic-random forests algorithm, hybrid particle swarm-random forests algorithm and hybrid fish swarm-random forests algorithm can achieve the minimum OOB error and show the best generalization ability. The training set produced from the proposed CURE-SMOTE algorithm is closer to the original data distribution because it contains minimal noise. Thus, better classification results are produced from this feasible and effective algorithm. Moreover, the hybrid algorithm's F-value, G-mean, AUC and OOB scores demonstrate that they surpass the performance of the original RF algorithm. Hence, this hybrid algorithm provides a new way to perform feature selection and parameter optimization.
Retention time alignment of LC/MS data by a divide-and-conquer algorithm.
Zhang, Zhongqi
2012-04-01
Liquid chromatography-mass spectrometry (LC/MS) has become the method of choice for characterizing complex mixtures. These analyses often involve quantitative comparison of components in multiple samples. To achieve automated sample comparison, the components of interest must be detected and identified, and their retention times aligned and peak areas calculated. This article describes a simple pairwise iterative retention time alignment algorithm, based on the divide-and-conquer approach, for alignment of ion features detected in LC/MS experiments. In this iterative algorithm, ion features in the sample run are first aligned with features in the reference run by applying a single constant shift of retention time. The sample chromatogram is then divided into two shorter chromatograms, which are aligned to the reference chromatogram the same way. Each shorter chromatogram is further divided into even shorter chromatograms. This process continues until each chromatogram is sufficiently narrow so that ion features within it have a similar retention time shift. In six pairwise LC/MS alignment examples containing a total of 6507 confirmed true corresponding feature pairs with retention time shifts up to five peak widths, the algorithm successfully aligned these features with an error rate of 0.2%. The alignment algorithm is demonstrated to be fast, robust, fully automatic, and superior to other algorithms. After alignment and gap-filling of detected ion features, their abundances can be tabulated for direct comparison between samples.
Fang, Hongqing; He, Lei; Si, Hao; Liu, Peng; Xie, Xiaolei
2014-09-01
In this paper, Back-propagation(BP) algorithm has been used to train the feed forward neural network for human activity recognition in smart home environments, and inter-class distance method for feature selection of observed motion sensor events is discussed and tested. And then, the human activity recognition performances of neural network using BP algorithm have been evaluated and compared with other probabilistic algorithms: Naïve Bayes(NB) classifier and Hidden Markov Model(HMM). The results show that different feature datasets yield different activity recognition accuracy. The selection of unsuitable feature datasets increases the computational complexity and degrades the activity recognition accuracy. Furthermore, neural network using BP algorithm has relatively better human activity recognition performances than NB classifier and HMM. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Constraint programming based biomarker optimization.
Zhou, Manli; Luo, Youxi; Sun, Guoquan; Mai, Guoqin; Zhou, Fengfeng
2015-01-01
Efficient and intuitive characterization of biological big data is becoming a major challenge for modern bio-OMIC based scientists. Interactive visualization and exploration of big data is proven to be one of the successful solutions. Most of the existing feature selection algorithms do not allow the interactive inputs from users in the optimizing process of feature selection. This study investigates this question as fixing a few user-input features in the finally selected feature subset and formulates these user-input features as constraints for a programming model. The proposed algorithm, fsCoP (feature selection based on constrained programming), performs well similar to or much better than the existing feature selection algorithms, even with the constraints from both literature and the existing algorithms. An fsCoP biomarker may be intriguing for further wet lab validation, since it satisfies both the classification optimization function and the biomedical knowledge. fsCoP may also be used for the interactive exploration of bio-OMIC big data by interactively adding user-defined constraints for modeling.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, B; Tan, Y; Tsai, W
2014-06-15
Purpose: Radiogenomics promises the ability to study cancer tumor genotype from the phenotype obtained through radiographic imaging. However, little attention has been paid to the sensitivity of image features, the image-based biomarkers, to imaging acquisition techniques. This study explores the impact of CT dose, slice thickness and reconstruction algorithm on measuring image features using a thorax phantom. Methods: Twentyfour phantom lesions of known volume (1 and 2mm), shape (spherical, elliptical, lobular and spicular) and density (-630, -10 and +100 HU) were scanned on a GE VCT at four doses (25, 50, 100, and 200 mAs). For each scan, six imagemore » series were reconstructed at three slice thicknesses of 5, 2.5 and 1.25mm with continuous intervals, using the lung and standard reconstruction algorithms. The lesions were segmented with an in-house 3D algorithm. Fifty (50) image features representing lesion size, shape, edge, and density distribution/texture were computed. Regression method was employed to analyze the effect of CT dose, slice of thickness and reconstruction algorithm on these features adjusting 3 confounding factors (size, density and shape of phantom lesions). Results: The coefficients of CT dose, slice thickness and reconstruction algorithm are presented in Table 1 in the supplementary material. No significant difference was found between the image features calculated on low dose CT scans (25mAs and 50mAs). About 50% texture features were found statistically different between low doses and high doses (100 and 200mAs). Significant differences were found for almost all features when calculated on 1.25mm, 2.5mm, and 5mm slice thickness images. Reconstruction algorithms significantly affected all density-based image features, but not morphological features. Conclusions: There is a great need to standardize the CT imaging protocols for radiogenomics study because CT dose, slice thickness and reconstruction algorithm impact quantitative image features to various degrees as our study has shown.« less
A novel automated spike sorting algorithm with adaptable feature extraction.
Bestel, Robert; Daus, Andreas W; Thielemann, Christiane
2012-10-15
To study the electrophysiological properties of neuronal networks, in vitro studies based on microelectrode arrays have become a viable tool for analysis. Although in constant progress, a challenging task still remains in this area: the development of an efficient spike sorting algorithm that allows an accurate signal analysis at the single-cell level. Most sorting algorithms currently available only extract a specific feature type, such as the principal components or Wavelet coefficients of the measured spike signals in order to separate different spike shapes generated by different neurons. However, due to the great variety in the obtained spike shapes, the derivation of an optimal feature set is still a very complex issue that current algorithms struggle with. To address this problem, we propose a novel algorithm that (i) extracts a variety of geometric, Wavelet and principal component-based features and (ii) automatically derives a feature subset, most suitable for sorting an individual set of spike signals. Thus, there is a new approach that evaluates the probability distribution of the obtained spike features and consequently determines the candidates most suitable for the actual spike sorting. These candidates can be formed into an individually adjusted set of spike features, allowing a separation of the various shapes present in the obtained neuronal signal by a subsequent expectation maximisation clustering algorithm. Test results with simulated data files and data obtained from chick embryonic neurons cultured on microelectrode arrays showed an excellent classification result, indicating the superior performance of the described algorithm approach. Copyright © 2012 Elsevier B.V. All rights reserved.
Learning Behavior Characterization with Multi-Feature, Hierarchical Activity Sequences
ERIC Educational Resources Information Center
Ye, Cheng; Segedy, James R.; Kinnebrew, John S.; Biswas, Gautam
2015-01-01
This paper discusses Multi-Feature Hierarchical Sequential Pattern Mining, MFH-SPAM, a novel algorithm that efficiently extracts patterns from students' learning activity sequences. This algorithm extends an existing sequential pattern mining algorithm by dynamically selecting the level of specificity for hierarchically-defined features…
A bootstrap based Neyman-Pearson test for identifying variable importance.
Ditzler, Gregory; Polikar, Robi; Rosen, Gail
2015-04-01
Selection of most informative features that leads to a small loss on future data are arguably one of the most important steps in classification, data analysis and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining if a feature is statistically relevant. The proposed approach can be applied as a wrapper to any FS algorithm, regardless of the FS criteria used by that algorithm, to determine whether a feature belongs in the relevant set. Perhaps more importantly, this procedure efficiently determines the number of relevant features given an initial starting point. We provide freely available software implementations of the proposed methodology.
A scale-invariant keypoint detector in log-polar space
NASA Astrophysics Data System (ADS)
Tao, Tao; Zhang, Yun
2017-02-01
The scale-invariant feature transform (SIFT) algorithm is devised to detect keypoints via the difference of Gaussian (DoG) images. However, the DoG data lacks the high-frequency information, which can lead to a performance drop of the algorithm. To address this issue, this paper proposes a novel log-polar feature detector (LPFD) to detect scale-invariant blubs (keypoints) in log-polar space, which, in contrast, can retain all the image information. The algorithm consists of three components, viz. keypoint detection, descriptor extraction and descriptor matching. Besides, the algorithm is evaluated in detecting keypoints from the INRIA dataset by comparing with the SIFT algorithm and one of its fast versions, the speed up robust features (SURF) algorithm in terms of three performance measures, viz. correspondences, repeatability, correct matches and matching score.
Fast object detection algorithm based on HOG and CNN
NASA Astrophysics Data System (ADS)
Lu, Tongwei; Wang, Dandan; Zhang, Yanduo
2018-04-01
In the field of computer vision, object classification and object detection are widely used in many fields. The traditional object detection have two main problems:one is that sliding window of the regional selection strategy is high time complexity and have window redundancy. And the other one is that Robustness of the feature is not well. In order to solve those problems, Regional Proposal Network (RPN) is used to select candidate regions instead of selective search algorithm. Compared with traditional algorithms and selective search algorithms, RPN has higher efficiency and accuracy. We combine HOG feature and convolution neural network (CNN) to extract features. And we use SVM to classify. For TorontoNet, our algorithm's mAP is 1.6 percentage points higher. For OxfordNet, our algorithm's mAP is 1.3 percentage higher.
Modified kernel-based nonlinear feature extraction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma, J.; Perkins, S. J.; Theiler, J. P.
2002-01-01
Feature Extraction (FE) techniques are widely used in many applications to pre-process data in order to reduce the complexity of subsequent processes. A group of Kernel-based nonlinear FE ( H E ) algorithms has attracted much attention due to their high performance. However, a serious limitation that is inherent in these algorithms -- the maximal number of features extracted by them is limited by the number of classes involved -- dramatically degrades their flexibility. Here we propose a modified version of those KFE algorithms (MKFE), This algorithm is developed from a special form of scatter-matrix, whose rank is not determinedmore » by the number of classes involved, and thus breaks the inherent limitation in those KFE algorithms. Experimental results suggest that MKFE algorithm is .especially useful when the training set is small.« less
A spectrum fractal feature classification algorithm for agriculture crops with hyper spectrum image
NASA Astrophysics Data System (ADS)
Su, Junying
2011-11-01
A fractal dimension feature analysis method in spectrum domain for hyper spectrum image is proposed for agriculture crops classification. Firstly, a fractal dimension calculation algorithm in spectrum domain is presented together with the fast fractal dimension value calculation algorithm using the step measurement method. Secondly, the hyper spectrum image classification algorithm and flowchart is presented based on fractal dimension feature analysis in spectrum domain. Finally, the experiment result of the agricultural crops classification with FCL1 hyper spectrum image set with the proposed method and SAM (spectral angle mapper). The experiment results show it can obtain better classification result than the traditional SAM feature analysis which can fulfill use the spectrum information of hyper spectrum image to realize precision agricultural crops classification.
A novel feature ranking algorithm for biometric recognition with PPG signals.
Reşit Kavsaoğlu, A; Polat, Kemal; Recep Bozkurt, M
2014-06-01
This study is intended for describing the application of the Photoplethysmography (PPG) signal and the time domain features acquired from its first and second derivatives for biometric identification. For this purpose, a sum of 40 features has been extracted and a feature-ranking algorithm is proposed. This proposed algorithm calculates the contribution of each feature to biometric recognition and collocates the features, the contribution of which is from great to small. While identifying the contribution of the features, the Euclidean distance and absolute distance formulas are used. The efficiency of the proposed algorithms is demonstrated by the results of the k-NN (k-nearest neighbor) classifier applications of the features. During application, each 15-period-PPG signal belonging to two different durations from each of the thirty healthy subjects were used with a PPG data acquisition card. The first PPG signals recorded from the subjects were evaluated as the 1st configuration; the PPG signals recorded later at a different time as the 2nd configuration and the combination of both were evaluated as the 3rd configuration. When the results were evaluated for the k-NN classifier model created along with the proposed algorithm, an identification of 90.44% for the 1st configuration, 94.44% for the 2nd configuration, and 87.22% for the 3rd configuration has successfully been attained. The obtained results showed that both the proposed algorithm and the biometric identification model based on this developed PPG signal are very promising for contactless recognizing the people with the proposed method. Copyright © 2014 Elsevier Ltd. All rights reserved.
Sun, Wenqing; Zheng, Bin; Qian, Wei
2017-10-01
This study aimed to analyze the ability of extracting automatically generated features using deep structured algorithms in lung nodule CT image diagnosis, and compare its performance with traditional computer aided diagnosis (CADx) systems using hand-crafted features. All of the 1018 cases were acquired from Lung Image Database Consortium (LIDC) public lung cancer database. The nodules were segmented according to four radiologists' markings, and 13,668 samples were generated by rotating every slice of nodule images. Three multichannel ROI based deep structured algorithms were designed and implemented in this study: convolutional neural network (CNN), deep belief network (DBN), and stacked denoising autoencoder (SDAE). For the comparison purpose, we also implemented a CADx system using hand-crafted features including density features, texture features and morphological features. The performance of every scheme was evaluated by using a 10-fold cross-validation method and an assessment index of the area under the receiver operating characteristic curve (AUC). The observed highest area under the curve (AUC) was 0.899±0.018 achieved by CNN, which was significantly higher than traditional CADx with the AUC=0.848±0.026. The results from DBN was also slightly higher than CADx, while SDAE was slightly lower. By visualizing the automatic generated features, we found some meaningful detectors like curvy stroke detectors from deep structured schemes. The study results showed the deep structured algorithms with automatically generated features can achieve desirable performance in lung nodule diagnosis. With well-tuned parameters and large enough dataset, the deep learning algorithms can have better performance than current popular CADx. We believe the deep learning algorithms with similar data preprocessing procedure can be used in other medical image analysis areas as well. Copyright © 2017. Published by Elsevier Ltd.
A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.
Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang
2016-12-07
The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.
Automated method for measuring the extent of selective logging damage with airborne LiDAR data
NASA Astrophysics Data System (ADS)
Melendy, L.; Hagen, S. C.; Sullivan, F. B.; Pearson, T. R. H.; Walker, S. M.; Ellis, P.; Kustiyo; Sambodo, Ari Katmoko; Roswintiarti, O.; Hanson, M. A.; Klassen, A. W.; Palace, M. W.; Braswell, B. H.; Delgado, G. M.
2018-05-01
Selective logging has an impact on the global carbon cycle, as well as on the forest micro-climate, and longer-term changes in erosion, soil and nutrient cycling, and fire susceptibility. Our ability to quantify these impacts is dependent on methods and tools that accurately identify the extent and features of logging activity. LiDAR-based measurements of these features offers significant promise. Here, we present a set of algorithms for automated detection and mapping of critical features associated with logging - roads/decks, skid trails, and gaps - using commercial airborne LiDAR data as input. The automated algorithm was applied to commercial LiDAR data collected over two logging concessions in Kalimantan, Indonesia in 2014. The algorithm results were compared to measurements of the logging features collected in the field soon after logging was complete. The automated algorithm-mapped road/deck and skid trail features match closely with features measured in the field, with agreement levels ranging from 69% to 99% when adjusting for GPS location error. The algorithm performed most poorly with gaps, which, by their nature, are variable due to the unpredictable impact of tree fall versus the linear and regular features directly created by mechanical means. Overall, the automated algorithm performs well and offers significant promise as a generalizable tool useful to efficiently and accurately capture the effects of selective logging, including the potential to distinguish reduced impact logging from conventional logging.
Local-search based prediction of medical image registration error
NASA Astrophysics Data System (ADS)
Saygili, Görkem
2018-03-01
Medical image registration is a crucial task in many different medical imaging applications. Hence, considerable amount of work has been published recently that aim to predict the error in a registration without any human effort. If provided, these error predictions can be used as a feedback to the registration algorithm to further improve its performance. Recent methods generally start with extracting image-based and deformation-based features, then apply feature pooling and finally train a Random Forest (RF) regressor to predict the real registration error. Image-based features can be calculated after applying a single registration but provide limited accuracy whereas deformation-based features such as variation of deformation vector field may require up to 20 registrations which is a considerably high time-consuming task. This paper proposes to use extracted features from a local search algorithm as image-based features to estimate the error of a registration. The proposed method comprises a local search algorithm to find corresponding voxels between registered image pairs and based on the amount of shifts and stereo confidence measures, it predicts the amount of registration error in millimetres densely using a RF regressor. Compared to other algorithms in the literature, the proposed algorithm does not require multiple registrations, can be efficiently implemented on a Graphical Processing Unit (GPU) and can still provide highly accurate error predictions in existence of large registration error. Experimental results with real registrations on a public dataset indicate a substantially high accuracy achieved by using features from the local search algorithm.
Single and Multiple Object Tracking Using a Multi-Feature Joint Sparse Representation.
Hu, Weiming; Li, Wei; Zhang, Xiaoqin; Maybank, Stephen
2015-04-01
In this paper, we propose a tracking algorithm based on a multi-feature joint sparse representation. The templates for the sparse representation can include pixel values, textures, and edges. In the multi-feature joint optimization, noise or occlusion is dealt with using a set of trivial templates. A sparse weight constraint is introduced to dynamically select the relevant templates from the full set of templates. A variance ratio measure is adopted to adaptively adjust the weights of different features. The multi-feature template set is updated adaptively. We further propose an algorithm for tracking multi-objects with occlusion handling based on the multi-feature joint sparse reconstruction. The observation model based on sparse reconstruction automatically focuses on the visible parts of an occluded object by using the information in the trivial templates. The multi-object tracking is simplified into a joint Bayesian inference. The experimental results show the superiority of our algorithm over several state-of-the-art tracking algorithms.
Decontaminate feature for tracking: adaptive tracking via evolutionary feature subset
NASA Astrophysics Data System (ADS)
Liu, Qiaoyuan; Wang, Yuru; Yin, Minghao; Ren, Jinchang; Li, Ruizhi
2017-11-01
Although various visual tracking algorithms have been proposed in the last 2-3 decades, it remains a challenging problem for effective tracking with fast motion, deformation, occlusion, etc. Under complex tracking conditions, most tracking models are not discriminative and adaptive enough. When the combined feature vectors are inputted to the visual models, this may lead to redundancy causing low efficiency and ambiguity causing poor performance. An effective tracking algorithm is proposed to decontaminate features for each video sequence adaptively, where the visual modeling is treated as an optimization problem from the perspective of evolution. Every feature vector is compared to a biological individual and then decontaminated via classical evolutionary algorithms. With the optimized subsets of features, the "curse of dimensionality" has been avoided while the accuracy of the visual model has been improved. The proposed algorithm has been tested on several publicly available datasets with various tracking challenges and benchmarked with a number of state-of-the-art approaches. The comprehensive experiments have demonstrated the efficacy of the proposed methodology.
NASA Astrophysics Data System (ADS)
Zhang, Ming; Xie, Fei; Zhao, Jing; Sun, Rui; Zhang, Lei; Zhang, Yue
2018-04-01
The prosperity of license plate recognition technology has made great contribution to the development of Intelligent Transport System (ITS). In this paper, a robust and efficient license plate recognition method is proposed which is based on a combined feature extraction model and BPNN (Back Propagation Neural Network) algorithm. Firstly, the candidate region of the license plate detection and segmentation method is developed. Secondly, a new feature extraction model is designed considering three sets of features combination. Thirdly, the license plates classification and recognition method using the combined feature model and BPNN algorithm is presented. Finally, the experimental results indicate that the license plate segmentation and recognition both can be achieved effectively by the proposed algorithm. Compared with three traditional methods, the recognition accuracy of the proposed method has increased to 95.7% and the consuming time has decreased to 51.4ms.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
Fast detection of vascular plaque in optical coherence tomography images using a reduced feature set
NASA Astrophysics Data System (ADS)
Prakash, Ammu; Ocana Macias, Mariano; Hewko, Mark; Sowa, Michael; Sherif, Sherif
2018-03-01
Optical coherence tomography (OCT) images are capable of detecting vascular plaque by using the full set of 26 Haralick textural features and a standard K-means clustering algorithm. However, the use of the full set of 26 textural features is computationally expensive and may not be feasible for real time implementation. In this work, we identified a reduced set of 3 textural feature which characterizes vascular plaque and used a generalized Fuzzy C-means clustering algorithm. Our work involves three steps: 1) the reduction of a full set 26 textural feature to a reduced set of 3 textural features by using genetic algorithm (GA) optimization method 2) the implementation of an unsupervised generalized clustering algorithm (Fuzzy C-means) on the reduced feature space, and 3) the validation of our results using histology and actual photographic images of vascular plaque. Our results show an excellent match with histology and actual photographic images of vascular tissue. Therefore, our results could provide an efficient pre-clinical tool for the detection of vascular plaque in real time OCT imaging.
Color Feature-Based Object Tracking through Particle Swarm Optimization with Improved Inertia Weight
Guo, Siqiu; Zhang, Tao; Song, Yulong
2018-01-01
This paper presents a particle swarm tracking algorithm with improved inertia weight based on color features. The weighted color histogram is used as the target feature to reduce the contribution of target edge pixels in the target feature, which makes the algorithm insensitive to the target non-rigid deformation, scale variation, and rotation. Meanwhile, the influence of partial obstruction on the description of target features is reduced. The particle swarm optimization algorithm can complete the multi-peak search, which can cope well with the object occlusion tracking problem. This means that the target is located precisely where the similarity function appears multi-peak. When the particle swarm optimization algorithm is applied to the object tracking, the inertia weight adjustment mechanism has some limitations. This paper presents an improved method. The concept of particle maturity is introduced to improve the inertia weight adjustment mechanism, which could adjust the inertia weight in time according to the different states of each particle in each generation. Experimental results show that our algorithm achieves state-of-the-art performance in a wide range of scenarios. PMID:29690610
Guo, Siqiu; Zhang, Tao; Song, Yulong; Qian, Feng
2018-04-23
This paper presents a particle swarm tracking algorithm with improved inertia weight based on color features. The weighted color histogram is used as the target feature to reduce the contribution of target edge pixels in the target feature, which makes the algorithm insensitive to the target non-rigid deformation, scale variation, and rotation. Meanwhile, the influence of partial obstruction on the description of target features is reduced. The particle swarm optimization algorithm can complete the multi-peak search, which can cope well with the object occlusion tracking problem. This means that the target is located precisely where the similarity function appears multi-peak. When the particle swarm optimization algorithm is applied to the object tracking, the inertia weight adjustment mechanism has some limitations. This paper presents an improved method. The concept of particle maturity is introduced to improve the inertia weight adjustment mechanism, which could adjust the inertia weight in time according to the different states of each particle in each generation. Experimental results show that our algorithm achieves state-of-the-art performance in a wide range of scenarios.
A Novel Real-Time Reference Key Frame Scan Matching Method.
Mohamed, Haytham; Moussa, Adel; Elhabiby, Mohamed; El-Sheimy, Naser; Sesay, Abu
2017-05-07
Unmanned aerial vehicles represent an effective technology for indoor search and rescue operations. Typically, most indoor missions' environments would be unknown, unstructured, and/or dynamic. Navigation of UAVs in such environments is addressed by simultaneous localization and mapping approach using either local or global approaches. Both approaches suffer from accumulated errors and high processing time due to the iterative nature of the scan matching method. Moreover, point-to-point scan matching is prone to outlier association processes. This paper proposes a low-cost novel method for 2D real-time scan matching based on a reference key frame (RKF). RKF is a hybrid scan matching technique comprised of feature-to-feature and point-to-point approaches. This algorithm aims at mitigating errors accumulation using the key frame technique, which is inspired from video streaming broadcast process. The algorithm depends on the iterative closest point algorithm during the lack of linear features which is typically exhibited in unstructured environments. The algorithm switches back to the RKF once linear features are detected. To validate and evaluate the algorithm, the mapping performance and time consumption are compared with various algorithms in static and dynamic environments. The performance of the algorithm exhibits promising navigational, mapping results and very short computational time, that indicates the potential use of the new algorithm with real-time systems.
An adaptive clustering algorithm for image matching based on corner feature
NASA Astrophysics Data System (ADS)
Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song
2018-04-01
The traditional image matching algorithm always can not balance the real-time and accuracy better, to solve the problem, an adaptive clustering algorithm for image matching based on corner feature is proposed in this paper. The method is based on the similarity of the matching pairs of vector pairs, and the adaptive clustering is performed on the matching point pairs. Harris corner detection is carried out first, the feature points of the reference image and the perceived image are extracted, and the feature points of the two images are first matched by Normalized Cross Correlation (NCC) function. Then, using the improved algorithm proposed in this paper, the matching results are clustered to reduce the ineffective operation and improve the matching speed and robustness. Finally, the Random Sample Consensus (RANSAC) algorithm is used to match the matching points after clustering. The experimental results show that the proposed algorithm can effectively eliminate the most wrong matching points while the correct matching points are retained, and improve the accuracy of RANSAC matching, reduce the computation load of whole matching process at the same time.
featsel: A framework for benchmarking of feature selection algorithms and cost functions
NASA Astrophysics Data System (ADS)
Reis, Marcelo S.; Estrela, Gustavo; Ferreira, Carlos Eduardo; Barrera, Junior
In this paper, we introduce featsel, a framework for benchmarking of feature selection algorithms and cost functions. This framework allows the user to deal with the search space as a Boolean lattice and has its core coded in C++ for computational efficiency purposes. Moreover, featsel includes Perl scripts to add new algorithms and/or cost functions, generate random instances, plot graphs and organize results into tables. Besides, this framework already comes with dozens of algorithms and cost functions for benchmarking experiments. We also provide illustrative examples, in which featsel outperforms the popular Weka workbench in feature selection procedures on data sets from the UCI Machine Learning Repository.
A graph-Laplacian-based feature extraction algorithm for neural spike sorting.
Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos
2009-01-01
Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.
Space Object Classification Using Fused Features of Time Series Data
NASA Astrophysics Data System (ADS)
Jia, B.; Pham, K. D.; Blasch, E.; Shen, D.; Wang, Z.; Chen, G.
In this paper, a fused feature vector consisting of raw time series and texture feature information is proposed for space object classification. The time series data includes historical orbit trajectories and asteroid light curves. The texture feature is derived from recurrence plots using Gabor filters for both unsupervised learning and supervised learning algorithms. The simulation results show that the classification algorithms using the fused feature vector achieve better performance than those using raw time series or texture features only.
Machine Learning for Big Data: A Study to Understand Limits at Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sukumar, Sreenivas R.; Del-Castillo-Negrete, Carlos Emilio
This report aims to empirically understand the limits of machine learning when applied to Big Data. We observe that recent innovations in being able to collect, access, organize, integrate, and query massive amounts of data from a wide variety of data sources have brought statistical data mining and machine learning under more scrutiny, evaluation and application for gleaning insights from the data than ever before. Much is expected from algorithms without understanding their limitations at scale while dealing with massive datasets. In that context, we pose and address the following questions How does a machine learning algorithm perform on measuresmore » such as accuracy and execution time with increasing sample size and feature dimensionality? Does training with more samples guarantee better accuracy? How many features to compute for a given problem? Do more features guarantee better accuracy? Do efforts to derive and calculate more features and train on larger samples worth the effort? As problems become more complex and traditional binary classification algorithms are replaced with multi-task, multi-class categorization algorithms do parallel learners perform better? What happens to the accuracy of the learning algorithm when trained to categorize multiple classes within the same feature space? Towards finding answers to these questions, we describe the design of an empirical study and present the results. We conclude with the following observations (i) accuracy of the learning algorithm increases with increasing sample size but saturates at a point, beyond which more samples do not contribute to better accuracy/learning, (ii) the richness of the feature space dictates performance - both accuracy and training time, (iii) increased dimensionality often reflected in better performance (higher accuracy in spite of longer training times) but the improvements are not commensurate the efforts for feature computation and training and (iv) accuracy of the learning algorithms drop significantly with multi-class learners training on the same feature matrix and (v) learning algorithms perform well when categories in labeled data are independent (i.e., no relationship or hierarchy exists among categories).« less
Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
Sweeney, Elizabeth M; Vogelstein, Joshua T; Cuzzocreo, Jennifer L; Calabresi, Peter A; Reich, Daniel S; Crainiceanu, Ciprian M; Shinohara, Russell T
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance.
Multi scales based sparse matrix spectral clustering image segmentation
NASA Astrophysics Data System (ADS)
Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin
2018-04-01
In image segmentation, spectral clustering algorithms have to adopt the appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instance is large, computational complexity and memory use of the algorithm will greatly increase. To solve these two problems, we proposed a new spectral clustering image segmentation algorithm based on multi scales and sparse matrix. We devised a new feature extraction method at first, then extracted the features of image on different scales, at last, using the feature information to construct sparse similarity matrix which can improve the operation efficiency. Compared with traditional spectral clustering algorithm, image segmentation experimental results show our algorithm have better degree of accuracy and robustness.
Recognition of fiducial marks applied to robotic systems. Thesis
NASA Technical Reports Server (NTRS)
Georges, Wayne D.
1991-01-01
The objective was to devise a method to determine the position and orientation of the links of a PUMA 560 using fiducial marks. As a result, it is necessary to design fiducial marks and a corresponding feature extraction algorithm. The marks used are composites of three basic shapes, a circle, an equilateral triangle and a square. Once a mark is imaged, it is thresholded and the borders of each shape are extracted. These borders are subsequently used in a feature extraction algorithm. Two feature extraction algorithms are used to determine which one produces the most reliable results. The first algorithm is based on moment invariants and the second is based on the discrete version of the psi-s curve of the boundary. The latter algorithm is clearly superior for this application.
Structural health monitoring feature design by genetic programming
NASA Astrophysics Data System (ADS)
Harvey, Dustin Y.; Todd, Michael D.
2014-09-01
Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and other high-capital or life-safety critical structures. Conventional data processing involves pre-processing and extraction of low-dimensional features from in situ time series measurements. The features are then input to a statistical pattern recognition algorithm to perform the relevant classification or regression task necessary to facilitate decisions by the SHM system. Traditional design of signal processing and feature extraction algorithms can be an expensive and time-consuming process requiring extensive system knowledge and domain expertise. Genetic programming, a heuristic program search method from evolutionary computation, was recently adapted by the authors to perform automated, data-driven design of signal processing and feature extraction algorithms for statistical pattern recognition applications. The proposed method, called Autofead, is particularly suitable to handle the challenges inherent in algorithm design for SHM problems where the manifestation of damage in structural response measurements is often unclear or unknown. Autofead mines a training database of response measurements to discover information-rich features specific to the problem at hand. This study provides experimental validation on three SHM applications including ultrasonic damage detection, bearing damage classification for rotating machinery, and vibration-based structural health monitoring. Performance comparisons with common feature choices for each problem area are provided demonstrating the versatility of Autofead to produce significant algorithm improvements on a wide range of problems.
NASA Astrophysics Data System (ADS)
Sheikhan, Mansour; Abbasnezhad Arabi, Mahdi; Gharavian, Davood
2015-10-01
Artificial neural networks are efficient models in pattern recognition applications, but their performance is dependent on employing suitable structure and connection weights. This study used a hybrid method for obtaining the optimal weight set and architecture of a recurrent neural emotion classifier based on gravitational search algorithm (GSA) and its binary version (BGSA), respectively. By considering the features of speech signal that were related to prosody, voice quality, and spectrum, a rich feature set was constructed. To select more efficient features, a fast feature selection method was employed. The performance of the proposed hybrid GSA-BGSA method was compared with similar hybrid methods based on particle swarm optimisation (PSO) algorithm and its binary version, PSO and discrete firefly algorithm, and hybrid of error back-propagation and genetic algorithm that were used for optimisation. Experimental tests on Berlin emotional database demonstrated the superior performance of the proposed method using a lighter network structure.
An algorithm for calculating minimum Euclidean distance between two geographic features
NASA Astrophysics Data System (ADS)
Peuquet, Donna J.
1992-09-01
An efficient algorithm is presented for determining the shortest Euclidean distance between two features of arbitrary shape that are represented in quadtree form. These features may be disjoint point sets, lines, or polygons. It is assumed that the features do not overlap. Features also may be intertwined and polygons may be complex (i.e. have holes). Utilizing a spatial divide-and-conquer approach inherent in the quadtree data model, the basic rationale is to narrow-in on portions of each feature quickly that are on a facing edge relative to the other feature, and to minimize the number of point-to-point Euclidean distance calculations that must be performed. Besides offering an efficient, grid-based alternative solution, another unique and useful aspect of the current algorithm is that is can be used for rapidly calculating distance approximations at coarser levels of resolution. The overall process can be viewed as a top-down parallel search. Using one list of leafcode addresses for each of the two features as input, the algorithm is implemented by successively dividing these lists into four sublists for each descendant quadrant. The algorithm consists of two primary phases. The first determines facing adjacent quadrant pairs where part or all of the two features are separated between the two quadrants, respectively. The second phase then determines the closest pixel-level subquadrant pairs within each facing quadrant pair at the lowest level. The key element of the second phase is a quick estimate distance heuristic for further elimination of locations that are not as near as neighboring locations.
Feature Selection Methods for Robust Decoding of Finger Movements in a Non-human Primate
Padmanaban, Subash; Baker, Justin; Greger, Bradley
2018-01-01
Objective: The performance of machine learning algorithms used for neural decoding of dexterous tasks may be impeded due to problems arising when dealing with high-dimensional data. The objective of feature selection algorithms is to choose a near-optimal subset of features from the original feature space to improve the performance of the decoding algorithm. The aim of our study was to compare the effects of four feature selection techniques, Wilcoxon signed-rank test, Relative Importance, Principal Component Analysis (PCA), and Mutual Information Maximization on SVM classification performance for a dexterous decoding task. Approach: A nonhuman primate (NHP) was trained to perform small coordinated movements—similar to typing. An array of microelectrodes was implanted in the hand area of the motor cortex of the NHP and used to record action potentials (AP) during finger movements. A Support Vector Machine (SVM) was used to classify which finger movement the NHP was making based upon AP firing rates. We used the SVM classification to examine the functional parameters of (i) robustness to simulated failure and (ii) longevity of classification. We also compared the effect of using isolated-neuron and multi-unit firing rates as the feature vector supplied to the SVM. Main results: The average decoding accuracy for multi-unit features and single-unit features using Mutual Information Maximization (MIM) across 47 sessions was 96.74 ± 3.5% and 97.65 ± 3.36% respectively. The reduction in decoding accuracy between using 100% of the features and 10% of features based on MIM was 45.56% (from 93.7 to 51.09%) and 4.75% (from 95.32 to 90.79%) for multi-unit and single-unit features respectively. MIM had best performance compared to other feature selection methods. Significance: These results suggest improved decoding performance can be achieved by using optimally selected features. The results based on clinically relevant performance metrics also suggest that the decoding algorithm can be made robust by using optimal features and feature selection algorithms. We believe that even a few percent increase in performance is important and improves the decoding accuracy of the machine learning algorithm potentially increasing the ease of use of a brain machine interface. PMID:29467602
Iris recognition based on key image feature extraction.
Ren, X; Tian, Q; Zhang, J; Wu, S; Zeng, Y
2008-01-01
In iris recognition, feature extraction can be influenced by factors such as illumination and contrast, and thus the features extracted may be unreliable, which can cause a high rate of false results in iris pattern recognition. In order to obtain stable features, an algorithm was proposed in this paper to extract key features of a pattern from multiple images. The proposed algorithm built an iris feature template by extracting key features and performed iris identity enrolment. Simulation results showed that the selected key features have high recognition accuracy on the CASIA Iris Set, where both contrast and illumination variance exist.
Alici, Ibrahim Onur; Yılmaz Demirci, Nilgün; Yılmaz, Aydın; Karakaya, Jale; Özaydın, Esra
2016-09-01
There are several papers on the sonographic features of mediastinal lymph nodes affected by several diseases, but none gives the importance and clinical utility of the features. In order to find out which lymph node should be sampled in a particular nodal station during endobronchial ultrasound, we investigated the diagnostic performances of certain sonographic features and proposed an algorithmic approach. We retrospectively analyzed 1051 lymph nodes and randomly assigned them into a preliminary experimental and a secondary study group. The diagnostic performances of the sonographic features (gray scale, echogeneity, shape, size, margin, presence of necrosis, presence of calcification and absence of central hilar structure) were calculated, and an algorithm for lymph node sampling was obtained with decision tree analysis in the experimental group. Later, a modified algorithm was applied to the patients in the study group to give the accuracy. The demographic characteristics of the patients were not statistically significant between the primary and the secondary groups. All of the features were discriminative between malignant and benign diseases. The modified algorithm sensitivity, specificity, and positive and negative predictive values and diagnostic accuracy for detecting metastatic lymph nodes were 100%, 51.2%, 50.6%, 100% and 67.5%, respectively. In this retrospective analysis, the standardized sonographic classification system and the proposed algorithm performed well in choosing the node that should be sampled in a particular station during endobronchial ultrasound. © 2015 John Wiley & Sons Ltd.
Solomon, Justin; Mileto, Achille; Nelson, Rendon C; Roy Choudhury, Kingshuk; Samei, Ehsan
2016-04-01
To determine if radiation dose and reconstruction algorithm affect the computer-based extraction and analysis of quantitative imaging features in lung nodules, liver lesions, and renal stones at multi-detector row computed tomography (CT). Retrospective analysis of data from a prospective, multicenter, HIPAA-compliant, institutional review board-approved clinical trial was performed by extracting 23 quantitative imaging features (size, shape, attenuation, edge sharpness, pixel value distribution, and texture) of lesions on multi-detector row CT images of 20 adult patients (14 men, six women; mean age, 63 years; range, 38-72 years) referred for known or suspected focal liver lesions, lung nodules, or kidney stones. Data were acquired between September 2011 and April 2012. All multi-detector row CT scans were performed at two different radiation dose levels; images were reconstructed with filtered back projection, adaptive statistical iterative reconstruction, and model-based iterative reconstruction (MBIR) algorithms. A linear mixed-effects model was used to assess the effect of radiation dose and reconstruction algorithm on extracted features. Among the 23 imaging features assessed, radiation dose had a significant effect on five, three, and four of the features for liver lesions, lung nodules, and renal stones, respectively (P < .002 for all comparisons). Adaptive statistical iterative reconstruction had a significant effect on three, one, and one of the features for liver lesions, lung nodules, and renal stones, respectively (P < .002 for all comparisons). MBIR reconstruction had a significant effect on nine, 11, and 15 of the features for liver lesions, lung nodules, and renal stones, respectively (P < .002 for all comparisons). Of note, the measured size of lung nodules and renal stones with MBIR was significantly different than those for the other two algorithms (P < .002 for all comparisons). Although lesion texture was significantly affected by the reconstruction algorithm used (average of 3.33 features affected by MBIR throughout lesion types; P < .002, for all comparisons), no significant effect of the radiation dose setting was observed for all but one of the texture features (P = .002-.998). Radiation dose settings and reconstruction algorithms affect the extraction and analysis of quantitative imaging features in lesions at multi-detector row CT.
Zhang, Xin; Cui, Jintian; Wang, Weisheng; Lin, Chao
2017-01-01
To address the problem of image texture feature extraction, a direction measure statistic that is based on the directionality of image texture is constructed, and a new method of texture feature extraction, which is based on the direction measure and a gray level co-occurrence matrix (GLCM) fusion algorithm, is proposed in this paper. This method applies the GLCM to extract the texture feature value of an image and integrates the weight factor that is introduced by the direction measure to obtain the final texture feature of an image. A set of classification experiments for the high-resolution remote sensing images were performed by using support vector machine (SVM) classifier with the direction measure and gray level co-occurrence matrix fusion algorithm. Both qualitative and quantitative approaches were applied to assess the classification results. The experimental results demonstrated that texture feature extraction based on the fusion algorithm achieved a better image recognition, and the accuracy of classification based on this method has been significantly improved. PMID:28640181
Feature engineering for MEDLINE citation categorization with MeSH.
Jimeno Yepes, Antonio Jose; Plaza, Laura; Carrillo-de-Albornoz, Jorge; Mork, James G; Aronson, Alan R
2015-04-08
Research in biomedical text categorization has mostly used the bag-of-words representation. Other more sophisticated representations of text based on syntactic, semantic and argumentative properties have been less studied. In this paper, we evaluate the impact of different text representations of biomedical texts as features for reproducing the MeSH annotations of some of the most frequent MeSH headings. In addition to unigrams and bigrams, these features include noun phrases, citation meta-data, citation structure, and semantic annotation of the citations. Traditional features like unigrams and bigrams exhibit strong performance compared to other feature sets. Little or no improvement is obtained when using meta-data or citation structure. Noun phrases are too sparse and thus have lower performance compared to more traditional features. Conceptual annotation of the texts by MetaMap shows similar performance compared to unigrams, but adding concepts from the UMLS taxonomy does not improve the performance of using only mapped concepts. The combination of all the features performs largely better than any individual feature set considered. In addition, this combination improves the performance of a state-of-the-art MeSH indexer. Concerning the machine learning algorithms, we find that those that are more resilient to class imbalance largely obtain better performance. We conclude that even though traditional features such as unigrams and bigrams have strong performance compared to other features, it is possible to combine them to effectively improve the performance of the bag-of-words representation. We have also found that the combination of the learning algorithm and feature sets has an influence in the overall performance of the system. Moreover, using learning algorithms resilient to class imbalance largely improves performance. However, when using a large set of features, consideration needs to be taken with algorithms due to the risk of over-fitting. Specific combinations of learning algorithms and features for individual MeSH headings could further increase the performance of an indexing system.
A Novel Real-Time Reference Key Frame Scan Matching Method
Mohamed, Haytham; Moussa, Adel; Elhabiby, Mohamed; El-Sheimy, Naser; Sesay, Abu
2017-01-01
Unmanned aerial vehicles represent an effective technology for indoor search and rescue operations. Typically, most indoor missions’ environments would be unknown, unstructured, and/or dynamic. Navigation of UAVs in such environments is addressed by simultaneous localization and mapping approach using either local or global approaches. Both approaches suffer from accumulated errors and high processing time due to the iterative nature of the scan matching method. Moreover, point-to-point scan matching is prone to outlier association processes. This paper proposes a low-cost novel method for 2D real-time scan matching based on a reference key frame (RKF). RKF is a hybrid scan matching technique comprised of feature-to-feature and point-to-point approaches. This algorithm aims at mitigating errors accumulation using the key frame technique, which is inspired from video streaming broadcast process. The algorithm depends on the iterative closest point algorithm during the lack of linear features which is typically exhibited in unstructured environments. The algorithm switches back to the RKF once linear features are detected. To validate and evaluate the algorithm, the mapping performance and time consumption are compared with various algorithms in static and dynamic environments. The performance of the algorithm exhibits promising navigational, mapping results and very short computational time, that indicates the potential use of the new algorithm with real-time systems. PMID:28481285
Gaussian mixture models-based ship target recognition algorithm in remote sensing infrared images
NASA Astrophysics Data System (ADS)
Yao, Shoukui; Qin, Xiaojuan
2018-02-01
Since the resolution of remote sensing infrared images is low, the features of ship targets become unstable. The issue of how to recognize ships with fuzzy features is an open problem. In this paper, we propose a novel ship target recognition algorithm based on Gaussian mixture models (GMMs). In the proposed algorithm, there are mainly two steps. At the first step, the Hu moments of these ship target images are calculated, and the GMMs are trained on the moment features of ships. At the second step, the moment feature of each ship image is assigned to the trained GMMs for recognition. Because of the scale, rotation, translation invariance property of Hu moments and the power feature-space description ability of GMMs, the GMMs-based ship target recognition algorithm can recognize ship reliably. Experimental results of a large simulating image set show that our approach is effective in distinguishing different ship types, and obtains a satisfactory ship recognition performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harvey, Neal R; Ruggiero, Christy E; Pawley, Norma H
2009-01-01
Detecting complex targets, such as facilities, in commercially available satellite imagery is a difficult problem that human analysts try to solve by applying world knowledge. Often there are known observables that can be extracted by pixel-level feature detectors that can assist in the facility detection process. Individually, each of these observables is not sufficient for an accurate and reliable detection, but in combination, these auxiliary observables may provide sufficient context for detection by a machine learning algorithm. We describe an approach for automatic detection of facilities that uses an automated feature extraction algorithm to extract auxiliary observables, and a semi-supervisedmore » assisted target recognition algorithm to then identify facilities of interest. We illustrate the approach using an example of finding schools in Quickbird image data of Albuquerque, New Mexico. We use Los Alamos National Laboratory's Genie Pro automated feature extraction algorithm to find a set of auxiliary features that should be useful in the search for schools, such as parking lots, large buildings, sports fields and residential areas and then combine these features using Genie Pro's assisted target recognition algorithm to learn a classifier that finds schools in the image data.« less
Curve Set Feature-Based Robust and Fast Pose Estimation Algorithm
Hashimoto, Koichi
2017-01-01
Bin picking refers to picking the randomly-piled objects from a bin for industrial production purposes, and robotic bin picking is always used in automated assembly lines. In order to achieve a higher productivity, a fast and robust pose estimation algorithm is necessary to recognize and localize the randomly-piled parts. This paper proposes a pose estimation algorithm for bin picking tasks using point cloud data. A novel descriptor Curve Set Feature (CSF) is proposed to describe a point by the surface fluctuation around this point and is also capable of evaluating poses. The Rotation Match Feature (RMF) is proposed to match CSF efficiently. The matching process combines the idea of the matching in 2D space of origin Point Pair Feature (PPF) algorithm with nearest neighbor search. A voxel-based pose verification method is introduced to evaluate the poses and proved to be more than 30-times faster than the kd-tree-based verification method. Our algorithm is evaluated against a large number of synthetic and real scenes and proven to be robust to noise, able to detect metal parts, more accurately and more than 10-times faster than PPF and Oriented, Unique and Repeatable (OUR)-Clustered Viewpoint Feature Histogram (CVFH). PMID:28771216
Epidermis area detection for immunofluorescence microscopy
NASA Astrophysics Data System (ADS)
Dovganich, Andrey; Krylov, Andrey; Nasonov, Andrey; Makhneva, Natalia
2018-04-01
We propose a novel image segmentation method for immunofluorescence microscopy images of skin tissue for the diagnosis of various skin diseases. The segmentation is based on machine learning algorithms. The feature vector is filled by three groups of features: statistical features, Laws' texture energy measures and local binary patterns. The images are preprocessed for better learning. Different machine learning algorithms have been used and the best results have been obtained with random forest algorithm. We use the proposed method to detect the epidermis region as a part of pemphigus diagnosis system.
A Spiking Neural Network in sEMG Feature Extraction.
Lobov, Sergey; Mironov, Vasiliy; Kastalskiy, Innokentiy; Kazantsev, Victor
2015-11-03
We have developed a novel algorithm for sEMG feature extraction and classification. It is based on a hybrid network composed of spiking and artificial neurons. The spiking neuron layer with mutual inhibition was assigned as feature extractor. We demonstrate that the classification accuracy of the proposed model could reach high values comparable with existing sEMG interface systems. Moreover, the algorithm sensibility for different sEMG collecting systems characteristics was estimated. Results showed rather equal accuracy, despite a significant sampling rate difference. The proposed algorithm was successfully tested for mobile robot control.
Facial Affect Recognition Using Regularized Discriminant Analysis-Based Algorithms
NASA Astrophysics Data System (ADS)
Lee, Chien-Cheng; Huang, Shin-Sheng; Shih, Cheng-Yuan
2010-12-01
This paper presents a novel and effective method for facial expression recognition including happiness, disgust, fear, anger, sadness, surprise, and neutral state. The proposed method utilizes a regularized discriminant analysis-based boosting algorithm (RDAB) with effective Gabor features to recognize the facial expressions. Entropy criterion is applied to select the effective Gabor feature which is a subset of informative and nonredundant Gabor features. The proposed RDAB algorithm uses RDA as a learner in the boosting algorithm. The RDA combines strengths of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). It solves the small sample size and ill-posed problems suffered from QDA and LDA through a regularization technique. Additionally, this study uses the particle swarm optimization (PSO) algorithm to estimate optimal parameters in RDA. Experiment results demonstrate that our approach can accurately and robustly recognize facial expressions.
Artificial bee colony algorithm for single-trial electroencephalogram analysis.
Hsu, Wei-Yen; Hu, Ya-Ping
2015-04-01
In this study, we propose an analysis system combined with feature selection to further improve the classification accuracy of single-trial electroencephalogram (EEG) data. Acquiring event-related brain potential data from the sensorimotor cortices, the system comprises artifact and background noise removal, feature extraction, feature selection, and feature classification. First, the artifacts and background noise are removed automatically by means of independent component analysis and surface Laplacian filter, respectively. Several potential features, such as band power, autoregressive model, and coherence and phase-locking value, are then extracted for subsequent classification. Next, artificial bee colony (ABC) algorithm is used to select features from the aforementioned feature combination. Finally, selected subfeatures are classified by support vector machine. Comparing with and without artifact removal and feature selection, using a genetic algorithm on single-trial EEG data for 6 subjects, the results indicate that the proposed system is promising and suitable for brain-computer interface applications. © EEG and Clinical Neuroscience Society (ECNS) 2014.
Tensor Rank Preserving Discriminant Analysis for Facial Recognition.
Tao, Dapeng; Guo, Yanan; Li, Yaotang; Gao, Xinbo
2017-10-12
Facial recognition, one of the basic topics in computer vision and pattern recognition, has received substantial attention in recent years. However, for those traditional facial recognition algorithms, the facial images are reshaped to a long vector, thereby losing part of the original spatial constraints of each pixel. In this paper, a new tensor-based feature extraction algorithm termed tensor rank preserving discriminant analysis (TRPDA) for facial image recognition is proposed; the proposed method involves two stages: in the first stage, the low-dimensional tensor subspace of the original input tensor samples was obtained; in the second stage, discriminative locality alignment was utilized to obtain the ultimate vector feature representation for subsequent facial recognition. On the one hand, the proposed TRPDA algorithm fully utilizes the natural structure of the input samples, and it applies an optimization criterion that can directly handle the tensor spectral analysis problem, thereby decreasing the computation cost compared those traditional tensor-based feature selection algorithms. On the other hand, the proposed TRPDA algorithm extracts feature by finding a tensor subspace that preserves most of the rank order information of the intra-class input samples. Experiments on the three facial databases are performed here to determine the effectiveness of the proposed TRPDA algorithm.
An Autonomous Star Identification Algorithm Based on One-Dimensional Vector Pattern for Star Sensors
Luo, Liyan; Xu, Luping; Zhang, Hua
2015-01-01
In order to enhance the robustness and accelerate the recognition speed of star identification, an autonomous star identification algorithm for star sensors is proposed based on the one-dimensional vector pattern (one_DVP). In the proposed algorithm, the space geometry information of the observed stars is used to form the one-dimensional vector pattern of the observed star. The one-dimensional vector pattern of the same observed star remains unchanged when the stellar image rotates, so the problem of star identification is simplified as the comparison of the two feature vectors. The one-dimensional vector pattern is adopted to build the feature vector of the star pattern, which makes it possible to identify the observed stars robustly. The characteristics of the feature vector and the proposed search strategy for the matching pattern make it possible to achieve the recognition result as quickly as possible. The simulation results demonstrate that the proposed algorithm can effectively accelerate the star identification. Moreover, the recognition accuracy and robustness by the proposed algorithm are better than those by the pyramid algorithm, the modified grid algorithm, and the LPT algorithm. The theoretical analysis and experimental results show that the proposed algorithm outperforms the other three star identification algorithms. PMID:26198233
Luo, Liyan; Xu, Luping; Zhang, Hua
2015-07-07
In order to enhance the robustness and accelerate the recognition speed of star identification, an autonomous star identification algorithm for star sensors is proposed based on the one-dimensional vector pattern (one_DVP). In the proposed algorithm, the space geometry information of the observed stars is used to form the one-dimensional vector pattern of the observed star. The one-dimensional vector pattern of the same observed star remains unchanged when the stellar image rotates, so the problem of star identification is simplified as the comparison of the two feature vectors. The one-dimensional vector pattern is adopted to build the feature vector of the star pattern, which makes it possible to identify the observed stars robustly. The characteristics of the feature vector and the proposed search strategy for the matching pattern make it possible to achieve the recognition result as quickly as possible. The simulation results demonstrate that the proposed algorithm can effectively accelerate the star identification. Moreover, the recognition accuracy and robustness by the proposed algorithm are better than those by the pyramid algorithm, the modified grid algorithm, and the LPT algorithm. The theoretical analysis and experimental results show that the proposed algorithm outperforms the other three star identification algorithms.
NASA Astrophysics Data System (ADS)
Wu, Yuanfeng; Gao, Lianru; Zhang, Bing; Zhao, Haina; Li, Jun
2014-01-01
We present a parallel implementation of the optimized maximum noise fraction (G-OMNF) transform algorithm for feature extraction of hyperspectral images on commodity graphics processing units (GPUs). The proposed approach explored the algorithm data-level concurrency and optimized the computing flow. We first defined a three-dimensional grid, in which each thread calculates a sub-block data to easily facilitate the spatial and spectral neighborhood data searches in noise estimation, which is one of the most important steps involved in OMNF. Then, we optimized the processing flow and computed the noise covariance matrix before computing the image covariance matrix to reduce the original hyperspectral image data transmission. These optimization strategies can greatly improve the computing efficiency and can be applied to other feature extraction algorithms. The proposed parallel feature extraction algorithm was implemented on an Nvidia Tesla GPU using the compute unified device architecture and basic linear algebra subroutines library. Through the experiments on several real hyperspectral images, our GPU parallel implementation provides a significant speedup of the algorithm compared with the CPU implementation, especially for highly data parallelizable and arithmetically intensive algorithm parts, such as noise estimation. In order to further evaluate the effectiveness of G-OMNF, we used two different applications: spectral unmixing and classification for evaluation. Considering the sensor scanning rate and the data acquisition time, the proposed parallel implementation met the on-board real-time feature extraction.
NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available. PMID:24667482
Evaluating some computer exhancement algorithms that improve the visibility of cometary morphology
NASA Technical Reports Server (NTRS)
Larson, Stephen M.; Slaughter, Charles D.
1992-01-01
Digital enhancement of cometary images is a necessary tool in studying cometary morphology. Many image processing algorithms, some developed specifically for comets, have been used to enhance the subtle, low contrast coma and tail features. We compare some of the most commonly used algorithms on two different images to evaluate their strong and weak points, and conclude that there currently exists no single 'ideal' algorithm, although the radial gradient spatial filter gives the best overall result. This comparison should aid users in selecting the best algorithm to enhance particular features of interest.
Vision-Aided Inertial Navigation
NASA Technical Reports Server (NTRS)
Roumeliotis, Stergios I. (Inventor); Mourikis, Anastasios I. (Inventor)
2017-01-01
This document discloses, among other things, a system and method for implementing an algorithm to determine pose, velocity, acceleration or other navigation information using feature tracking data. The algorithm has computational complexity that is linear with the number of features tracked.
An OMIC biomarker detection algorithm TriVote and its application in methylomic biomarker detection.
Xu, Cheng; Liu, Jiamei; Yang, Weifeng; Shu, Yayun; Wei, Zhipeng; Zheng, Weiwei; Feng, Xin; Zhou, Fengfeng
2018-04-01
Transcriptomic and methylomic patterns represent two major OMIC data sources impacted by both inheritable genetic information and environmental factors, and have been widely used as disease diagnosis and prognosis biomarkers. Modern transcriptomic and methylomic profiling technologies detect the status of tens of thousands or even millions of probing residues in the human genome, and introduce a major computational challenge for the existing feature selection algorithms. This study proposes a three-step feature selection algorithm, TriVote, to detect a subset of transcriptomic or methylomic residues with highly accurate binary classification performance. TriVote outperforms both filter and wrapper feature selection algorithms with both higher classification accuracy and smaller feature number on 17 transcriptomes and two methylomes. Biological functions of the methylome biomarkers detected by TriVote were discussed for their disease associations. An easy-to-use Python package is also released to facilitate the further applications.
A Novel Robot Visual Homing Method Based on SIFT Features
Zhu, Qidan; Liu, Chuanjia; Cai, Chengtao
2015-01-01
Warping is an effective visual homing method for robot local navigation. However, the performance of the warping method can be greatly influenced by the changes of the environment in a real scene, thus resulting in lower accuracy. In order to solve the above problem and to get higher homing precision, a novel robot visual homing algorithm is proposed by combining SIFT (scale-invariant feature transform) features with the warping method. The algorithm is novel in using SIFT features as landmarks instead of the pixels in the horizon region of the panoramic image. In addition, to further improve the matching accuracy of landmarks in the homing algorithm, a novel mismatching elimination algorithm, based on the distribution characteristics of landmarks in the catadioptric panoramic image, is proposed. Experiments on image databases and on a real scene confirm the effectiveness of the proposed method. PMID:26473880
Chen, Yuantao; Xu, Weihong; Kuang, Fangjun; Gao, Shangbing
2013-01-01
The efficient target tracking algorithm researches have become current research focus of intelligent robots. The main problems of target tracking process in mobile robot face environmental uncertainty. They are very difficult to estimate the target states, illumination change, target shape changes, complex backgrounds, and other factors and all affect the occlusion in tracking robustness. To further improve the target tracking's accuracy and reliability, we present a novel target tracking algorithm to use visual saliency and adaptive support vector machine (ASVM). Furthermore, the paper's algorithm has been based on the mixture saliency of image features. These features include color, brightness, and sport feature. The execution process used visual saliency features and those common characteristics have been expressed as the target's saliency. Numerous experiments demonstrate the effectiveness and timeliness of the proposed target tracking algorithm in video sequences where the target objects undergo large changes in pose, scale, and illumination.
Minimalist ensemble algorithms for genome-wide protein localization prediction.
Lin, Jhih-Rong; Mondal, Ananda Mohan; Liu, Rong; Hu, Jianjun
2012-07-03
Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.
Minimalist ensemble algorithms for genome-wide protein localization prediction
2012-01-01
Background Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. Results This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. Conclusions We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi. PMID:22759391
Liu, Aiming; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi
2017-01-01
Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain–computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain–computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain–computer interface systems. PMID:29117100
Liu, Aiming; Chen, Kun; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi
2017-11-08
Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain-computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain-computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain-computer interface systems.
SU-E-I-01: Iterative CBCT Reconstruction with a Feature-Preserving Penalty
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lyu, Q; Li, B; Southern Medical University, Guangzhou
2015-06-15
Purpose: Low-dose CBCT is desired in various clinical applications. Iterative image reconstruction algorithms have shown advantages in suppressing noise in low-dose CBCT. However, due to the smoothness constraint enforced during the reconstruction process, edges may be blurred and image features may lose in the reconstructed image. In this work, we proposed a new penalty design to preserve image features in the image reconstructed by iterative algorithms. Methods: Low-dose CBCT is reconstructed by minimizing the penalized weighted least-squares (PWLS) objective function. Binary Robust Independent Elementary Features (BRIEF) of the image were integrated into the penalty of PWLS. BRIEF is a generalmore » purpose point descriptor that can be used to identify important features of an image. In this work, BRIEF distance of two neighboring pixels was used to weigh the smoothing parameter in PWLS. For pixels of large BRIEF distance, weaker smooth constraint will be enforced. Image features will be better preserved through such a design. The performance of the PWLS algorithm with BRIEF penalty was evaluated by a CatPhan 600 phantom. Results: The image quality reconstructed by the proposed PWLS-BRIEF algorithm is superior to that by the conventional PWLS method and the standard FDK method. At matched noise level, edges in PWLS-BRIEF reconstructed image are better preserved. Conclusion: This study demonstrated that the proposed PWLS-BRIEF algorithm has great potential on preserving image features in low-dose CBCT.« less
Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor
Alamedine, D.; Khalil, M.; Marque, C.
2013-01-01
Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The two last methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536
Sequential structural damage diagnosis algorithm using a change point detection method
NASA Astrophysics Data System (ADS)
Noh, H.; Rajagopal, R.; Kiremidjian, A. S.
2013-11-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method. The general change point detection method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori, unless we are looking for a known specific type of damage. Therefore, we introduce an additional algorithm that estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using a set of experimental data collected from a four-story steel special moment-resisting frame and multiple sets of simulated data. Various features of different dimensions have been explored, and the algorithm was able to identify damage, particularly when it uses multidimensional damage sensitive features and lower false alarm rates, with a known post-damage feature distribution. For unknown feature distribution cases, the post-damage distribution was consistently estimated and the detection delays were only a few time steps longer than the delays from the general method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
Nurmohamadi, Maryam; Pourghassem, Hossein
2014-05-01
The utilization of antibiotics produced by Clavulanic acid (CA) is an increasing need in medicine and industry. Usually, the CA is created from the fermentation of Streptomycen Clavuligerus (SC) bacteria. Analysis of visual and morphological features of SC bacteria is an appropriate measure to estimate the growth of CA. In this paper, an automatic and fast CA production level estimation algorithm based on visual and structural features of SC bacteria instead of statistical methods and experimental evaluation by microbiologist is proposed. In this algorithm, structural features such as the number of newborn branches, thickness of hyphal and bacterial density and also color features such as acceptance color levels are extracted from the SC bacteria. Moreover, PH and biomass of the medium provided by microbiologists are considered as specified features. The level of CA production is estimated by using a new application of Self-Organizing Map (SOM), and a hybrid model of genetic algorithm with back propagation network (GA-BPN). The proposed algorithm is evaluated on four carbonic resources including malt, starch, wheat flour and glycerol that had used as different mediums of bacterial growth. Then, the obtained results are compared and evaluated with observation of specialist. Finally, the Relative Error (RE) for the SOM and GA-BPN are achieved 14.97% and 16.63%, respectively. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
HYBRID FAST HANKEL TRANSFORM ALGORITHM FOR ELECTROMAGNETIC MODELING
A hybrid fast Hankel transform algorithm has been developed that uses several complementary features of two existing algorithms: Anderson's digital filtering or fast Hankel transform (FHT) algorithm and Chave's quadrature and continued fraction algorithm. A hybrid FHT subprogram ...
An Improved Vision-based Algorithm for Unmanned Aerial Vehicles Autonomous Landing
NASA Astrophysics Data System (ADS)
Zhao, Yunji; Pei, Hailong
In vision-based autonomous landing system of UAV, the efficiency of target detecting and tracking will directly affect the control system. The improved algorithm of SURF(Speed Up Robust Features) will resolve the problem which is the inefficiency of the SURF algorithm in the autonomous landing system. The improved algorithm is composed of three steps: first, detect the region of the target using the Camshift; second, detect the feature points in the region of the above acquired using the SURF algorithm; third, do the matching between the template target and the region of target in frame. The results of experiment and theoretical analysis testify the efficiency of the algorithm.
Kesharaju, Manasa; Nagarajah, Romesh
2015-09-01
The motivation for this research stems from a need for providing a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make possible the checking of each ceramic component and immediately alert the operator about the presence of defects. Generally, in many classification problems a choice of features or dimensionality reduction is significant and simultaneously very difficult, as a substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms are used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially wavelet based feature extraction is implemented from the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to improved performance in terms of classification percentage with 96% than Genetic algorithm with 94%. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Shahamatnia, Ehsan; Dorotovič, Ivan; Fonseca, Jose M.; Ribeiro, Rita A.
2016-03-01
Developing specialized software tools is essential to support studies of solar activity evolution. With new space missions such as Solar Dynamics Observatory (SDO), solar images are being produced in unprecedented volumes. To capitalize on that huge data availability, the scientific community needs a new generation of software tools for automatic and efficient data processing. In this paper a prototype of a modular framework for solar feature detection, characterization, and tracking is presented. To develop an efficient system capable of automatic solar feature tracking and measuring, a hybrid approach combining specialized image processing, evolutionary optimization, and soft computing algorithms is being followed. The specialized hybrid algorithm for tracking solar features allows automatic feature tracking while gathering characterization details about the tracked features. The hybrid algorithm takes advantages of the snake model, a specialized image processing algorithm widely used in applications such as boundary delineation, image segmentation, and object tracking. Further, it exploits the flexibility and efficiency of Particle Swarm Optimization (PSO), a stochastic population based optimization algorithm. PSO has been used successfully in a wide range of applications including combinatorial optimization, control, clustering, robotics, scheduling, and image processing and video analysis applications. The proposed tool, denoted PSO-Snake model, was already successfully tested in other works for tracking sunspots and coronal bright points. In this work, we discuss the application of the PSO-Snake algorithm for calculating the sidereal rotational angular velocity of the solar corona. To validate the results we compare them with published manual results performed by an expert.
Raghunathan, Shriram; Gupta, Sumeet K; Markandeya, Himanshu S; Roy, Kaushik; Irazoqui, Pedro P
2010-10-30
Implantable neural prostheses that deliver focal electrical stimulation upon demand are rapidly emerging as an alternate therapy for roughly a third of the epileptic patient population that is medically refractory. Seizure detection algorithms enable feedback mechanisms to provide focally and temporally specific intervention. Real-time feasibility and computational complexity often limit most reported detection algorithms to implementations using computers for bedside monitoring or external devices communicating with the implanted electrodes. A comparison of algorithms based on detection efficacy does not present a complete picture of the feasibility of the algorithm with limited computational power, as is the case with most battery-powered applications. We present a two-dimensional design optimization approach that takes into account both detection efficacy and hardware cost in evaluating algorithms for their feasibility in an implantable application. Detection features are first compared for their ability to detect electrographic seizures from micro-electrode data recorded from kainate-treated rats. Circuit models are then used to estimate the dynamic and leakage power consumption of the compared features. A score is assigned based on detection efficacy and the hardware cost for each of the features, then plotted on a two-dimensional design space. An optimal combination of compared features is used to construct an algorithm that provides maximal detection efficacy per unit hardware cost. The methods presented in this paper would facilitate the development of a common platform to benchmark seizure detection algorithms for comparison and feasibility analysis in the next generation of implantable neuroprosthetic devices to treat epilepsy. Copyright © 2010 Elsevier B.V. All rights reserved.
Applicability Analysis of Cloth Simulation Filtering Algorithm for Mobile LIDAR Point Cloud
NASA Astrophysics Data System (ADS)
Cai, S.; Zhang, W.; Qi, J.; Wan, P.; Shao, J.; Shen, A.
2018-04-01
Classifying the original point clouds into ground and non-ground points is a key step in LiDAR (light detection and ranging) data post-processing. Cloth simulation filtering (CSF) algorithm, which based on a physical process, has been validated to be an accurate, automatic and easy-to-use algorithm for airborne LiDAR point cloud. As a new technique of three-dimensional data collection, the mobile laser scanning (MLS) has been gradually applied in various fields, such as reconstruction of digital terrain models (DTM), 3D building modeling and forest inventory and management. Compared with airborne LiDAR point cloud, there are some different features (such as point density feature, distribution feature and complexity feature) for mobile LiDAR point cloud. Some filtering algorithms for airborne LiDAR data were directly used in mobile LiDAR point cloud, but it did not give satisfactory results. In this paper, we explore the ability of the CSF algorithm for mobile LiDAR point cloud. Three samples with different shape of the terrain are selected to test the performance of this algorithm, which respectively yields total errors of 0.44 %, 0.77 % and1.20 %. Additionally, large area dataset is also tested to further validate the effectiveness of this algorithm, and results show that it can quickly and accurately separate point clouds into ground and non-ground points. In summary, this algorithm is efficient and reliable for mobile LiDAR point cloud.
Li, Kejia; Warren, Steve; Natarajan, Balasubramaniam
2012-02-01
Onboard assessment of photoplethysmogram (PPG) quality could reduce unnecessary data transmission on battery-powered wireless pulse oximeters and improve the viability of the electronic patient records to which these data are stored. These algorithms show promise to increase the intelligence level of former "dumb" medical devices: devices that acquire and forward data but leave data interpretation to the clinician or host system. To this end, the authors have developed a unique onboard feature detection algorithm to assess the quality of PPGs acquired with a custom reflectance mode, wireless pulse oximeter. The algorithm uses a Bayesian hypothesis testing method to analyze four features extracted from raw and decimated PPG data in order to determine whether the original data comprise valid PPG waveforms or whether they are corrupted by motion or other environmental influences. Based on these results, the algorithm further calculates heart rate and blood oxygen saturation from a "compact representation" structure. PPG data were collected from 47 subjects to train the feature detection algorithm and to gauge their performance. A MATLAB interface was also developed to visualize the features extracted, the algorithm flow, and the decision results, where all algorithm-related parameters and decisions were ascertained on the wireless unit prior to transmission. For the data sets acquired here, the algorithm was 99% effective in identifying clean, usable PPGs versus nonsaturated data that did not demonstrate meaningful pulsatile waveshapes, PPGs corrupted by motion artifact, and data affected by signal saturation.
A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.
Baur, Brittany; Bozdag, Serdar
2016-01-01
DNA methylation is an important epigenetic event that effects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
Handwritten digits recognition based on immune network
NASA Astrophysics Data System (ADS)
Li, Yangyang; Wu, Yunhui; Jiao, Lc; Wu, Jianshe
2011-11-01
With the development of society, handwritten digits recognition technique has been widely applied to production and daily life. It is a very difficult task to solve these problems in the field of pattern recognition. In this paper, a new method is presented for handwritten digit recognition. The digit samples firstly are processed and features extraction. Based on these features, a novel immune network classification algorithm is designed and implemented to the handwritten digits recognition. The proposed algorithm is developed by Jerne's immune network model for feature selection and KNN method for classification. Its characteristic is the novel network with parallel commutating and learning. The performance of the proposed method is experimented to the handwritten number datasets MNIST and compared with some other recognition algorithms-KNN, ANN and SVM algorithm. The result shows that the novel classification algorithm based on immune network gives promising performance and stable behavior for handwritten digits recognition.
Study on Huizhou architecture of point cloud registration based on optimized ICP algorithm
NASA Astrophysics Data System (ADS)
Zhang, Runmei; Wu, Yulu; Zhang, Guangbin; Zhou, Wei; Tao, Yuqian
2018-03-01
In view of the current point cloud registration software has high hardware requirements, heavy workload and moltiple interactive definition, the source of software with better processing effect is not open, a two--step registration method based on normal vector distribution feature and coarse feature based iterative closest point (ICP) algorithm is proposed in this paper. This method combines fast point feature histogram (FPFH) algorithm, define the adjacency region of point cloud and the calculation model of the distribution of normal vectors, setting up the local coordinate system for each key point, and obtaining the transformation matrix to finish rough registration, the rough registration results of two stations are accurately registered by using the ICP algorithm. Experimental results show that, compared with the traditional ICP algorithm, the method used in this paper has obvious time and precision advantages for large amount of point clouds.
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo
2015-05-01
An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
Automated real-time search and analysis algorithms for a non-contact 3D profiling system
NASA Astrophysics Data System (ADS)
Haynes, Mark; Wu, Chih-Hang John; Beck, B. Terry; Peterman, Robert J.
2013-04-01
The purpose of this research is to develop a new means of identifying and extracting geometrical feature statistics from a non-contact precision-measurement 3D profilometer. Autonomous algorithms have been developed to search through large-scale Cartesian point clouds to identify and extract geometrical features. These algorithms are developed with the intent of providing real-time production quality control of cold-rolled steel wires. The steel wires in question are prestressing steel reinforcement wires for concrete members. The geometry of the wire is critical in the performance of the overall concrete structure. For this research a custom 3D non-contact profilometry system has been developed that utilizes laser displacement sensors for submicron resolution surface profiling. Optimizations in the control and sensory system allow for data points to be collected at up to an approximate 400,000 points per second. In order to achieve geometrical feature extraction and tolerancing with this large volume of data, the algorithms employed are optimized for parsing large data quantities. The methods used provide a unique means of maintaining high resolution data of the surface profiles while keeping algorithm running times within practical bounds for industrial application. By a combination of regional sampling, iterative search, spatial filtering, frequency filtering, spatial clustering, and template matching a robust feature identification method has been developed. These algorithms provide an autonomous means of verifying tolerances in geometrical features. The key method of identifying the features is through a combination of downhill simplex and geometrical feature templates. By performing downhill simplex through several procedural programming layers of different search and filtering techniques, very specific geometrical features can be identified within the point cloud and analyzed for proper tolerancing. Being able to perform this quality control in real time provides significant opportunities in cost savings in both equipment protection and waste minimization.
Infrared vehicle recognition using unsupervised feature learning based on K-feature
NASA Astrophysics Data System (ADS)
Lin, Jin; Tan, Yihua; Xia, Haijiao; Tian, Jinwen
2018-02-01
Subject to the complex battlefield environment, it is difficult to establish a complete knowledge base in practical application of vehicle recognition algorithms. The infrared vehicle recognition is always difficult and challenging, which plays an important role in remote sensing. In this paper we propose a new unsupervised feature learning method based on K-feature to recognize vehicle in infrared images. First, we use the target detection algorithm which is based on the saliency to detect the initial image. Then, the unsupervised feature learning based on K-feature, which is generated by Kmeans clustering algorithm that extracted features by learning a visual dictionary from a large number of samples without label, is calculated to suppress the false alarm and improve the accuracy. Finally, the vehicle target recognition image is finished by some post-processing. Large numbers of experiments demonstrate that the proposed method has satisfy recognition effectiveness and robustness for vehicle recognition in infrared images under complex backgrounds, and it also improve the reliability of it.
Blood vessel segmentation in color fundus images based on regional and Hessian features.
Shah, Syed Ayaz Ali; Tang, Tong Boon; Faye, Ibrahima; Laude, Augustinus
2017-08-01
To propose a new algorithm of blood vessel segmentation based on regional and Hessian features for image analysis in retinal abnormality diagnosis. Firstly, color fundus images from the publicly available database DRIVE were converted from RGB to grayscale. To enhance the contrast of the dark objects (blood vessels) against the background, the dot product of the grayscale image with itself was generated. To rectify the variation in contrast, we used a 5 × 5 window filter on each pixel. Based on 5 regional features, 1 intensity feature and 2 Hessian features per scale using 9 scales, we extracted a total of 24 features. A linear minimum squared error (LMSE) classifier was trained to classify each pixel into a vessel or non-vessel pixel. The DRIVE dataset provided 20 training and 20 test color fundus images. The proposed algorithm achieves a sensitivity of 72.05% with 94.79% accuracy. Our proposed algorithm achieved higher accuracy (0.9206) at the peripapillary region, where the ocular manifestations in the microvasculature due to glaucoma, central retinal vein occlusion, etc. are most obvious. This supports the proposed algorithm as a strong candidate for automated vessel segmentation.
Ant-cuckoo colony optimization for feature selection in digital mammogram.
Jona, J B; Nagaveni, N
2014-01-15
Digital mammogram is the only effective screening method to detect the breast cancer. Gray Level Co-occurrence Matrix (GLCM) textural features are extracted from the mammogram. All the features are not essential to detect the mammogram. Therefore identifying the relevant feature is the aim of this work. Feature selection improves the classification rate and accuracy of any classifier. In this study, a new hybrid metaheuristic named Ant-Cuckoo Colony Optimization a hybrid of Ant Colony Optimization (ACO) and Cuckoo Search (CS) is proposed for feature selection in Digital Mammogram. ACO is a good metaheuristic optimization technique but the drawback of this algorithm is that the ant will walk through the path where the pheromone density is high which makes the whole process slow hence CS is employed to carry out the local search of ACO. Support Vector Machine (SVM) classifier with Radial Basis Kernal Function (RBF) is done along with the ACO to classify the normal mammogram from the abnormal mammogram. Experiments are conducted in miniMIAS database. The performance of the new hybrid algorithm is compared with the ACO and PSO algorithm. The results show that the hybrid Ant-Cuckoo Colony Optimization algorithm is more accurate than the other techniques.
A Probabilistic Feature Map-Based Localization System Using a Monocular Camera.
Kim, Hyungjin; Lee, Donghwa; Oh, Taekjun; Choi, Hyun-Taek; Myung, Hyun
2015-08-31
Image-based localization is one of the most widely researched localization techniques in the robotics and computer vision communities. As enormous image data sets are provided through the Internet, many studies on estimating a location with a pre-built image-based 3D map have been conducted. Most research groups use numerous image data sets that contain sufficient features. In contrast, this paper focuses on image-based localization in the case of insufficient images and features. A more accurate localization method is proposed based on a probabilistic map using 3D-to-2D matching correspondences between a map and a query image. The probabilistic feature map is generated in advance by probabilistic modeling of the sensor system as well as the uncertainties of camera poses. Using the conventional PnP algorithm, an initial camera pose is estimated on the probabilistic feature map. The proposed algorithm is optimized from the initial pose by minimizing Mahalanobis distance errors between features from the query image and the map to improve accuracy. To verify that the localization accuracy is improved, the proposed algorithm is compared with the conventional algorithm in a simulation and realenvironments.
A Probabilistic Feature Map-Based Localization System Using a Monocular Camera
Kim, Hyungjin; Lee, Donghwa; Oh, Taekjun; Choi, Hyun-Taek; Myung, Hyun
2015-01-01
Image-based localization is one of the most widely researched localization techniques in the robotics and computer vision communities. As enormous image data sets are provided through the Internet, many studies on estimating a location with a pre-built image-based 3D map have been conducted. Most research groups use numerous image data sets that contain sufficient features. In contrast, this paper focuses on image-based localization in the case of insufficient images and features. A more accurate localization method is proposed based on a probabilistic map using 3D-to-2D matching correspondences between a map and a query image. The probabilistic feature map is generated in advance by probabilistic modeling of the sensor system as well as the uncertainties of camera poses. Using the conventional PnP algorithm, an initial camera pose is estimated on the probabilistic feature map. The proposed algorithm is optimized from the initial pose by minimizing Mahalanobis distance errors between features from the query image and the map to improve accuracy. To verify that the localization accuracy is improved, the proposed algorithm is compared with the conventional algorithm in a simulation and realenvironments. PMID:26404284
Retinal image quality assessment based on image clarity and content
NASA Astrophysics Data System (ADS)
Abdel-Hamid, Lamiaa; El-Rafei, Ahmed; El-Ramly, Salwa; Michelson, Georg; Hornegger, Joachim
2016-09-01
Retinal image quality assessment (RIQA) is an essential step in automated screening systems to avoid misdiagnosis caused by processing poor quality retinal images. A no-reference transform-based RIQA algorithm is introduced that assesses images based on five clarity and content quality issues: sharpness, illumination, homogeneity, field definition, and content. Transform-based RIQA algorithms have the advantage of considering retinal structures while being computationally inexpensive. Wavelet-based features are proposed to evaluate the sharpness and overall illumination of the images. A retinal saturation channel is designed and used along with wavelet-based features for homogeneity assessment. The presented sharpness and illumination features are utilized to assure adequate field definition, whereas color information is used to exclude nonretinal images. Several publicly available datasets of varying quality grades are utilized to evaluate the feature sets resulting in area under the receiver operating characteristic curve above 0.99 for each of the individual feature sets. The overall quality is assessed by a classifier that uses the collective features as an input vector. The classification results show superior performance of the algorithm in comparison to other methods from literature. Moreover, the algorithm addresses efficiently and comprehensively various quality issues and is suitable for automatic screening systems.
Nie, Haitao; Long, Kehui; Ma, Jun; Yue, Dan; Liu, Jinguo
2015-01-01
Partial occlusions, large pose variations, and extreme ambient illumination conditions generally cause the performance degradation of object recognition systems. Therefore, this paper presents a novel approach for fast and robust object recognition in cluttered scenes based on an improved scale invariant feature transform (SIFT) algorithm and a fuzzy closed-loop control method. First, a fast SIFT algorithm is proposed by classifying SIFT features into several clusters based on several attributes computed from the sub-orientation histogram (SOH), in the feature matching phase only features that share nearly the same corresponding attributes are compared. Second, a feature matching step is performed following a prioritized order based on the scale factor, which is calculated between the object image and the target object image, guaranteeing robust feature matching. Finally, a fuzzy closed-loop control strategy is applied to increase the accuracy of the object recognition and is essential for autonomous object manipulation process. Compared to the original SIFT algorithm for object recognition, the result of the proposed method shows that the number of SIFT features extracted from an object has a significant increase, and the computing speed of the object recognition processes increases by more than 40%. The experimental results confirmed that the proposed method performs effectively and accurately in cluttered scenes. PMID:25714094
Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance
NASA Astrophysics Data System (ADS)
Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi
2017-11-01
K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we presented a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing Hamming distance between testing sample and each feature vector in the training set. Taking advantage of this method, we realized a good analog for classical KNN algorithm by setting a distance threshold value t to select k - n e a r e s t neighbors. As a result, QKNN achieves O( n 3) performance which is only relevant to the dimension of feature vectors and high classification accuracy, outperforms Llyod's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
Influence of time and length size feature selections for human activity sequences recognition.
Fang, Hongqing; Chen, Long; Srinivasan, Raghavendiran
2014-01-01
In this paper, Viterbi algorithm based on a hidden Markov model is applied to recognize activity sequences from observed sensors events. Alternative features selections of time feature values of sensors events and activity length size feature values are tested, respectively, and then the results of activity sequences recognition performances of Viterbi algorithm are evaluated. The results show that the selection of larger time feature values of sensor events and/or smaller activity length size feature values will generate relatively better results on the activity sequences recognition performances. © 2013 ISA Published by ISA All rights reserved.
Research on Palmprint Identification Method Based on Quantum Algorithms
Zhang, Zhanzhan
2014-01-01
Quantum image recognition is a technology by using quantum algorithm to process the image information. It can obtain better effect than classical algorithm. In this paper, four different quantum algorithms are used in the three stages of palmprint recognition. First, quantum adaptive median filtering algorithm is presented in palmprint filtering processing. Quantum filtering algorithm can get a better filtering result than classical algorithm through the comparison. Next, quantum Fourier transform (QFT) is used to extract pattern features by only one operation due to quantum parallelism. The proposed algorithm exhibits an exponential speed-up compared with discrete Fourier transform in the feature extraction. Finally, quantum set operations and Grover algorithm are used in palmprint matching. According to the experimental results, quantum algorithm only needs to apply square of N operations to find out the target palmprint, but the traditional method needs N times of calculation. At the same time, the matching accuracy of quantum algorithm is almost 100%. PMID:25105165
A novel approach for dimension reduction of microarray.
Aziz, Rabia; Verma, C K; Srivastava, Namita
2017-12-01
This paper proposes a new hybrid search technique for feature (gene) selection (FS) using Independent component analysis (ICA) and Artificial Bee Colony (ABC) called ICA+ABC, to select informative genes based on a Naïve Bayes (NB) algorithm. An important trait of this technique is the optimization of ICA feature vector using ABC. ICA+ABC is a hybrid search algorithm that combines the benefits of extraction approach, to reduce the size of data and wrapper approach, to optimize the reduced feature vectors. This hybrid search technique is facilitated by evaluating the performance of ICA+ABC on six standard gene expression datasets of classification. Extensive experiments were conducted to compare the performance of ICA+ABC with the results obtained from recently published Minimum Redundancy Maximum Relevance (mRMR) +ABC algorithm for NB classifier. Also to check the performance that how ICA+ABC works as feature selection with NB classifier, compared the combination of ICA with popular filter techniques and with other similar bio inspired algorithm such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The result shows that ICA+ABC has a significant ability to generate small subsets of genes from the ICA feature vector, that significantly improve the classification accuracy of NB classifier compared to other previously suggested methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
3D model retrieval method based on mesh segmentation
NASA Astrophysics Data System (ADS)
Gan, Yuanchao; Tang, Yan; Zhang, Qingchen
2012-04-01
In the process of feature description and extraction, current 3D model retrieval algorithms focus on the global features of 3D models but ignore the combination of global and local features of the model. For this reason, they show less effective performance to the models with similar global shape and different local shape. This paper proposes a novel algorithm for 3D model retrieval based on mesh segmentation. The key idea is to exact the structure feature and the local shape feature of 3D models, and then to compares the similarities of the two characteristics and the total similarity between the models. A system that realizes this approach was built and tested on a database of 200 objects and achieves expected results. The results show that the proposed algorithm improves the precision and the recall rate effectively.
Zafar, Raheel; Dass, Sarat C; Malik, Aamir Saeed
2017-01-01
Electroencephalogram (EEG)-based decoding human brain activity is challenging, owing to the low spatial resolution of EEG. However, EEG is an important technique, especially for brain-computer interface applications. In this study, a novel algorithm is proposed to decode brain activity associated with different types of images. In this hybrid algorithm, convolutional neural network is modified for the extraction of features, a t-test is used for the selection of significant features and likelihood ratio-based score fusion is used for the prediction of brain activity. The proposed algorithm takes input data from multichannel EEG time-series, which is also known as multivariate pattern analysis. Comprehensive analysis was conducted using data from 30 participants. The results from the proposed method are compared with current recognized feature extraction and classification/prediction techniques. The wavelet transform-support vector machine method is the most popular currently used feature extraction and prediction method. This method showed an accuracy of 65.7%. However, the proposed method predicts the novel data with improved accuracy of 79.9%. In conclusion, the proposed algorithm outperformed the current feature extraction and prediction method.
A model of proto-object based saliency
Russell, Alexander F.; Mihalaş, Stefan; von der Heydt, Rudiger; Niebur, Ernst; Etienne-Cummings, Ralph
2013-01-01
Organisms use the process of selective attention to optimally allocate their computational resources to the instantaneously most relevant subsets of a visual scene, ensuring that they can parse the scene in real time. Many models of bottom-up attentional selection assume that elementary image features, like intensity, color and orientation, attract attention. Gestalt psychologists, how-ever, argue that humans perceive whole objects before they analyze individual features. This is supported by recent psychophysical studies that show that objects predict eye-fixations better than features. In this report we present a neurally inspired algorithm of object based, bottom-up attention. The model rivals the performance of state of the art non-biologically plausible feature based algorithms (and outperforms biologically plausible feature based algorithms) in its ability to predict perceptual saliency (eye fixations and subjective interest points) in natural scenes. The model achieves this by computing saliency as a function of proto-objects that establish the perceptual organization of the scene. All computational mechanisms of the algorithm have direct neural correlates, and our results provide evidence for the interface theory of attention. PMID:24184601
Communication target object recognition for D2D connection with feature size limit
NASA Astrophysics Data System (ADS)
Ok, Jiheon; Kim, Soochang; Kim, Young-hoon; Lee, Chulhee
2015-03-01
Recently, a new concept of device-to-device (D2D) communication, which is called "point-and-link communication" has attracted great attentions due to its intuitive and simple operation. This approach enables user to communicate with target devices without any pre-identification information such as SSIDs, MAC addresses by selecting the target image displayed on the user's own device. In this paper, we present an efficient object matching algorithm that can be applied to look(point)-and-link communications for mobile services. Due to the limited channel bandwidth and low computational power of mobile terminals, the matching algorithm should satisfy low-complexity, low-memory and realtime requirements. To meet these requirements, we propose fast and robust feature extraction by considering the descriptor size and processing time. The proposed algorithm utilizes a HSV color histogram, SIFT (Scale Invariant Feature Transform) features and object aspect ratios. To reduce the descriptor size under 300 bytes, a limited number of SIFT key points were chosen as feature points and histograms were binarized while maintaining required performance. Experimental results show the robustness and the efficiency of the proposed algorithm.
FGRAAL: FORTRAN extended graph algorithmic language
NASA Technical Reports Server (NTRS)
Basili, V. R.; Mesztenyi, C. K.; Rheinboldt, W. C.
1972-01-01
The FORTRAN version FGRAAL of the graph algorithmic language GRAAL as it has been implemented for the Univac 1108 is described. FBRAAL is an extension of FORTRAN 5 and is intended for describing and implementing graph algorithms of the type primarily arising in applications. The formal description contained in this report represents a supplement to the FORTRAN 5 manual for the Univac 1108 (UP-4060), that is, only the new features of the language are described. Several typical graph algorithms, written in FGRAAL, are included to illustrate various features of the language and to show its applicability.
Feature Based Retention Time Alignment for Improved HDX MS Analysis
NASA Astrophysics Data System (ADS)
Venable, John D.; Scuba, William; Brock, Ansgar
2013-04-01
An algorithm for retention time alignment of mass shifted hydrogen-deuterium exchange (HDX) data based on an iterative distance minimization procedure is described. The algorithm performs pairwise comparisons in an iterative fashion between a list of features from a reference file and a file to be time aligned to calculate a retention time mapping function. Features are characterized by their charge, retention time and mass of the monoisotopic peak. The algorithm is able to align datasets with mass shifted features, which is a prerequisite for aligning hydrogen-deuterium exchange mass spectrometry datasets. Confidence assignments from the fully automated processing of a commercial HDX software package are shown to benefit significantly from retention time alignment prior to extraction of deuterium incorporation values.
Addeh, Abdoljalil; Khormali, Aminollah; Golilarz, Noorbakhsh Amiri
2018-05-04
The control chart patterns are the most commonly used statistical process control (SPC) tools to monitor process changes. When a control chart produces an out-of-control signal, this means that the process has been changed. In this study, a new method based on optimized radial basis function neural network (RBFNN) is proposed for control chart patterns (CCPs) recognition. The proposed method consists of four main modules: feature extraction, feature selection, classification and learning algorithm. In the feature extraction module, shape and statistical features are used. Recently, various shape and statistical features have been presented for the CCPs recognition. In the feature selection module, the association rules (AR) method has been employed to select the best set of the shape and statistical features. In the classifier section, RBFNN is used and finally, in RBFNN, learning algorithm has a high impact on the network performance. Therefore, a new learning algorithm based on the bees algorithm has been used in the learning module. Most studies have considered only six patterns: Normal, Cyclic, Increasing Trend, Decreasing Trend, Upward Shift and Downward Shift. Since three patterns namely Normal, Stratification, and Systematic are very similar to each other and distinguishing them is very difficult, in most studies Stratification and Systematic have not been considered. Regarding to the continuous monitoring and control over the production process and the exact type detection of the problem encountered during the production process, eight patterns have been investigated in this study. The proposed method is tested on a dataset containing 1600 samples (200 samples from each pattern) and the results showed that the proposed method has a very good performance. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
Shearlet Features for Registration of Remotely Sensed Multitemporal Images
NASA Technical Reports Server (NTRS)
Murphy, James M.; Le Moigne, Jacqueline
2015-01-01
We investigate the role of anisotropic feature extraction methods for automatic image registration of remotely sensed multitemporal images. Building on the classical use of wavelets in image registration, we develop an algorithm based on shearlets, a mathematical generalization of wavelets that offers increased directional sensitivity. Initial experimental results on LANDSAT images are presented, which indicate superior performance of the shearlet algorithm when compared to classical wavelet algorithms.
Bourke, Alan K; Klenk, Jochen; Schwickert, Lars; Aminian, Kamiar; Ihlen, Espen A F; Mellone, Sabato; Helbostad, Jorunn L; Chiari, Lorenzo; Becker, Clemens
2016-08-01
Automatic fall detection will promote independent living and reduce the consequences of falls in the elderly by ensuring people can confidently live safely at home for linger. In laboratory studies inertial sensor technology has been shown capable of distinguishing falls from normal activities. However less than 7% of fall-detection algorithm studies have used fall data recorded from elderly people in real life. The FARSEEING project has compiled a database of real life falls from elderly people, to gain new knowledge about fall events and to develop fall detection algorithms to combat the problems associated with falls. We have extracted 12 different kinematic, temporal and kinetic related features from a data-set of 89 real-world falls and 368 activities of daily living. Using the extracted features we applied machine learning techniques and produced a selection of algorithms based on different feature combinations. The best algorithm employs 10 different features and produced a sensitivity of 0.88 and a specificity of 0.87 in classifying falls correctly. This algorithm can be used distinguish real-world falls from normal activities of daily living in a sensor consisting of a tri-axial accelerometer and tri-axial gyroscope located at L5.
Minimizing the semantic gap in biomedical content-based image retrieval
NASA Astrophysics Data System (ADS)
Guan, Haiying; Antani, Sameer; Long, L. Rodney; Thoma, George R.
2010-03-01
A major challenge in biomedical Content-Based Image Retrieval (CBIR) is to achieve meaningful mappings that minimize the semantic gap between the high-level biomedical semantic concepts and the low-level visual features in images. This paper presents a comprehensive learning-based scheme toward meeting this challenge and improving retrieval quality. The article presents two algorithms: a learning-based feature selection and fusion algorithm and the Ranking Support Vector Machine (Ranking SVM) algorithm. The feature selection algorithm aims to select 'good' features and fuse them using different similarity measurements to provide a better representation of the high-level concepts with the low-level image features. Ranking SVM is applied to learn the retrieval rank function and associate the selected low-level features with query concepts, given the ground-truth ranking of the training samples. The proposed scheme addresses four major issues in CBIR to improve the retrieval accuracy: image feature extraction, selection and fusion, similarity measurements, the association of the low-level features with high-level concepts, and the generation of the rank function to support high-level semantic image retrieval. It models the relationship between semantic concepts and image features, and enables retrieval at the semantic level. We apply it to the problem of vertebra shape retrieval from a digitized spine x-ray image set collected by the second National Health and Nutrition Examination Survey (NHANES II). The experimental results show an improvement of up to 41.92% in the mean average precision (MAP) over conventional image similarity computation methods.
Khoje, Suchitra
2018-02-01
Images of four qualities of mangoes and guavas are evaluated for color and textural features to characterize and classify them, and to model the fruit appearance grading. The paper discusses three approaches to identify most discriminating texture features of both the fruits. In the first approach, fruit's color and texture features are selected using Mahalanobis distance. A total of 20 color features and 40 textural features are extracted for analysis. Using Mahalanobis distance and feature intercorrelation analyses, one best color feature (mean of a* [L*a*b* color space]) and two textural features (energy a*, contrast of H*) are selected as features for Guava while two best color features (R std, H std) and one textural features (energy b*) are selected as features for mangoes with the highest discriminate power. The second approach studies some common wavelet families for searching the best classification model for fruit quality grading. The wavelet features extracted from five basic mother wavelets (db, bior, rbior, Coif, Sym) are explored to characterize fruits texture appearance. In third approach, genetic algorithm is used to select only those color and wavelet texture features that are relevant to the separation of the class, from a large universe of features. The study shows that image color and texture features which were identified using a genetic algorithm can distinguish between various qualities classes of fruits. The experimental results showed that support vector machine classifier is elected for Guava grading with an accuracy of 97.61% and artificial neural network is elected from Mango grading with an accuracy of 95.65%. The proposed method is nondestructive fruit quality assessment method. The experimental results has proven that Genetic algorithm along with wavelet textures feature has potential to discriminate fruit quality. Finally, it can be concluded that discussed method is an accurate, reliable, and objective tool to determine fruit quality namely Mango and Guava, and might be applicable to in-line sorting systems. © 2017 Wiley Periodicals, Inc.
Integrated feature extraction and selection for neuroimage classification
NASA Astrophysics Data System (ADS)
Fan, Yong; Shen, Dinggang
2009-02-01
Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.
Gene/protein name recognition based on support vector machine using dictionary as features.
Mitsumori, Tomohiro; Fation, Sevrani; Murata, Masaki; Doi, Kouichi; Doi, Hirohumi
2005-01-01
Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of the biomedical named entities is the first step in information extraction. We developed an automated recognition system based on the SVM algorithm and evaluated it in Task 1.A of BioCreAtIvE, a competition for automated gene/protein name recognition. In the work presented here, our recognition system uses the feature set of the word, the part-of-speech (POS), the orthography, the prefix, the suffix, and the preceding class. We call these features "internal resource features", i.e., features that can be found in the training data. Additionally, we consider the features of matching against dictionaries to be external resource features. We investigated and evaluated the effect of these features as well as the effect of tuning the parameters of the SVM algorithm. We found that the dictionary matching features contributed slightly to the improvement in the performance of the f-score. We attribute this to the possibility that the dictionary matching features might overlap with other features in the current multiple feature setting. During SVM learning, each feature alone had a marginally positive effect on system performance. This supports the fact that the SVM algorithm is robust on the high dimensionality of the feature vector space and means that feature selection is not required.
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of image the signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attentions in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
BlobContours: adapting Blobworld for supervised color- and texture-based image segmentation
NASA Astrophysics Data System (ADS)
Vogel, Thomas; Nguyen, Dinh Quyen; Dittmann, Jana
2006-01-01
Extracting features is the first and one of the most crucial steps in recent image retrieval process. While the color features and the texture features of digital images can be extracted rather easily, the shape features and the layout features depend on reliable image segmentation. Unsupervised image segmentation, often used in image analysis, works on merely syntactical basis. That is, what an unsupervised segmentation algorithm can segment is only regions, but not objects. To obtain high-level objects, which is desirable in image retrieval, human assistance is needed. Supervised image segmentations schemes can improve the reliability of segmentation and segmentation refinement. In this paper we propose a novel interactive image segmentation technique that combines the reliability of a human expert with the precision of automated image segmentation. The iterative procedure can be considered a variation on the Blobworld algorithm introduced by Carson et al. from EECS Department, University of California, Berkeley. Starting with an initial segmentation as provided by the Blobworld framework, our algorithm, namely BlobContours, gradually updates it by recalculating every blob, based on the original features and the updated number of Gaussians. Since the original algorithm has hardly been designed for interactive processing we had to consider additional requirements for realizing a supervised segmentation scheme on the basis of Blobworld. Increasing transparency of the algorithm by applying usercontrolled iterative segmentation, providing different types of visualization for displaying the segmented image and decreasing computational time of segmentation are three major requirements which are discussed in detail.
Feature Selection and Effective Classifiers.
ERIC Educational Resources Information Center
Deogun, Jitender S.; Choubey, Suresh K.; Raghavan, Vijay V.; Sever, Hayri
1998-01-01
Develops and analyzes four algorithms for feature selection in the context of rough set methodology. Experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. When compared, results of upper classifiers perform better than lower…
CNN universal machine as classificaton platform: an art-like clustering algorithm.
Bálya, David
2003-12-01
Fast and robust classification of feature vectors is a crucial task in a number of real-time systems. A cellular neural/nonlinear network universal machine (CNN-UM) can be very efficient as a feature detector. The next step is to post-process the results for object recognition. This paper shows how a robust classification scheme based on adaptive resonance theory (ART) can be mapped to the CNN-UM. Moreover, this mapping is general enough to include different types of feed-forward neural networks. The designed analogic CNN algorithm is capable of classifying the extracted feature vectors keeping the advantages of the ART networks, such as robust, plastic and fault-tolerant behaviors. An analogic algorithm is presented for unsupervised classification with tunable sensitivity and automatic new class creation. The algorithm is extended for supervised classification. The presented binary feature vector classification is implemented on the existing standard CNN-UM chips for fast classification. The experimental evaluation shows promising performance after 100% accuracy on the training set.
Mining Co-Location Patterns with Clustering Items from Spatial Data Sets
NASA Astrophysics Data System (ADS)
Zhou, G.; Li, Q.; Deng, G.; Yue, T.; Zhou, X.
2018-05-01
The explosive growth of spatial data and widespread use of spatial databases emphasize the need for the spatial data mining. Co-location patterns discovery is an important branch in spatial data mining. Spatial co-locations represent the subsets of features which are frequently located together in geographic space. However, the appearance of a spatial feature C is often not determined by a single spatial feature A or B but by the two spatial features A and B, that is to say where A and B appear together, C often appears. We note that this co-location pattern is different from the traditional co-location pattern. Thus, this paper presents a new concept called clustering terms, and this co-location pattern is called co-location patterns with clustering items. And the traditional algorithm cannot mine this co-location pattern, so we introduce the related concept in detail and propose a novel algorithm. This algorithm is extended by join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.
Dai, Qiong; Cheng, Jun-Hu; Sun, Da-Wen; Zeng, Xin-An
2015-01-01
There is an increased interest in the applications of hyperspectral imaging (HSI) for assessing food quality, safety, and authenticity. HSI provides abundance of spatial and spectral information from foods by combining both spectroscopy and imaging, resulting in hundreds of contiguous wavebands for each spatial position of food samples, also known as the curse of dimensionality. It is desirable to employ feature selection algorithms for decreasing computation burden and increasing predicting accuracy, which are especially relevant in the development of online applications. Recently, a variety of feature selection algorithms have been proposed that can be categorized into three groups based on the searching strategy namely complete search, heuristic search and random search. This review mainly introduced the fundamental of each algorithm, illustrated its applications in hyperspectral data analysis in the food field, and discussed the advantages and disadvantages of these algorithms. It is hoped that this review should provide a guideline for feature selections and data processing in the future development of hyperspectral imaging technique in foods.
Assessment of metal ion concentration in water with structured feature selection.
Naula, Pekka; Airola, Antti; Pihlasalo, Sari; Montoya Perez, Ileana; Salakoski, Tapio; Pahikkala, Tapio
2017-10-01
We propose a cost-effective system for the determination of metal ion concentration in water, addressing a central issue in water resources management. The system combines novel luminometric label array technology with a machine learning algorithm that selects a minimal number of array reagents (modulators) and liquid sample dilutions, such that enable accurate quantification. The algorithm is able to identify the optimal modulators and sample dilutions leading to cost reductions since less manual labour and resources are needed. Inferring the ion detector involves a unique type of a structured feature selection problem, which we formalize in this paper. We propose a novel Cartesian greedy forward feature selection algorithm for solving the problem. The novel algorithm was evaluated in the concentration assessment of five metal ions and the performance was compared to two known feature selection approaches. The results demonstrate that the proposed system can assist in lowering the costs with minimal loss in accuracy. Copyright © 2017 Elsevier Ltd. All rights reserved.
Vatsa, Mayank; Singh, Richa; Noore, Afzel
2008-08-01
This paper proposes algorithms for iris segmentation, quality enhancement, match score fusion, and indexing to improve both the accuracy and the speed of iris recognition. A curve evolution approach is proposed to effectively segment a nonideal iris image using the modified Mumford-Shah functional. Different enhancement algorithms are concurrently applied on the segmented iris image to produce multiple enhanced versions of the iris image. A support-vector-machine-based learning algorithm selects locally enhanced regions from each globally enhanced image and combines these good-quality regions to create a single high-quality iris image. Two distinct features are extracted from the high-quality iris image. The global textural feature is extracted using the 1-D log polar Gabor transform, and the local topological feature is extracted using Euler numbers. An intelligent fusion algorithm combines the textural and topological matching scores to further improve the iris recognition performance and reduce the false rejection rate, whereas an indexing algorithm enables fast and accurate iris identification. The verification and identification performance of the proposed algorithms is validated and compared with other algorithms using the CASIA Version 3, ICE 2005, and UBIRIS iris databases.
Tan, Bingyao; Wong, Alexander; Bizheva, Kostadinka
2018-01-01
A novel image processing algorithm based on a modified Bayesian residual transform (MBRT) was developed for the enhancement of morphological and vascular features in optical coherence tomography (OCT) and OCT angiography (OCTA) images. The MBRT algorithm decomposes the original OCT image into multiple residual images, where each image presents information at a unique scale. Scale selective residual adaptation is used subsequently to enhance morphological features of interest, such as blood vessels and tissue layers, and to suppress irrelevant image features such as noise and motion artefacts. The performance of the proposed MBRT algorithm was tested on a series of cross-sectional and enface OCT and OCTA images of retina and brain tissue that were acquired in-vivo. Results show that the MBRT reduces speckle noise and motion-related imaging artefacts locally, thus improving significantly the contrast and visibility of morphological features in the OCT and OCTA images. PMID:29760996
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parekh, V; Jacobs, MA
Purpose: Multiparametric radiological imaging is used for diagnosis in patients. Potentially extracting useful features specific to a patient’s pathology would be crucial step towards personalized medicine and assessing treatment options. In order to automatically extract features directly from multiparametric radiological imaging datasets, we developed an advanced unsupervised machine learning algorithm called the multidimensional imaging radiomics-geodesics(MIRaGe). Methods: Seventy-six breast tumor patients underwent 3T MRI breast imaging were used for this study. We tested the MIRaGe algorithm to extract features for classification of breast tumors into benign or malignant. The MRI parameters used were T1-weighted, T2-weighted, dynamic contrast enhanced MR imaging (DCE-MRI)more » and diffusion weighted imaging(DWI). The MIRaGe algorithm extracted the radiomics-geodesics features (RGFs) from multiparametric MRI datasets. This enable our method to learn the intrinsic manifold representations corresponding to the patients. To determine the informative RGF, a modified Isomap algorithm(t-Isomap) was created for a radiomics-geodesics feature space(tRGFS) to avoid overfitting. Final classification was performed using SVM. The predictive power of the RGFs was tested and validated using k-fold cross validation. Results: The RGFs extracted by the MIRaGe algorithm successfully classified malignant lesions from benign lesions with a sensitivity of 93% and a specificity of 91%. The top 50 RGFs identified as the most predictive by the t-Isomap procedure were consistent with the radiological parameters known to be associated with breast cancer diagnosis and were categorized as kinetic curve characterizing RGFs, wash-in rate characterizing RGFs, wash-out rate characterizing RGFs and morphology characterizing RGFs. Conclusion: In this paper, we developed a novel feature extraction algorithm for multiparametric radiological imaging. The results demonstrated the power of the MIRaGe algorithm at automatically discovering useful feature representations directly from the raw multiparametric MRI data. In conclusion, the MIRaGe informatics model provides a powerful tool with applicability in cancer diagnosis and a possibility of extension to other kinds of pathologies. NIH (P50CA103175, 5P30CA006973 (IRAT), R01CA190299, U01CA140204), Siemens Medical Systems (JHU-2012-MR-86-01) and Nivida Graphics Corporation.« less
Du, Shaoyi; Xu, Yiting; Wan, Teng; Hu, Huaizhong; Zhang, Sirui; Xu, Guanglin; Zhang, Xuetao
2017-01-01
The iterative closest point (ICP) algorithm is efficient and accurate for rigid registration but it needs the good initial parameters. It is easily failed when the rotation angle between two point sets is large. To deal with this problem, a new objective function is proposed by introducing a rotation invariant feature based on the Euclidean distance between each point and a global reference point, where the global reference point is a rotation invariant. After that, this optimization problem is solved by a variant of ICP algorithm, which is an iterative method. Firstly, the accurate correspondence is established by using the weighted rotation invariant feature distance and position distance together. Secondly, the rigid transformation is solved by the singular value decomposition method. Thirdly, the weight is adjusted to control the relative contribution of the positions and features. Finally this new algorithm accomplishes the registration by a coarse-to-fine way whatever the initial rotation angle is, which is demonstrated to converge monotonically. The experimental results validate that the proposed algorithm is more accurate and robust compared with the original ICP algorithm.
Du, Shaoyi; Xu, Yiting; Wan, Teng; Zhang, Sirui; Xu, Guanglin; Zhang, Xuetao
2017-01-01
The iterative closest point (ICP) algorithm is efficient and accurate for rigid registration but it needs the good initial parameters. It is easily failed when the rotation angle between two point sets is large. To deal with this problem, a new objective function is proposed by introducing a rotation invariant feature based on the Euclidean distance between each point and a global reference point, where the global reference point is a rotation invariant. After that, this optimization problem is solved by a variant of ICP algorithm, which is an iterative method. Firstly, the accurate correspondence is established by using the weighted rotation invariant feature distance and position distance together. Secondly, the rigid transformation is solved by the singular value decomposition method. Thirdly, the weight is adjusted to control the relative contribution of the positions and features. Finally this new algorithm accomplishes the registration by a coarse-to-fine way whatever the initial rotation angle is, which is demonstrated to converge monotonically. The experimental results validate that the proposed algorithm is more accurate and robust compared with the original ICP algorithm. PMID:29176780
NASA Astrophysics Data System (ADS)
Chen, Xiang; Li, Jingchao; Han, Hui; Ying, Yulong
2018-05-01
Because of the limitations of the traditional fractal box-counting dimension algorithm in subtle feature extraction of radiation source signals, a dual improved generalized fractal box-counting dimension eigenvector algorithm is proposed. First, the radiation source signal was preprocessed, and a Hilbert transform was performed to obtain the instantaneous amplitude of the signal. Then, the improved fractal box-counting dimension of the signal instantaneous amplitude was extracted as the first eigenvector. At the same time, the improved fractal box-counting dimension of the signal without the Hilbert transform was extracted as the second eigenvector. Finally, the dual improved fractal box-counting dimension eigenvectors formed the multi-dimensional eigenvectors as signal subtle features, which were used for radiation source signal recognition by the grey relation algorithm. The experimental results show that, compared with the traditional fractal box-counting dimension algorithm and the single improved fractal box-counting dimension algorithm, the proposed dual improved fractal box-counting dimension algorithm can better extract the signal subtle distribution characteristics under different reconstruction phase space, and has a better recognition effect with good real-time performance.
Automated isotope identification algorithm using artificial neural networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamuda, Mark; Stinnett, Jacob; Sullivan, Clair
There is a need to develop an algorithm that can determine the relative activities of radio-isotopes in a large dataset of low-resolution gamma-ray spectra that contains a mixture of many radio-isotopes. Low-resolution gamma-ray spectra that contain mixtures of radio-isotopes often exhibit feature over-lap, requiring algorithms that can analyze these features when overlap occurs. While machine learning and pattern recognition algorithms have shown promise for the problem of radio-isotope identification, their ability to identify and quantify mixtures of radio-isotopes has not been studied. Because machine learning algorithms use abstract features of the spectrum, such as the shape of overlapping peaks andmore » Compton continuum, they are a natural choice for analyzing radio-isotope mixtures. An artificial neural network (ANN) has be trained to calculate the relative activities of 32 radio-isotopes in a spectrum. Furthermore, the ANN is trained with simulated gamma-ray spectra, allowing easy expansion of the library of target radio-isotopes. In this paper we present our initial algorithms based on an ANN and evaluate them against a series measured and simulated spectra.« less
Automated isotope identification algorithm using artificial neural networks
Kamuda, Mark; Stinnett, Jacob; Sullivan, Clair
2017-04-12
There is a need to develop an algorithm that can determine the relative activities of radio-isotopes in a large dataset of low-resolution gamma-ray spectra that contains a mixture of many radio-isotopes. Low-resolution gamma-ray spectra that contain mixtures of radio-isotopes often exhibit feature over-lap, requiring algorithms that can analyze these features when overlap occurs. While machine learning and pattern recognition algorithms have shown promise for the problem of radio-isotope identification, their ability to identify and quantify mixtures of radio-isotopes has not been studied. Because machine learning algorithms use abstract features of the spectrum, such as the shape of overlapping peaks andmore » Compton continuum, they are a natural choice for analyzing radio-isotope mixtures. An artificial neural network (ANN) has be trained to calculate the relative activities of 32 radio-isotopes in a spectrum. Furthermore, the ANN is trained with simulated gamma-ray spectra, allowing easy expansion of the library of target radio-isotopes. In this paper we present our initial algorithms based on an ANN and evaluate them against a series measured and simulated spectra.« less
FSMRank: feature selection algorithm for learning to rank.
Lai, Han-Jiang; Pan, Yan; Tang, Yong; Yu, Rong
2013-06-01
In recent years, there has been growing interest in learning to rank. The introduction of feature selection into different learning problems has been proven effective. These facts motivate us to investigate the problem of feature selection for learning to rank. We propose a joint convex optimization formulation which minimizes ranking errors while simultaneously conducting feature selection. This optimization formulation provides a flexible framework in which we can easily incorporate various importance measures and similarity measures of the features. To solve this optimization problem, we use the Nesterov's approach to derive an accelerated gradient algorithm with a fast convergence rate O(1/T(2)). We further develop a generalization bound for the proposed optimization problem using the Rademacher complexities. Extensive experimental evaluations are conducted on the public LETOR benchmark datasets. The results demonstrate that the proposed method shows: 1) significant ranking performance gain compared to several feature selection baselines for ranking, and 2) very competitive performance compared to several state-of-the-art learning-to-rank algorithms.
Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy
2014-01-01
Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered as the most common and effective methods in classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is benefitting from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model was benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive to the existing state-of-the-art classification models. PMID:25419659
Pose and motion recovery from feature correspondences and a digital terrain map.
Lerner, Ronen; Rivlin, Ehud; Rotstein, Héctor P
2006-09-01
A novel algorithm for pose and motion estimation using corresponding features and a Digital Terrain Map is proposed. Using a Digital Terrain (or Digital Elevation) Map (DTM/DEM) as a global reference enables the elimination of the ambiguity present in vision-based algorithms for motion recovery. As a consequence, the absolute position and orientation of a camera can be recovered with respect to the external reference frame. In order to do this, the DTM is used to formulate a constraint between corresponding features in two consecutive frames. Explicit reconstruction of the 3D world is not required. When considering a number of feature points, the resulting constraints can be solved using nonlinear optimization in terms of position, orientation, and motion. Such a procedure requires an initial guess of these parameters, which can be obtained from dead-reckoning or any other source. The feasibility of the algorithm is established through extensive experimentation. Performance is compared with a state-of-the-art alternative algorithm, which intermediately reconstructs the 3D structure and then registers it to the DTM. A clear advantage for the novel algorithm is demonstrated in variety of scenarios.
Automatic segmentation of psoriasis lesions
NASA Astrophysics Data System (ADS)
Ning, Yang; Shi, Chenbo; Wang, Li; Shu, Chang
2014-10-01
The automatic segmentation of psoriatic lesions is widely researched these years. It is an important step in Computer-aid methods of calculating PASI for estimation of lesions. Currently those algorithms can only handle single erythema or only deal with scaling segmentation. In practice, scaling and erythema are often mixed together. In order to get the segmentation of lesions area - this paper proposes an algorithm based on Random forests with color and texture features. The algorithm has three steps. The first step, the polarized light is applied based on the skin's Tyndall-effect in the imaging to eliminate the reflection and Lab color space are used for fitting the human perception. The second step, sliding window and its sub windows are used to get textural feature and color feature. In this step, a feature of image roughness has been defined, so that scaling can be easily separated from normal skin. In the end, Random forests will be used to ensure the generalization ability of the algorithm. This algorithm can give reliable segmentation results even the image has different lighting conditions, skin types. In the data set offered by Union Hospital, more than 90% images can be segmented accurately.
Filter bank common spatial patterns in mental workload estimation.
Arvaneh, Mahnaz; Umilta, Alberto; Robertson, Ian H
2015-01-01
EEG-based workload estimation technology provides a real time means of assessing mental workload. Such technology can effectively enhance the performance of the human-machine interaction and the learning process. When designing workload estimation algorithms, a crucial signal processing component is the feature extraction step. Despite several studies on this field, the spatial properties of the EEG signals were mostly neglected. Since EEG inherently has a poor spacial resolution, features extracted individually from each EEG channel may not be sufficiently efficient. This problem becomes more pronounced when we use low-cost but convenient EEG sensors with limited stability which is the case in practical scenarios. To address this issue, in this paper, we introduce a filter bank common spatial patterns algorithm combined with a feature selection method to extract spatio-spectral features discriminating different mental workload levels. To evaluate the proposed algorithm, we carry out a comparative analysis between two representative types of working memory tasks using data recorded from an Emotiv EPOC headset which is a mobile low-cost EEG recording device. The experimental results showed that the proposed spatial filtering algorithm outperformed the state-of-the algorithms in terms of the classification accuracy.
A fast algorithm for identifying friends-of-friends halos
NASA Astrophysics Data System (ADS)
Feng, Y.; Modi, C.
2017-07-01
We describe a simple and fast algorithm for identifying friends-of-friends features and prove its correctness. The algorithm avoids unnecessary expensive neighbor queries, uses minimal memory overhead, and rejects slowdown in high over-density regions. We define our algorithm formally based on pair enumeration, a problem that has been heavily studied in fast 2-point correlation codes and our reference implementation employs a dual KD-tree correlation function code. We construct features in a hierarchical tree structure, and use a splay operation to reduce the average cost of identifying the root of a feature from O [ log L ] to O [ 1 ] (L is the size of a feature) without additional memory costs. This reduces the overall time complexity of merging trees from O [ L log L ] to O [ L ] , reducing the number of operations per splay by orders of magnitude. We next introduce a pruning operation that skips merge operations between two fully self-connected KD-tree nodes. This improves the robustness of the algorithm, reducing the number of merge operations in high density peaks from O [δ2 ] to O [ δ ] . We show that for cosmological data set the algorithm eliminates more than half of merge operations for typically used linking lengths b ∼ 0 . 2 (relative to mean separation). Furthermore, our algorithm is extremely simple and easy to implement on top of an existing pair enumeration code, reusing the optimization effort that has been invested in fast correlation function codes.
Yu, Sheng; Liao, Katherine P; Shaw, Stanley Y; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Cai, Tianxi
2015-09-01
Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. The authors applied our method to develop algorithms to identify patients with rheumatoid arthritis and coronary artery disease cases among those with rheumatoid arthritis from a large multi-institutional EHR. The area under the receiver operating characteristic curves (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features. Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Vehicle logo recognition using multi-level fusion model
NASA Astrophysics Data System (ADS)
Ming, Wei; Xiao, Jianli
2018-04-01
Vehicle logo recognition plays an important role in manufacturer identification and vehicle recognition. This paper proposes a new vehicle logo recognition algorithm. It has a hierarchical framework, which consists of two fusion levels. At the first level, a feature fusion model is employed to map the original features to a higher dimension feature space. In this space, the vehicle logos become more recognizable. At the second level, a weighted voting strategy is proposed to promote the accuracy and the robustness of the recognition results. To evaluate the performance of the proposed algorithm, extensive experiments are performed, which demonstrate that the proposed algorithm can achieve high recognition accuracy and work robustly.
Mala, S.; Latha, K.
2014-01-01
Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition. PMID:25574185
Mala, S; Latha, K
2014-01-01
Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.
Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.
2013-09-01
This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the setmore » of predictors to around 8 factors that can be validated using reputable medical and public health resources.« less
Spatial-time-state fusion algorithm for defect detection through eddy current pulsed thermography
NASA Astrophysics Data System (ADS)
Xiao, Xiang; Gao, Bin; Woo, Wai Lok; Tian, Gui Yun; Xiao, Xiao Ting
2018-05-01
Eddy Current Pulsed Thermography (ECPT) has received extensive attention due to its high sensitive of detectability on surface and subsurface cracks. However, it remains as a difficult challenge in unsupervised detection as to identify defects without knowing any prior knowledge. This paper presents a spatial-time-state features fusion algorithm to obtain fully profile of the defects by directional scanning. The proposed method is intended to conduct features extraction by using independent component analysis (ICA) and automatic features selection embedding genetic algorithm. Finally, the optimal feature of each step is fused to obtain defects reconstruction by applying common orthogonal basis extraction (COBE) method. Experiments have been conducted to validate the study and verify the efficacy of the proposed method on blind defect detection.
Advanced Techniques for Scene Analysis
2010-06-01
robustness prefers a bigger intergration window to handle larger motions. The advantage of pyramidal implementation is that, while each motion vector dL...labeled SAR images. Now the previous algorithm leads to a more dedicated classifier for the particular target; however, our algorithm trades generality for...accuracy is traded for generality. 7.3.2 I-RELIEF Feature weighting transforms the original feature vector x into a new feature vector x′ by assigning each
ibex: An open infrastructure software platform to facilitate collaborative work in radiomics
Zhang, Lifei; Fried, David V.; Fave, Xenia J.; Hunter, Luke A.; Court, Laurence E.
2015-01-01
Purpose: Radiomics, which is the high-throughput extraction and analysis of quantitative image features, has been shown to have considerable potential to quantify the tumor phenotype. However, at present, a lack of software infrastructure has impeded the development of radiomics and its applications. Therefore, the authors developed the imaging biomarker explorer (ibex), an open infrastructure software platform that flexibly supports common radiomics workflow tasks such as multimodality image data import and review, development of feature extraction algorithms, model validation, and consistent data sharing among multiple institutions. Methods: The ibex software package was developed using the matlab and c/c++ programming languages. The software architecture deploys the modern model-view-controller, unit testing, and function handle programming concepts to isolate each quantitative imaging analysis task, to validate if their relevant data and algorithms are fit for use, and to plug in new modules. On one hand, ibex is self-contained and ready to use: it has implemented common data importers, common image filters, and common feature extraction algorithms. On the other hand, ibex provides an integrated development environment on top of matlab and c/c++, so users are not limited to its built-in functions. In the ibex developer studio, users can plug in, debug, and test new algorithms, extending ibex’s functionality. ibex also supports quality assurance for data and feature algorithms: image data, regions of interest, and feature algorithm-related data can be reviewed, validated, and/or modified. More importantly, two key elements in collaborative workflows, the consistency of data sharing and the reproducibility of calculation result, are embedded in the ibex workflow: image data, feature algorithms, and model validation including newly developed ones from different users can be easily and consistently shared so that results can be more easily reproduced between institutions. Results: Researchers with a variety of technical skill levels, including radiation oncologists, physicists, and computer scientists, have found the ibex software to be intuitive, powerful, and easy to use. ibex can be run at any computer with the windows operating system and 1GB RAM. The authors fully validated the implementation of all importers, preprocessing algorithms, and feature extraction algorithms. Windows version 1.0 beta of stand-alone ibex and ibex’s source code can be downloaded. Conclusions: The authors successfully implemented ibex, an open infrastructure software platform that streamlines common radiomics workflow tasks. Its transparency, flexibility, and portability can greatly accelerate the pace of radiomics research and pave the way toward successful clinical translation. PMID:25735289
IBEX: an open infrastructure software platform to facilitate collaborative work in radiomics.
Zhang, Lifei; Fried, David V; Fave, Xenia J; Hunter, Luke A; Yang, Jinzhong; Court, Laurence E
2015-03-01
Radiomics, which is the high-throughput extraction and analysis of quantitative image features, has been shown to have considerable potential to quantify the tumor phenotype. However, at present, a lack of software infrastructure has impeded the development of radiomics and its applications. Therefore, the authors developed the imaging biomarker explorer (IBEX), an open infrastructure software platform that flexibly supports common radiomics workflow tasks such as multimodality image data import and review, development of feature extraction algorithms, model validation, and consistent data sharing among multiple institutions. The IBEX software package was developed using the MATLAB and c/c++ programming languages. The software architecture deploys the modern model-view-controller, unit testing, and function handle programming concepts to isolate each quantitative imaging analysis task, to validate if their relevant data and algorithms are fit for use, and to plug in new modules. On one hand, IBEX is self-contained and ready to use: it has implemented common data importers, common image filters, and common feature extraction algorithms. On the other hand, IBEX provides an integrated development environment on top of MATLAB and c/c++, so users are not limited to its built-in functions. In the IBEX developer studio, users can plug in, debug, and test new algorithms, extending IBEX's functionality. IBEX also supports quality assurance for data and feature algorithms: image data, regions of interest, and feature algorithm-related data can be reviewed, validated, and/or modified. More importantly, two key elements in collaborative workflows, the consistency of data sharing and the reproducibility of calculation result, are embedded in the IBEX workflow: image data, feature algorithms, and model validation including newly developed ones from different users can be easily and consistently shared so that results can be more easily reproduced between institutions. Researchers with a variety of technical skill levels, including radiation oncologists, physicists, and computer scientists, have found the IBEX software to be intuitive, powerful, and easy to use. IBEX can be run at any computer with the windows operating system and 1GB RAM. The authors fully validated the implementation of all importers, preprocessing algorithms, and feature extraction algorithms. Windows version 1.0 beta of stand-alone IBEX and IBEX's source code can be downloaded. The authors successfully implemented IBEX, an open infrastructure software platform that streamlines common radiomics workflow tasks. Its transparency, flexibility, and portability can greatly accelerate the pace of radiomics research and pave the way toward successful clinical translation.
Automatic extraction of planetary image features
NASA Technical Reports Server (NTRS)
LeMoigne-Stewart, Jacqueline J. (Inventor); Troglio, Giulia (Inventor); Benediktsson, Jon A. (Inventor); Serpico, Sebastiano B. (Inventor); Moser, Gabriele (Inventor)
2013-01-01
A method for the extraction of Lunar data and/or planetary features is provided. The feature extraction method can include one or more image processing techniques, including, but not limited to, a watershed segmentation and/or the generalized Hough Transform. According to some embodiments, the feature extraction method can include extracting features, such as, small rocks. According to some embodiments, small rocks can be extracted by applying a watershed segmentation algorithm to the Canny gradient. According to some embodiments, applying a watershed segmentation algorithm to the Canny gradient can allow regions that appear as close contours in the gradient to be segmented.
The life-cycle of upper-tropospheric jet streams identified with a novel data segmentation algorithm
NASA Astrophysics Data System (ADS)
Limbach, S.; Schömer, E.; Wernli, H.
2010-09-01
Jet streams are prominent features of the upper-tropospheric atmospheric flow. Through the thermal wind relationship these regions with intense horizontal wind speed (typically larger than 30 m/s) are associated with pronounced baroclinicity, i.e., with regions where extratropical cyclones develop due to baroclinic instability processes. Individual jet streams are non-stationary elongated features that can extend over more than 2000 km in the along-flow and 200-500 km in the across-flow direction, respectively. Their lifetime can vary between a few days and several weeks. In recent years, feature-based algorithms have been developed that allow compiling synoptic climatologies and typologies of upper-tropospheric jet streams based upon objective selection criteria and climatological reanalysis datasets. In this study a novel algorithm to efficiently identify jet streams using an extended region-growing segmentation approach is introduced. This algorithm iterates over a 4-dimensional field of horizontal wind speed from ECMWF analyses and decides at each grid point whether all prerequisites for a jet stream are met. In a single pass the algorithm keeps track of all adjacencies of these grid points and creates the 4-dimensional connected segments associated with each jet stream. In addition to the detection of these sets of connected grid points, the algorithm analyzes the development over time of the distinct 3-dimensional features each segment consists of. Important events in the development of these features, for example mergings and splittings, are detected and analyzed on a per-grid-point and per-feature basis. The output of the algorithm consists of the actual sets of grid-points augmented with information about the particular events, and of the so-called event graphs, which are an abstract representation of the distinct 3-dimensional features and events of each segment. This technique provides comprehensive information about the frequency of upper-tropospheric jet streams, their preferred regions of genesis, merging, splitting, and lysis, and statistical information about their size, amplitude and lifetime. The presentation will introduce the technique, provide example visualizations of the time evolution of the identified 3-dimensional jet stream features, and present results from a first multi-month "climatology" of upper-tropospheric jets. In the future, the technique can be applied to longer datasets, for instance reanalyses and output from global climate model simulations - and provide detailed information about key characteristics of jet stream life cycles.
Altazi, Baderaldeen A; Zhang, Geoffrey G; Fernandez, Daniel C; Montejo, Michael E; Hunt, Dylan; Werner, Joan; Biagioli, Matthew C; Moros, Eduardo G
2017-11-01
Site-specific investigations of the role of radiomics in cancer diagnosis and therapy are emerging. We evaluated the reproducibility of radiomic features extracted from 18 Flourine-fluorodeoxyglucose ( 18 F-FDG) PET images for three parameters: manual versus computer-aided segmentation methods, gray-level discretization, and PET image reconstruction algorithms. Our cohort consisted of pretreatment PET/CT scans from 88 cervical cancer patients. Two board-certified radiation oncologists manually segmented the metabolic tumor volume (MTV 1 and MTV 2 ) for each patient. For comparison, we used a graphical-based method to generate semiautomated segmented volumes (GBSV). To address any perturbations in radiomic feature values, we down-sampled the tumor volumes into three gray-levels: 32, 64, and 128 from the original gray-level of 256. Finally, we analyzed the effect on radiomic features on PET images of eight patients due to four PET 3D-reconstruction algorithms: maximum likelihood-ordered subset expectation maximization (OSEM) iterative reconstruction (IR) method, fourier rebinning-ML-OSEM (FOREIR), FORE-filtered back projection (FOREFBP), and 3D-Reprojection (3DRP) analytical method. We extracted 79 features from all segmentation method, gray-levels of down-sampled volumes, and PET reconstruction algorithms. The features were extracted using gray-level co-occurrence matrices (GLCM), gray-level size zone matrices (GLSZM), gray-level run-length matrices (GLRLM), neighborhood gray-tone difference matrices (NGTDM), shape-based features (SF), and intensity histogram features (IHF). We computed the Dice coefficient between each MTV and GBSV to measure segmentation accuracy. Coefficient values close to one indicate high agreement, and values close to zero indicate low agreement. We evaluated the effect on radiomic features by calculating the mean percentage differences (d¯) between feature values measured from each pair of parameter elements (i.e. segmentation methods: MTV 1 -MTV 2 , MTV 1 -GBSV, MTV 2 -GBSV; gray-levels: 64-32, 64-128, and 64-256; reconstruction algorithms: OSEM-FORE-OSEM, OSEM-FOREFBP, and OSEM-3DRP). We used |d¯| as a measure of radiomic feature reproducibility level, where any feature scored |d¯| ±SD ≤ |25|% ± 35% was considered reproducible. We used Bland-Altman analysis to evaluate the mean, standard deviation (SD), and upper/lower reproducibility limits (U/LRL) for radiomic features in response to variation in each testing parameter. Furthermore, we proposed U/LRL as a method to classify the level of reproducibility: High- ±1% ≤ U/LRL ≤ ±30%; Intermediate- ±30% < U/LRL ≤ ±45%; Low- ±45 < U/LRL ≤ ±50%. We considered any feature below the low level as nonreproducible (NR). Finally, we calculated the interclass correlation coefficient (ICC) to evaluate the reliability of radiomic feature measurements for each parameter. The segmented volumes of 65 patients (81.3%) scored Dice coefficient >0.75 for all three volumes. The result outcomes revealed a tendency of higher radiomic feature reproducibility among segmentation pair MTV 1 -GBSV than MTV 2 -GBSV, gray-level pairs of 64-32 and 64-128 than 64-256, and reconstruction algorithm pairs of OSEM-FOREIR and OSEM-FOREFBP than OSEM-3DRP. Although the choice of cervical tumor segmentation method, gray-level value, and reconstruction algorithm may affect radiomic features, some features were characterized by high reproducibility through all testing parameters. The number of radiomic features that showed insensitivity to variations in segmentation methods, gray-level discretization, and reconstruction algorithms was 10 (13%), 4 (5%), and 1 (1%), respectively. These results suggest that a careful analysis of the effects of these parameters is essential prior to any radiomics clinical application. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Improve threshold segmentation using features extraction to automatic lung delimitation.
França, Cleunio; Vasconcelos, Germano; Diniz, Paula; Melo, Pedro; Diniz, Jéssica; Novaes, Magdala
2013-01-01
With the consolidation of PACS and RIS systems, the development of algorithms for tissue segmentation and diseases detection have intensely evolved in recent years. These algorithms have advanced to improve its accuracy and specificity, however, there is still some way until these algorithms achieved satisfactory error rates and reduced processing time to be used in daily diagnosis. The objective of this study is to propose a algorithm for lung segmentation in x-ray computed tomography images using features extraction, as Centroid and orientation measures, to improve the basic threshold segmentation. As result we found a accuracy of 85.5%.
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
NASA Astrophysics Data System (ADS)
Agurto, C.; Barriga, S.; Murray, V.; Pattichis, M.; Soliz, P.
2010-03-01
Diabetic retinopathy (DR) is one of the leading causes of blindness among adult Americans. Automatic methods for detection of the disease have been developed in recent years, most of them addressing the segmentation of bright and red lesions. In this paper we present an automatic DR screening system that does approach the problem through the segmentation of features. The algorithm determines non-diseased retinal images from those with pathology based on textural features obtained using multiscale Amplitude Modulation-Frequency Modulation (AM-FM) decompositions. The decomposition is represented as features that are the inputs to a classifier. The algorithm achieves 0.88 area under the ROC curve (AROC) for a set of 280 images from the MESSIDOR database. The algorithm is then used to analyze the effects of image compression and degradation, which will be present in most actual clinical or screening environments. Results show that the algorithm is insensitive to illumination variations, but high rates of compression and large blurring effects degrade its performance.
Automated spike sorting algorithm based on Laplacian eigenmaps and k-means clustering.
Chah, E; Hok, V; Della-Chiesa, A; Miller, J J H; O'Mara, S M; Reilly, R B
2011-02-01
This study presents a new automatic spike sorting method based on feature extraction by Laplacian eigenmaps combined with k-means clustering. The performance of the proposed method was compared against previously reported algorithms such as principal component analysis (PCA) and amplitude-based feature extraction. Two types of classifier (namely k-means and classification expectation-maximization) were incorporated within the spike sorting algorithms, in order to find a suitable classifier for the feature sets. Simulated data sets and in-vivo tetrode multichannel recordings were employed to assess the performance of the spike sorting algorithms. The results show that the proposed algorithm yields significantly improved performance with mean sorting accuracy of 73% and sorting error of 10% compared to PCA which combined with k-means had a sorting accuracy of 58% and sorting error of 10%.A correction was made to this article on 22 February 2011. The spacing of the title was amended on the abstract page. No changes were made to the article PDF and the print version was unaffected.
CpG islands: algorithms and applications in methylation studies.
Zhao, Zhongming; Han, Leng
2009-05-15
Methylation occurs frequently at 5'-cytosine of the CpG dinucleotides in vertebrate genomes; however, this epigenetic feature is rarely observed in CpG islands (CGIs) or CpG clusters in the promoter regions of genes. Aberrant methylation of the promoter-associated CGIs might influence gene expression and cause carcinogenesis. Because of the functional importance, multiple algorithms have been available for identifying CGIs in a genome or a sequence. They can be categorized into the traditional algorithms (e.g., Gardiner-Garden and Frommer (1987), Takai and Jones (2002), and CpGPRoD (2002)) or statistical property based algorithms (CpGcluster (2006) and CG cluster (2007)). We reviewed the features of these algorithms and evaluated their performance on identifying functional CGIs using genome-wide methylation data. Moreover, identification of CGIs is an initial step in many recent studies for predicting methylation status as well as in the design of methylation detection platforms. We reviewed the benchmarks and features used in these studies.
An improved KCF tracking algorithm based on multi-feature and multi-scale
NASA Astrophysics Data System (ADS)
Wu, Wei; Wang, Ding; Luo, Xin; Su, Yang; Tian, Weiye
2018-02-01
The purpose of visual tracking is to associate the target object in a continuous video frame. In recent years, the method based on the kernel correlation filter has become the research hotspot. However, the algorithm still has some problems such as video capture equipment fast jitter, tracking scale transformation. In order to improve the ability of scale transformation and feature description, this paper has carried an innovative algorithm based on the multi feature fusion and multi-scale transform. The experimental results show that our method solves the problem that the target model update when is blocked or its scale transforms. The accuracy of the evaluation (OPE) is 77.0%, 75.4% and the success rate is 69.7%, 66.4% on the VOT and OTB datasets. Compared with the optimal one of the existing target-based tracking algorithms, the accuracy of the algorithm is improved by 6.7% and 6.3% respectively. The success rates are improved by 13.7% and 14.2% respectively.
Determination of feature generation methods for PTZ camera object tracking
NASA Astrophysics Data System (ADS)
Doyle, Daniel D.; Black, Jonathan T.
2012-06-01
Object detection and tracking using computer vision (CV) techniques have been widely applied to sensor fusion applications. Many papers continue to be written that speed up performance and increase learning of artificially intelligent systems through improved algorithms, workload distribution, and information fusion. Military application of real-time tracking systems is becoming more and more complex with an ever increasing need of fusion and CV techniques to actively track and control dynamic systems. Examples include the use of metrology systems for tracking and measuring micro air vehicles (MAVs) and autonomous navigation systems for controlling MAVs. This paper seeks to contribute to the determination of select tracking algorithms that best track a moving object using a pan/tilt/zoom (PTZ) camera applicable to both of the examples presented. The select feature generation algorithms compared in this paper are the trained Scale-Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF), the Mixture of Gaussians (MoG) background subtraction method, the Lucas- Kanade optical flow method (2000) and the Farneback optical flow method (2003). The matching algorithm used in this paper for the trained feature generation algorithms is the Fast Library for Approximate Nearest Neighbors (FLANN). The BSD licensed OpenCV library is used extensively to demonstrate the viability of each algorithm and its performance. Initial testing is performed on a sequence of images using a stationary camera. Further testing is performed on a sequence of images such that the PTZ camera is moving in order to capture the moving object. Comparisons are made based upon accuracy, speed and memory.
Prominent feature extraction for review analysis: an empirical study
NASA Astrophysics Data System (ADS)
Agarwal, Basant; Mittal, Namita
2016-05-01
Sentiment analysis (SA) research has increased tremendously in recent times. SA aims to determine the sentiment orientation of a given text into positive or negative polarity. Motivation for SA research is the need for the industry to know the opinion of the users about their product from online portals, blogs, discussion boards and reviews and so on. Efficient features need to be extracted for machine-learning algorithm for better sentiment classification. In this paper, initially various features are extracted such as unigrams, bi-grams and dependency features from the text. In addition, new bi-tagged features are also extracted that conform to predefined part-of-speech patterns. Furthermore, various composite features are created using these features. Information gain (IG) and minimum redundancy maximum relevancy (mRMR) feature selection methods are used to eliminate the noisy and irrelevant features from the feature vector. Finally, machine-learning algorithms are used for classifying the review document into positive or negative class. Effects of different categories of features are investigated on four standard data-sets, namely, movie review and product (book, DVD and electronics) review data-sets. Experimental results show that composite features created from prominent features of unigram and bi-tagged features perform better than other features for sentiment classification. mRMR is a better feature selection method as compared with IG for sentiment classification. Boolean Multinomial Naïve Bayes) algorithm performs better than support vector machine classifier for SA in terms of accuracy and execution time.
Cascaded face alignment via intimacy definition feature
NASA Astrophysics Data System (ADS)
Li, Hailiang; Lam, Kin-Man; Chiu, Man-Yau; Wu, Kangheng; Lei, Zhibin
2017-09-01
Recent years have witnessed the emerging popularity of regression-based face aligners, which directly learn mappings between facial appearance and shape-increment manifolds. We propose a random-forest based, cascaded regression model for face alignment by using a locally lightweight feature, namely intimacy definition feature. This feature is more discriminative than the pose-indexed feature, more efficient than the histogram of oriented gradients feature and the scale-invariant feature transform feature, and more compact than the local binary feature (LBF). Experimental validation of our algorithm shows that our approach achieves state-of-the-art performance when testing on some challenging datasets. Compared with the LBF-based algorithm, our method achieves about twice the speed, 20% improvement in terms of alignment accuracy and saves an order of magnitude on memory requirement.
Membership-degree preserving discriminant analysis with applications to face recognition.
Yang, Zhangjing; Liu, Chuancai; Huang, Pu; Qian, Jianjun
2013-01-01
In pattern recognition, feature extraction techniques have been widely employed to reduce the dimensionality of high-dimensional data. In this paper, we propose a novel feature extraction algorithm called membership-degree preserving discriminant analysis (MPDA) based on the fisher criterion and fuzzy set theory for face recognition. In the proposed algorithm, the membership degree of each sample to particular classes is firstly calculated by the fuzzy k-nearest neighbor (FKNN) algorithm to characterize the similarity between each sample and class centers, and then the membership degree is incorporated into the definition of the between-class scatter and the within-class scatter. The feature extraction criterion via maximizing the ratio of the between-class scatter to the within-class scatter is applied. Experimental results on the ORL, Yale, and FERET face databases demonstrate the effectiveness of the proposed algorithm.
Automatic detection of multi-level acetowhite regions in RGB color images of the uterine cervix
NASA Astrophysics Data System (ADS)
Lange, Holger
2005-04-01
Uterine cervical cancer is the second most common cancer among women worldwide. Colposcopy is a diagnostic method used to detect cancer precursors and cancer of the uterine cervix, whereby a physician (colposcopist) visually inspects the metaplastic epithelium on the cervix for certain distinctly abnormal morphologic features. A contrast agent, a 3-5% acetic acid solution, is used, causing abnormal and metaplastic epithelia to turn white. The colposcopist considers diagnostic features such as the acetowhite, blood vessel structure, and lesion margin to derive a clinical diagnosis. STI Medical Systems is developing a Computer-Aided-Diagnosis (CAD) system for colposcopy -- ColpoCAD, a complex image analysis system that at its core assesses the same visual features as used by colposcopists. The acetowhite feature has been identified as one of the most important individual predictors of lesion severity. Here, we present the details and preliminary results of a multi-level acetowhite region detection algorithm for RGB color images of the cervix, including the detection of the anatomic features: cervix, os and columnar region, which are used for the acetowhite region detection. The RGB images are assumed to be glare free, either obtained by cross-polarized image acquisition or glare removal pre-processing. The basic approach of the algorithm is to extract a feature image from the RGB image that provides a good acetowhite to cervix background ratio, to segment the feature image using novel pixel grouping and multi-stage region-growing algorithms that provide region segmentations with different levels of detail, to extract the acetowhite regions from the region segmentations using a novel region selection algorithm, and then finally to extract the multi-levels from the acetowhite regions using multiple thresholds. The performance of the algorithm is demonstrated using human subject data.
Enhancement web proxy cache performance using Wrapper Feature Selection methods with NB and J48
NASA Astrophysics Data System (ADS)
Mahmoud Al-Qudah, Dua'a.; Funke Olanrewaju, Rashidah; Wong Azman, Amelia
2017-11-01
Web proxy cache technique reduces response time by storing a copy of pages between client and server sides. If requested pages are cached in the proxy, there is no need to access the server. Due to the limited size and excessive cost of cache compared to the other storages, cache replacement algorithm is used to determine evict page when the cache is full. On the other hand, the conventional algorithms for replacement such as Least Recently Use (LRU), First in First Out (FIFO), Least Frequently Use (LFU), Randomized Policy etc. may discard important pages just before use. Furthermore, using conventional algorithm cannot be well optimized since it requires some decision to intelligently evict a page before replacement. Hence, most researchers propose an integration among intelligent classifiers and replacement algorithm to improves replacement algorithms performance. This research proposes using automated wrapper feature selection methods to choose the best subset of features that are relevant and influence classifiers prediction accuracy. The result present that using wrapper feature selection methods namely: Best First (BFS), Incremental Wrapper subset selection(IWSS)embedded NB and particle swarm optimization(PSO)reduce number of features and have a good impact on reducing computation time. Using PSO enhance NB classifier accuracy by 1.1%, 0.43% and 0.22% over using NB with all features, using BFS and using IWSS embedded NB respectively. PSO rises J48 accuracy by 0.03%, 1.91 and 0.04% over using J48 classifier with all features, using IWSS-embedded NB and using BFS respectively. While using IWSS embedded NB fastest NB and J48 classifiers much more than BFS and PSO. However, it reduces computation time of NB by 0.1383 and reduce computation time of J48 by 2.998.
A ℓ2, 1 norm regularized multi-kernel learning for false positive reduction in Lung nodule CAD.
Cao, Peng; Liu, Xiaoli; Zhang, Jian; Li, Wei; Zhao, Dazhe; Huang, Min; Zaiane, Osmar
2017-03-01
The aim of this paper is to describe a novel algorithm for False Positive Reduction in lung nodule Computer Aided Detection(CAD). In this paper, we describes a new CT lung CAD method which aims to detect solid nodules. Specially, we proposed a multi-kernel classifier with a ℓ 2, 1 norm regularizer for heterogeneous feature fusion and selection from the feature subset level, and designed two efficient strategies to optimize the parameters of kernel weights in non-smooth ℓ 2, 1 regularized multiple kernel learning algorithm. The first optimization algorithm adapts a proximal gradient method for solving the ℓ 2, 1 norm of kernel weights, and use an accelerated method based on FISTA; the second one employs an iterative scheme based on an approximate gradient descent method. The results demonstrates that the FISTA-style accelerated proximal descent method is efficient for the ℓ 2, 1 norm formulation of multiple kernel learning with the theoretical guarantee of the convergence rate. Moreover, the experimental results demonstrate the effectiveness of the proposed methods in terms of Geometric mean (G-mean) and Area under the ROC curve (AUC), and significantly outperforms the competing methods. The proposed approach exhibits some remarkable advantages both in heterogeneous feature subsets fusion and classification phases. Compared with the fusion strategies of feature-level and decision level, the proposed ℓ 2, 1 norm multi-kernel learning algorithm is able to accurately fuse the complementary and heterogeneous feature sets, and automatically prune the irrelevant and redundant feature subsets to form a more discriminative feature set, leading a promising classification performance. Moreover, the proposed algorithm consistently outperforms the comparable classification approaches in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Noh, Hae Young; Rajagopal, Ram; Kiremidjian, Anne S.
2012-04-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method for the cases where the post-damage feature distribution is unknown a priori. This algorithm extracts features from structural vibration data using time-series analysis and then declares damage using the change point detection method. The change point detection method asymptotically minimizes detection delay for a given false alarm rate. The conventional method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori. Therefore, our algorithm estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using multiple sets of simulated data and a set of experimental data collected from a four-story steel special moment-resisting frame. Our algorithm was able to estimate the post-damage distribution consistently and resulted in detection delays only a few seconds longer than the delays from the conventional method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
Harmony Search as a Powerful Tool for Feature Selection in QSPR Study of the Drugs Lipophilicity.
Bahadori, Behnoosh; Atabati, Morteza
2017-01-01
Aims & Scope: Lipophilicity represents one of the most studied and most frequently used fundamental physicochemical properties. In the present work, harmony search (HS) algorithm is suggested to feature selection in quantitative structure-property relationship (QSPR) modeling to predict lipophilicity of neutral, acidic, basic and amphotheric drugs that were determined by UHPLC. Harmony search is a music-based metaheuristic optimization algorithm. It was affected by the observation that the aim of music is to search for a perfect state of harmony. Semi-empirical quantum-chemical calculations at AM1 level were used to find the optimum 3D geometry of the studied molecules and variant descriptors (1497 descriptors) were calculated by the Dragon software. The selected descriptors by harmony search algorithm (9 descriptors) were applied for model development using multiple linear regression (MLR). In comparison with other feature selection methods such as genetic algorithm and simulated annealing, harmony search algorithm has better results. The root mean square error (RMSE) with and without leave-one out cross validation (LOOCV) were obtained 0.417 and 0.302, respectively. The results were compared with those obtained from the genetic algorithm and simulated annealing methods and it showed that the HS is a helpful tool for feature selection with fine performance. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
LDA boost classification: boosting by topics
NASA Astrophysics Data System (ADS)
Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li
2012-12-01
AdaBoost is an efficacious classification algorithm especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems; so it affects classification performance seriously. This article proposed a novel classification algorithm called LDABoost based on boosting ideology which uses Latent Dirichlet Allocation (LDA) to modeling the feature space. Instead of using words or phrase, LDABoost use latent topics as the features. In this way, the feature dimension is significantly reduced. Improved Naïve Bayes (NB) is designed as the weaker classifier which keeps the efficiency advantage of classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration in this article is proposed for improving the accuracy by integrating weak classifiers into strong classifier in a more rational way. Mutual Information is used as metrics of weights allocation. The voting information and the categorization decision made by basis classifiers are fully utilized for generating the strong classifier. Experimental results reveals LDABoost making categorization in a low-dimensional space, it has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than different versions of AdaBoost, TC algorithms based on support vector machine and Neural Networks.
Leger, Stefan; Zwanenburg, Alex; Pilz, Karoline; Lohaus, Fabian; Linge, Annett; Zöphel, Klaus; Kotzerke, Jörg; Schreiber, Andreas; Tinhofer, Inge; Budach, Volker; Sak, Ali; Stuschke, Martin; Balermpas, Panagiotis; Rödel, Claus; Ganswindt, Ute; Belka, Claus; Pigorsch, Steffi; Combs, Stephanie E; Mönnich, David; Zips, Daniel; Krause, Mechthild; Baumann, Michael; Troost, Esther G C; Löck, Steffen; Richter, Christian
2017-10-16
Radiomics applies machine learning algorithms to quantitative imaging data to characterise the tumour phenotype and predict clinical outcome. For the development of radiomics risk models, a variety of different algorithms is available and it is not clear which one gives optimal results. Therefore, we assessed the performance of 11 machine learning algorithms combined with 12 feature selection methods by the concordance index (C-Index), to predict loco-regional tumour control (LRC) and overall survival for patients with head and neck squamous cell carcinoma. The considered algorithms are able to deal with continuous time-to-event survival data. Feature selection and model building were performed on a multicentre cohort (213 patients) and validated using an independent cohort (80 patients). We found several combinations of machine learning algorithms and feature selection methods which achieve similar results, e.g. C-Index = 0.71 and BT-COX: C-Index = 0.70 in combination with Spearman feature selection. Using the best performing models, patients were stratified into groups of low and high risk of recurrence. Significant differences in LRC were obtained between both groups on the validation cohort. Based on the presented analysis, we identified a subset of algorithms which should be considered in future radiomics studies to develop stable and clinically relevant predictive models for time-to-event endpoints.
Multispectra CWT-based algorithm (MCWT) in mass spectra for peak extraction.
Hsueh, Huey-Miin; Kuo, Hsun-Chih; Tsai, Chen-An
2008-01-01
An important objective in mass spectrometry (MS) is to identify a set of biomarkers that can be used to potentially distinguish patients between distinct treatments (or conditions) from tens or hundreds of spectra. A common two-step approach involving peak extraction and quantification is employed to identify the features of scientific interest. The selected features are then used for further investigation to understand underlying biological mechanism of individual protein or for development of genomic biomarkers to early diagnosis. However, the use of inadequate or ineffective peak detection and peak alignment algorithms in peak extraction step may lead to a high rate of false positives. Also, it is crucial to reduce the false positive rate in detecting biomarkers from ten or hundreds of spectra. Here a new procedure is introduced for feature extraction in mass spectrometry data that extends the continuous wavelet transform-based (CWT-based) algorithm to multiple spectra. The proposed multispectra CWT-based algorithm (MCWT) not only can perform peak detection for multiple spectra but also carry out peak alignment at the same time. The author' MCWT algorithm constructs a reference, which integrates information of multiple raw spectra, for feature extraction. The algorithm is applied to a SELDI-TOF mass spectra data set provided by CAMDA 2006 with known polypeptide m/z positions. This new approach is easy to implement and it outperforms the existing peak extraction method from the Bioconductor PROcess package.
2017-01-01
Electroencephalogram (EEG)-based decoding human brain activity is challenging, owing to the low spatial resolution of EEG. However, EEG is an important technique, especially for brain–computer interface applications. In this study, a novel algorithm is proposed to decode brain activity associated with different types of images. In this hybrid algorithm, convolutional neural network is modified for the extraction of features, a t-test is used for the selection of significant features and likelihood ratio-based score fusion is used for the prediction of brain activity. The proposed algorithm takes input data from multichannel EEG time-series, which is also known as multivariate pattern analysis. Comprehensive analysis was conducted using data from 30 participants. The results from the proposed method are compared with current recognized feature extraction and classification/prediction techniques. The wavelet transform-support vector machine method is the most popular currently used feature extraction and prediction method. This method showed an accuracy of 65.7%. However, the proposed method predicts the novel data with improved accuracy of 79.9%. In conclusion, the proposed algorithm outperformed the current feature extraction and prediction method. PMID:28558002
Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline.
Zhang, Jie; Li, Qingyang; Caselli, Richard J; Thompson, Paul M; Ye, Jieping; Wang, Yalin
2017-06-01
Alzheimer's Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, Multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics researches. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as a supervised learning scheme. Its drawback is with either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms.
Feature extraction and classification algorithms for high dimensional data
NASA Technical Reports Server (NTRS)
Lee, Chulhee; Landgrebe, David
1993-01-01
Feature extraction and classification algorithms for high dimensional data are investigated. Developments with regard to sensors for Earth observation are moving in the direction of providing much higher dimensional multispectral imagery than is now possible. In analyzing such high dimensional data, processing time becomes an important factor. With large increases in dimensionality and the number of classes, processing time will increase significantly. To address this problem, a multistage classification scheme is proposed which reduces the processing time substantially by eliminating unlikely classes from further consideration at each stage. Several truncation criteria are developed and the relationship between thresholds and the error caused by the truncation is investigated. Next an approach to feature extraction for classification is proposed based directly on the decision boundaries. It is shown that all the features needed for classification can be extracted from decision boundaries. A characteristic of the proposed method arises by noting that only a portion of the decision boundary is effective in discriminating between classes, and the concept of the effective decision boundary is introduced. The proposed feature extraction algorithm has several desirable properties: it predicts the minimum number of features necessary to achieve the same classification accuracy as in the original space for a given pattern recognition problem; and it finds the necessary feature vectors. The proposed algorithm does not deteriorate under the circumstances of equal means or equal covariances as some previous algorithms do. In addition, the decision boundary feature extraction algorithm can be used both for parametric and non-parametric classifiers. Finally, some problems encountered in analyzing high dimensional data are studied and possible solutions are proposed. First, the increased importance of the second order statistics in analyzing high dimensional data is recognized. By investigating the characteristics of high dimensional data, the reason why the second order statistics must be taken into account in high dimensional data is suggested. Recognizing the importance of the second order statistics, there is a need to represent the second order statistics. A method to visualize statistics using a color code is proposed. By representing statistics using color coding, one can easily extract and compare the first and the second statistics.
Learning to rank using user clicks and visual features for image retrieval.
Yu, Jun; Tao, Dacheng; Wang, Meng; Rui, Yong
2015-04-01
The inconsistency between textual features and visual contents can cause poor image search results. To solve this problem, click features, which are more reliable than textual information in justifying the relevance between a query and clicked images, are adopted in image ranking model. However, the existing ranking model cannot integrate visual features, which are efficient in refining the click-based search results. In this paper, we propose a novel ranking model based on the learning to rank framework. Visual features and click features are simultaneously utilized to obtain the ranking model. Specifically, the proposed approach is based on large margin structured output learning and the visual consistency is integrated with the click features through a hypergraph regularizer term. In accordance with the fast alternating linearization method, we design a novel algorithm to optimize the objective function. This algorithm alternately minimizes two different approximations of the original objective function by keeping one function unchanged and linearizing the other. We conduct experiments on a large-scale dataset collected from the Microsoft Bing image search engine, and the results demonstrate that the proposed learning to rank models based on visual features and user clicks outperforms state-of-the-art algorithms.
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.
Siuly; Li, Yan; Paul Wen, Peng
2014-03-01
Motor imagery (MI) tasks classification provides an important basis for designing brain-computer interface (BCI) systems. If the MI tasks are reliably distinguished through identifying typical patterns in electroencephalography (EEG) data, a motor disabled people could communicate with a device by composing sequences of these mental states. In our earlier study, we developed a cross-correlation based logistic regression (CC-LR) algorithm for the classification of MI tasks for BCI applications, but its performance was not satisfactory. This study develops a modified version of the CC-LR algorithm exploring a suitable feature set that can improve the performance. The modified CC-LR algorithm uses the C3 electrode channel (in the international 10-20 system) as a reference channel for the cross-correlation (CC) technique and applies three diverse feature sets separately, as the input to the logistic regression (LR) classifier. The present algorithm investigates which feature set is the best to characterize the distribution of MI tasks based EEG data. This study also provides an insight into how to select a reference channel for the CC technique with EEG signals considering the anatomical structure of the human brain. The proposed algorithm is compared with eight of the most recently reported well-known methods including the BCI III Winner algorithm. The findings of this study indicate that the modified CC-LR algorithm has potential to improve the identification performance of MI tasks in BCI systems. The results demonstrate that the proposed technique provides a classification improvement over the existing methods tested. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Model-based vision for space applications
NASA Technical Reports Server (NTRS)
Chaconas, Karen; Nashman, Marilyn; Lumia, Ronald
1992-01-01
This paper describes a method for tracking moving image features by combining spatial and temporal edge information with model based feature information. The algorithm updates the two-dimensional position of object features by correlating predicted model features with current image data. The results of the correlation process are used to compute an updated model. The algorithm makes use of a high temporal sampling rate with respect to spatial changes of the image features and operates in a real-time multiprocessing environment. Preliminary results demonstrate successful tracking for image feature velocities between 1.1 and 4.5 pixels every image frame. This work has applications for docking, assembly, retrieval of floating objects and a host of other space-related tasks.
Muñoz, Mario A; Smith-Miles, Kate A
2017-01-01
This article presents a method for the objective assessment of an algorithm's strengths and weaknesses. Instead of examining the performance of only one or more algorithms on a benchmark set, or generating custom problems that maximize the performance difference between two algorithms, our method quantifies both the nature of the test instances and the algorithm performance. Our aim is to gather information about possible phase transitions in performance, that is, the points in which a small change in problem structure produces algorithm failure. The method is based on the accurate estimation and characterization of the algorithm footprints, that is, the regions of instance space in which good or exceptional performance is expected from an algorithm. A footprint can be estimated for each algorithm and for the overall portfolio. Therefore, we select a set of features to generate a common instance space, which we validate by constructing a sufficiently accurate prediction model. We characterize the footprints by their area and density. Our method identifies complementary performance between algorithms, quantifies the common features of hard problems, and locates regions where a phase transition may lie.
The algorithm of fast image stitching based on multi-feature extraction
NASA Astrophysics Data System (ADS)
Yang, Chunde; Wu, Ge; Shi, Jing
2018-05-01
This paper proposed an improved image registration method combining Hu-based invariant moment contour information and feature points detection, aiming to solve the problems in traditional image stitching algorithm, such as time-consuming feature points extraction process, redundant invalid information overload and inefficiency. First, use the neighborhood of pixels to extract the contour information, employing the Hu invariant moment as similarity measure to extract SIFT feature points in those similar regions. Then replace the Euclidean distance with Hellinger kernel function to improve the initial matching efficiency and get less mismatching points, further, estimate affine transformation matrix between the images. Finally, local color mapping method is adopted to solve uneven exposure, using the improved multiresolution fusion algorithm to fuse the mosaic images and realize seamless stitching. Experimental results confirm high accuracy and efficiency of method proposed in this paper.
Combined rule extraction and feature elimination in supervised classification.
Liu, Sheng; Patel, Ronak Y; Daga, Pankaj R; Liu, Haining; Fu, Gang; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E
2012-09-01
There are a vast number of biology related research problems involving a combination of multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
Reference-free automatic quality assessment of tracheoesophageal speech.
Huang, Andy; Falk, Tiago H; Chan, Wai-Yip; Parsa, Vijay; Doyle, Philip
2009-01-01
Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.
A roadmap of clustering algorithms: finding a match for a biomedical application.
Andreopoulos, Bill; An, Aijun; Wang, Xiaogang; Schroeder, Michael
2009-05-01
Clustering is ubiquitously applied in bioinformatics with hierarchical clustering and k-means partitioning being the most popular methods. Numerous improvements of these two clustering methods have been introduced, as well as completely different approaches such as grid-based, density-based and model-based clustering. For improved bioinformatics analysis of data, it is important to match clusterings to the requirements of a biomedical application. In this article, we present a set of desirable clustering features that are used as evaluation criteria for clustering algorithms. We review 40 different clustering algorithms of all approaches and datatypes. We compare algorithms on the basis of desirable clustering features, and outline algorithms' benefits and drawbacks as a basis for matching them to biomedical applications.
Linear-time general decoding algorithm for the surface code
NASA Astrophysics Data System (ADS)
Darmawan, Andrew S.; Poulin, David
2018-05-01
A quantum error correcting protocol can be substantially improved by taking into account features of the physical noise process. We present an efficient decoder for the surface code which can account for general noise features, including coherences and correlations. We demonstrate that the decoder significantly outperforms the conventional matching algorithm on a variety of noise models, including non-Pauli noise and spatially correlated noise. The algorithm is based on an approximate calculation of the logical channel using a tensor-network description of the noisy state.
The method for froth floatation condition recognition based on adaptive feature weighted
NASA Astrophysics Data System (ADS)
Wang, Jieran; Zhang, Jun; Tian, Jinwen; Zhang, Daimeng; Liu, Xiaomao
2018-03-01
The fusion of foam characteristics can play a complementary role in expressing the content of foam image. The weight of foam characteristics is the key to make full use of the relationship between the different features. In this paper, an Adaptive Feature Weighted Method For Froth Floatation Condition Recognition is proposed. Foam features without and with weights are both classified by using support vector machine (SVM).The classification accuracy and optimal equaling algorithm under the each ore grade are regarded as the result of the adaptive feature weighting algorithm. At the same time the effectiveness of adaptive weighted method is demonstrated.
Fast linear feature detection using multiple directional non-maximum suppression.
Sun, C; Vallotton, P
2009-05-01
The capacity to detect linear features is central to image analysis, computer vision and pattern recognition and has practical applications in areas such as neurite outgrowth detection, retinal vessel extraction, skin hair removal, plant root analysis and road detection. Linear feature detection often represents the starting point for image segmentation and image interpretation. In this paper, we present a new algorithm for linear feature detection using multiple directional non-maximum suppression with symmetry checking and gap linking. Given its low computational complexity, the algorithm is very fast. We show in several examples that it performs very well in terms of both sensitivity and continuity of detected linear features.
Efficient feature selection using a hybrid algorithm for the task of epileptic seizure detection
NASA Astrophysics Data System (ADS)
Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline
2014-07-01
Feature selection is a very important aspect in the field of machine learning. It entails the search of an optimal subset from a very large data set with high dimensional feature space. Apart from eliminating redundant features and reducing computational cost, a good selection of feature also leads to higher prediction and classification accuracy. In this paper, an efficient feature selection technique is introduced in the task of epileptic seizure detection. The raw data are electroencephalography (EEG) signals. Using discrete wavelet transform, the biomedical signals were decomposed into several sets of wavelet coefficients. To reduce the dimension of these wavelet coefficients, a feature selection method that combines the strength of both filter and wrapper methods is proposed. Principal component analysis (PCA) is used as part of the filter method. As for wrapper method, the evolutionary harmony search (HS) algorithm is employed. This metaheuristic method aims at finding the best discriminating set of features from the original data. The obtained features were then used as input for an automated classifier, namely wavelet neural networks (WNNs). The WNNs model was trained to perform a binary classification task, that is, to determine whether a given EEG signal was normal or epileptic. For comparison purposes, different sets of features were also used as input. Simulation results showed that the WNNs that used the features chosen by the hybrid algorithm achieved the highest overall classification accuracy.
Vivekanandan, T; Sriman Narayana Iyengar, N Ch
2017-11-01
Enormous data growth in multiple domains has posed a great challenge for data processing and analysis techniques. In particular, the traditional record maintenance strategy has been replaced in the healthcare system. It is vital to develop a model that is able to handle the huge amount of e-healthcare data efficiently. In this paper, the challenging tasks of selecting critical features from the enormous set of available features and diagnosing heart disease are carried out. Feature selection is one of the most widely used pre-processing steps in classification problems. A modified differential evolution (DE) algorithm is used to perform feature selection for cardiovascular disease and optimization of selected features. Of the 10 available strategies for the traditional DE algorithm, the seventh strategy, which is represented by DE/rand/2/exp, is considered for comparative study. The performance analysis of the developed modified DE strategy is given in this paper. With the selected critical features, prediction of heart disease is carried out using fuzzy AHP and a feed-forward neural network. Various performance measures of integrating the modified differential evolution algorithm with fuzzy AHP and a feed-forward neural network in the prediction of heart disease are evaluated in this paper. The accuracy of the proposed hybrid model is 83%, which is higher than that of some other existing models. In addition, the prediction time of the proposed hybrid model is also evaluated and has shown promising results. Copyright © 2017 Elsevier Ltd. All rights reserved.
Guo, Hao; Cao, Xiaohua; Liu, Zhifen; Li, Haifang; Chen, Junjie; Zhang, Kerang
2012-12-05
Resting state functional brain networks have been widely studied in brain disease research. However, it is currently unclear whether abnormal resting state functional brain network metrics can be used with machine learning for the classification of brain diseases. Resting state functional brain networks were constructed for 28 healthy controls and 38 major depressive disorder patients by thresholding partial correlation matrices of 90 regions. Three nodal metrics were calculated using graph theory-based approaches. Nonparametric permutation tests were then used for group comparisons of topological metrics, which were used as classified features in six different algorithms. We used statistical significance as the threshold for selecting features and measured the accuracies of six classifiers with different number of features. A sensitivity analysis method was used to evaluate the importance of different features. The result indicated that some of the regions exhibited significantly abnormal nodal centralities, including the limbic system, basal ganglia, medial temporal, and prefrontal regions. Support vector machine with radial basis kernel function algorithm and neural network algorithm exhibited the highest average accuracy (79.27 and 78.22%, respectively) with 28 features (P<0.05). Correlation analysis between feature importance and the statistical significance of metrics was investigated, and the results revealed a strong positive correlation between them. Overall, the current study demonstrated that major depressive disorder is associated with abnormal functional brain network topological metrics and statistically significant nodal metrics can be successfully used for feature selection in classification algorithms.
Facial Expression Recognition with Fusion Features Extracted from Salient Facial Areas.
Liu, Yanpeng; Li, Yibin; Ma, Xin; Song, Rui
2017-03-29
In the pattern recognition domain, deep architectures are currently widely used and they have achieved fine results. However, these deep architectures make particular demands, especially in terms of their requirement for big datasets and GPU. Aiming to gain better results without deep networks, we propose a simplified algorithm framework using fusion features extracted from the salient areas of faces. Furthermore, the proposed algorithm has achieved a better result than some deep architectures. For extracting more effective features, this paper firstly defines the salient areas on the faces. This paper normalizes the salient areas of the same location in the faces to the same size; therefore, it can extracts more similar features from different subjects. LBP and HOG features are extracted from the salient areas, fusion features' dimensions are reduced by Principal Component Analysis (PCA) and we apply several classifiers to classify the six basic expressions at once. This paper proposes a salient areas definitude method which uses peak expressions frames compared with neutral faces. This paper also proposes and applies the idea of normalizing the salient areas to align the specific areas which express the different expressions. As a result, the salient areas found from different subjects are the same size. In addition, the gamma correction method is firstly applied on LBP features in our algorithm framework which improves our recognition rates significantly. By applying this algorithm framework, our research has gained state-of-the-art performances on CK+ database and JAFFE database.
The Roles of Suprasegmental Features in Predicting English Oral Proficiency with an Automated System
ERIC Educational Resources Information Center
Kang, Okim; Johnson, David
2018-01-01
Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically scores the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups…
Feature detection on 3D images of dental imprints
NASA Astrophysics Data System (ADS)
Mokhtari, Marielle; Laurendeau, Denis
1994-09-01
A computer vision approach for the extraction of feature points on 3D images of dental imprints is presented. The position of feature points are needed for the measurement of a set of parameters for automatic diagnosis of malocclusion problems in orthodontics. The system for the acquisition of the 3D profile of the imprint, the procedure for the detection of the interstices between teeth, and the approach for the identification of the type of tooth are described, as well as the algorithm for the reconstruction of the surface of each type of tooth. A new approach for the detection of feature points, called the watershed algorithm, is described in detail. The algorithm is a two-stage procedure which tracks the position of local minima at four different scales and produces a final map of the position of the minima. Experimental results of the application of the watershed algorithm on actual 3D images of dental imprints are presented for molars, premolars and canines. The segmentation approach for the analysis of the shape of incisors is also described in detail.
A Survey on the Feasibility of Sound Classification on Wireless Sensor Nodes
Salomons, Etto L.; Havinga, Paul J. M.
2015-01-01
Wireless sensor networks are suitable to gain context awareness for indoor environments. As sound waves form a rich source of context information, equipping the nodes with microphones can be of great benefit. The algorithms to extract features from sound waves are often highly computationally intensive. This can be problematic as wireless nodes are usually restricted in resources. In order to be able to make a proper decision about which features to use, we survey how sound is used in the literature for global sound classification, age and gender classification, emotion recognition, person verification and identification and indoor and outdoor environmental sound classification. The results of the surveyed algorithms are compared with respect to accuracy and computational load. The accuracies are taken from the surveyed papers; the computational loads are determined by benchmarking the algorithms on an actual sensor node. We conclude that for indoor context awareness, the low-cost algorithms for feature extraction perform equally well as the more computationally-intensive variants. As the feature extraction still requires a large amount of processing time, we present four possible strategies to deal with this problem. PMID:25822142
Quantum algorithms for topological and geometric analysis of data
Lloyd, Seth; Garnerone, Silvano; Zanardi, Paolo
2016-01-01
Extracting useful information from large data sets can be a daunting task. Topological methods for analysing data sets provide a powerful technique for extracting such information. Persistent homology is a sophisticated tool for identifying topological features and for determining how such features persist as the data is viewed at different scales. Here we present quantum machine learning algorithms for calculating Betti numbers—the numbers of connected components, holes and voids—in persistent homology, and for finding eigenvectors and eigenvalues of the combinatorial Laplacian. The algorithms provide an exponential speed-up over the best currently known classical algorithms for topological data analysis. PMID:26806491
A biomimetic algorithm for the improved detection of microarray features
NASA Astrophysics Data System (ADS)
Nicolau, Dan V., Jr.; Nicolau, Dan V.; Maini, Philip K.
2007-02-01
One the major difficulties of microarray technology relate to the processing of large and - importantly - error-loaded images of the dots on the chip surface. Whatever the source of these errors, those obtained in the first stage of data acquisition - segmentation - are passed down to the subsequent processes, with deleterious results. As it has been demonstrated recently that biological systems have evolved algorithms that are mathematically efficient, this contribution attempts to test an algorithm that mimics a bacterial-"patented" algorithm for the search of available space and nutrients to find, "zero-in" and eventually delimitate the features existent on the microarray surface.
Chaotic particle swarm optimization with mutation for classification.
Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza
2015-01-01
In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allows us to overcome the problem of early convergence into a local minima associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and it tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is checked out with three sets of data classifications namely, Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, particle swarm-classifier, genetic algorithm, and Imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms.
Efficient enumeration of monocyclic chemical graphs with given path frequencies
2014-01-01
Background The enumeration of chemical graphs (molecular graphs) satisfying given constraints is one of the fundamental problems in chemoinformatics and bioinformatics because it leads to a variety of useful applications including structure determination and development of novel chemical compounds. Results We consider the problem of enumerating chemical graphs with monocyclic structure (a graph structure that contains exactly one cycle) from a given set of feature vectors, where a feature vector represents the frequency of the prescribed paths in a chemical compound to be constructed and the set is specified by a pair of upper and lower feature vectors. To enumerate all tree-like (acyclic) chemical graphs from a given set of feature vectors, Shimizu et al. and Suzuki et al. proposed efficient branch-and-bound algorithms based on a fast tree enumeration algorithm. In this study, we devise a novel method for extending these algorithms to enumeration of chemical graphs with monocyclic structure by designing a fast algorithm for testing uniqueness. The results of computational experiments reveal that the computational efficiency of the new algorithm is as good as those for enumeration of tree-like chemical compounds. Conclusions We succeed in expanding the class of chemical graphs that are able to be enumerated efficiently. PMID:24955135
Holistic approach for automated background EEG assessment in asphyxiated full-term infants
NASA Astrophysics Data System (ADS)
Matic, Vladimir; Cherian, Perumpillichira J.; Koolen, Ninah; Naulaers, Gunnar; Swarte, Renate M.; Govaert, Paul; Van Huffel, Sabine; De Vos, Maarten
2014-12-01
Objective. To develop an automated algorithm to quantify background EEG abnormalities in full-term neonates with hypoxic ischemic encephalopathy. Approach. The algorithm classifies 1 h of continuous neonatal EEG (cEEG) into a mild, moderate or severe background abnormality grade. These classes are well established in the literature and a clinical neurophysiologist labeled 272 1 h cEEG epochs selected from 34 neonates. The algorithm is based on adaptive EEG segmentation and mapping of the segments into the so-called segments’ feature space. Three features are suggested and further processing is obtained using a discretized three-dimensional distribution of the segments’ features represented as a 3-way data tensor. Further classification has been achieved using recently developed tensor decomposition/classification methods that reduce the size of the model and extract a significant and discriminative set of features. Main results. Effective parameterization of cEEG data has been achieved resulting in high classification accuracy (89%) to grade background EEG abnormalities. Significance. For the first time, the algorithm for the background EEG assessment has been validated on an extensive dataset which contained major artifacts and epileptic seizures. The demonstrated high robustness, while processing real-case EEGs, suggests that the algorithm can be used as an assistive tool to monitor the severity of hypoxic insults in newborns.
Segmenting texts from outdoor images taken by mobile phones using color features
NASA Astrophysics Data System (ADS)
Liu, Zongyi; Zhou, Hanning
2011-01-01
Recognizing texts from images taken by mobile phones with low resolution has wide applications. It has been shown that a good image binarization can substantially improve the performances of OCR engines. In this paper, we present a framework to segment texts from outdoor images taken by mobile phones using color features. The framework consists of three steps: (i) the initial process including image enhancement, binarization and noise filtering, where we binarize the input images in each RGB channel, and apply component level noise filtering; (ii) grouping components into blocks using color features, where we compute the component similarities by dynamically adjusting the weights of RGB channels, and merge groups hierachically, and (iii) blocks selection, where we use the run-length features and choose the Support Vector Machine (SVM) as the classifier. We tested the algorithm using 13 outdoor images taken by an old-style LG-64693 mobile phone with 640x480 resolution. We compared the segmentation results with Tsar's algorithm, a state-of-the-art camera text detection algorithm, and show that our algorithm is more robust, particularly in terms of the false alarm rates. In addition, we also evaluated the impacts of our algorithm on the Abbyy's FineReader, one of the most popular commercial OCR engines in the market.
Enabling phenotypic big data with PheNorm.
Yu, Sheng; Ma, Yumeng; Gronsbell, Jessica; Cai, Tianrun; Ananthakrishnan, Ashwin N; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Liao, Katherine P; Cai, Tianxi
2018-01-01
Electronic health record (EHR)-based phenotyping infers whether a patient has a disease based on the information in his or her EHR. A human-annotated training set with gold-standard disease status labels is usually required to build an algorithm for phenotyping based on a set of predictive features. The time intensiveness of annotation and feature curation severely limits the ability to achieve high-throughput phenotyping. While previous studies have successfully automated feature curation, annotation remains a major bottleneck. In this paper, we present PheNorm, a phenotyping algorithm that does not require expert-labeled samples for training. The most predictive features, such as the number of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes or mentions of the target phenotype, are normalized to resemble a normal mixture distribution with high area under the receiver operating curve (AUC) for prediction. The transformed features are then denoised and combined into a score for accurate disease classification. We validated the accuracy of PheNorm with 4 phenotypes: coronary artery disease, rheumatoid arthritis, Crohn's disease, and ulcerative colitis. The AUCs of the PheNorm score reached 0.90, 0.94, 0.95, and 0.94 for the 4 phenotypes, respectively, which were comparable to the accuracy of supervised algorithms trained with sample sizes of 100-300, with no statistically significant difference. The accuracy of the PheNorm algorithms is on par with algorithms trained with annotated samples. PheNorm fully automates the generation of accurate phenotyping algorithms and demonstrates the capacity for EHR-driven annotations to scale to the next level - phenotypic big data. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
NASA Astrophysics Data System (ADS)
Srinivasan, Yeshwanth; Hernes, Dana; Tulpule, Bhakti; Yang, Shuyu; Guo, Jiangling; Mitra, Sunanda; Yagneswaran, Sriraja; Nutter, Brian; Jeronimo, Jose; Phillips, Benny; Long, Rodney; Ferris, Daron
2005-04-01
Automated segmentation and classification of diagnostic markers in medical imagery are challenging tasks. Numerous algorithms for segmentation and classification based on statistical approaches of varying complexity are found in the literature. However, the design of an efficient and automated algorithm for precise classification of desired diagnostic markers is extremely image-specific. The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating an archive of 60,000 digitized color images of the uterine cervix. NLM is developing tools for the analysis and dissemination of these images over the Web for the study of visual features correlated with precancerous neoplasia and cancer. To enable indexing of images of the cervix, it is essential to develop algorithms for the segmentation of regions of interest, such as acetowhitened regions, and automatic identification and classification of regions exhibiting mosaicism and punctation. Success of such algorithms depends, primarily, on the selection of relevant features representing the region of interest. We present color and geometric features based statistical classification and segmentation algorithms yielding excellent identification of the regions of interest. The distinct classification of the mosaic regions from the non-mosaic ones has been obtained by clustering multiple geometric and color features of the segmented sections using various morphological and statistical approaches. Such automated classification methodologies will facilitate content-based image retrieval from the digital archive of uterine cervix and have the potential of developing an image based screening tool for cervical cancer.
Detection and clustering of features in aerial images by neuron network-based algorithm
NASA Astrophysics Data System (ADS)
Vozenilek, Vit
2015-12-01
The paper presents the algorithm for detection and clustering of feature in aerial photographs based on artificial neural networks. The presented approach is not focused on the detection of specific topographic features, but on the combination of general features analysis and their use for clustering and backward projection of clusters to aerial image. The basis of the algorithm is a calculation of the total error of the network and a change of weights of the network to minimize the error. A classic bipolar sigmoid was used for the activation function of the neurons and the basic method of backpropagation was used for learning. To verify that a set of features is able to represent the image content from the user's perspective, the web application was compiled (ASP.NET on the Microsoft .NET platform). The main achievements include the knowledge that man-made objects in aerial images can be successfully identified by detection of shapes and anomalies. It was also found that the appropriate combination of comprehensive features that describe the colors and selected shapes of individual areas can be useful for image analysis.
Classifier dependent feature preprocessing methods
NASA Astrophysics Data System (ADS)
Rodriguez, Benjamin M., II; Peterson, Gilbert L.
2008-04-01
In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution to applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in an attempt to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of processing a majority of the available classification algorithms without concern of processing while the same is not true on mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis, determining if an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.
Cluster compression algorithm: A joint clustering/data compression concept
NASA Technical Reports Server (NTRS)
Hilbert, E. E.
1977-01-01
The Cluster Compression Algorithm (CCA), which was developed to reduce costs associated with transmitting, storing, distributing, and interpreting LANDSAT multispectral image data is described. The CCA is a preprocessing algorithm that uses feature extraction and data compression to more efficiently represent the information in the image data. The format of the preprocessed data enables simply a look-up table decoding and direct use of the extracted features to reduce user computation for either image reconstruction, or computer interpretation of the image data. Basically, the CCA uses spatially local clustering to extract features from the image data to describe spectral characteristics of the data set. In addition, the features may be used to form a sequence of scalar numbers that define each picture element in terms of the cluster features. This sequence, called the feature map, is then efficiently represented by using source encoding concepts. Various forms of the CCA are defined and experimental results are presented to show trade-offs and characteristics of the various implementations. Examples are provided that demonstrate the application of the cluster compression concept to multi-spectral images from LANDSAT and other sources.
Firefly Mating Algorithm for Continuous Optimization Problems
Ritthipakdee, Amarita; Premasathian, Nol; Jitkongchuen, Duangjai
2017-01-01
This paper proposes a swarm intelligence algorithm, called firefly mating algorithm (FMA), for solving continuous optimization problems. FMA uses genetic algorithm as the core of the algorithm. The main feature of the algorithm is a novel mating pair selection method which is inspired by the following 2 mating behaviors of fireflies in nature: (i) the mutual attraction between males and females causes them to mate and (ii) fireflies of both sexes are of the multiple-mating type, mating with multiple opposite sex partners. A female continues mating until her spermatheca becomes full, and, in the same vein, a male can provide sperms for several females until his sperm reservoir is depleted. This new feature enhances the global convergence capability of the algorithm. The performance of FMA was tested with 20 benchmark functions (sixteen 30-dimensional functions and four 2-dimensional ones) against FA, ALC-PSO, COA, MCPSO, LWGSODE, MPSODDS, DFOA, SHPSOS, LSA, MPDPGA, DE, and GABC algorithms. The experimental results showed that the success rates of our proposed algorithm with these functions were higher than those of other algorithms and the proposed algorithm also required fewer numbers of iterations to reach the global optima. PMID:28808442
Firefly Mating Algorithm for Continuous Optimization Problems.
Ritthipakdee, Amarita; Thammano, Arit; Premasathian, Nol; Jitkongchuen, Duangjai
2017-01-01
This paper proposes a swarm intelligence algorithm, called firefly mating algorithm (FMA), for solving continuous optimization problems. FMA uses genetic algorithm as the core of the algorithm. The main feature of the algorithm is a novel mating pair selection method which is inspired by the following 2 mating behaviors of fireflies in nature: (i) the mutual attraction between males and females causes them to mate and (ii) fireflies of both sexes are of the multiple-mating type, mating with multiple opposite sex partners. A female continues mating until her spermatheca becomes full, and, in the same vein, a male can provide sperms for several females until his sperm reservoir is depleted. This new feature enhances the global convergence capability of the algorithm. The performance of FMA was tested with 20 benchmark functions (sixteen 30-dimensional functions and four 2-dimensional ones) against FA, ALC-PSO, COA, MCPSO, LWGSODE, MPSODDS, DFOA, SHPSOS, LSA, MPDPGA, DE, and GABC algorithms. The experimental results showed that the success rates of our proposed algorithm with these functions were higher than those of other algorithms and the proposed algorithm also required fewer numbers of iterations to reach the global optima.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hui, C; Suh, Y; Robertson, D
Purpose: To develop a novel algorithm to generate internal respiratory signals for sorting of four-dimensional (4D) computed tomography (CT) images. Methods: The proposed algorithm extracted multiple time resolved features as potential respiratory signals. These features were taken from the 4D CT images and its Fourier transformed space. Several low-frequency locations in the Fourier space and selected anatomical features from the images were used as potential respiratory signals. A clustering algorithm was then used to search for the group of appropriate potential respiratory signals. The chosen signals were then normalized and averaged to form the final internal respiratory signal. Performance ofmore » the algorithm was tested in 50 4D CT data sets and results were compared with external signals from the real-time position management (RPM) system. Results: In almost all cases, the proposed algorithm generated internal respiratory signals that visibly matched the external respiratory signals from the RPM system. On average, the end inspiration times calculated by the proposed algorithm were within 0.1 s of those given by the RPM system. Less than 3% of the calculated end inspiration times were more than one time frame away from those given by the RPM system. In 3 out of the 50 cases, the proposed algorithm generated internal respiratory signals that were significantly smoother than the RPM signals. In these cases, images sorted using the internal respiratory signals showed fewer artifacts in locations corresponding to the discrepancy in the internal and external respiratory signals. Conclusion: We developed a robust algorithm that generates internal respiratory signals from 4D CT images. In some cases, it even showed the potential to outperform the RPM system. The proposed algorithm is completely automatic and generally takes less than 2 min to process. It can be easily implemented into the clinic and can potentially replace the use of external surrogates.« less
Hamilton, Lei; McConley, Marc; Angermueller, Kai; Goldberg, David; Corba, Massimiliano; Kim, Louis; Moran, James; Parks, Philip D; Sang Chin; Widge, Alik S; Dougherty, Darin D; Eskandar, Emad N
2015-08-01
A fully autonomous intracranial device is built to continually record neural activities in different parts of the brain, process these sampled signals, decode features that correlate to behaviors and neuropsychiatric states, and use these features to deliver brain stimulation in a closed-loop fashion. In this paper, we describe the sampling and stimulation aspects of such a device. We first describe the signal processing algorithms of two unsupervised spike sorting methods. Next, we describe the LFP time-frequency analysis and feature derivation from the two spike sorting methods. Spike sorting includes a novel approach to constructing a dictionary learning algorithm in a Compressed Sensing (CS) framework. We present a joint prediction scheme to determine the class of neural spikes in the dictionary learning framework; and, the second approach is a modified OSort algorithm which is implemented in a distributed system optimized for power efficiency. Furthermore, sorted spikes and time-frequency analysis of LFP signals can be used to generate derived features (including cross-frequency coupling, spike-field coupling). We then show how these derived features can be used in the design and development of novel decode and closed-loop control algorithms that are optimized to apply deep brain stimulation based on a patient's neuropsychiatric state. For the control algorithm, we define the state vector as representative of a patient's impulsivity, avoidance, inhibition, etc. Controller parameters are optimized to apply stimulation based on the state vector's current state as well as its historical values. The overall algorithm and software design for our implantable neural recording and stimulation system uses an innovative, adaptable, and reprogrammable architecture that enables advancement of the state-of-the-art in closed-loop neural control while also meeting the challenges of system power constraints and concurrent development with ongoing scientific research designed to define brain network connectivity and neural network dynamics that vary at the individual patient level and vary over time.
Algorithm-Dependent Generalization Bounds for Multi-Task Learning.
Liu, Tongliang; Tao, Dacheng; Song, Mingli; Maybank, Stephen J
2017-02-01
Often, tasks are collected for multi-task learning (MTL) because they share similar feature structures. Based on this observation, in this paper, we present novel algorithm-dependent generalization bounds for MTL by exploiting the notion of algorithmic stability. We focus on the performance of one particular task and the average performance over multiple tasks by analyzing the generalization ability of a common parameter that is shared in MTL. When focusing on one particular task, with the help of a mild assumption on the feature structures, we interpret the function of the other tasks as a regularizer that produces a specific inductive bias. The algorithm for learning the common parameter, as well as the predictor, is thereby uniformly stable with respect to the domain of the particular task and has a generalization bound with a fast convergence rate of order O(1/n), where n is the sample size of the particular task. When focusing on the average performance over multiple tasks, we prove that a similar inductive bias exists under certain conditions on the feature structures. Thus, the corresponding algorithm for learning the common parameter is also uniformly stable with respect to the domains of the multiple tasks, and its generalization bound is of the order O(1/T), where T is the number of tasks. These theoretical analyses naturally show that the similarity of feature structures in MTL will lead to specific regularizations for predicting, which enables the learning algorithms to generalize fast and correctly from a few examples.
NASA Astrophysics Data System (ADS)
Jia, Huizhen; Sun, Quansen; Ji, Zexuan; Wang, Tonghan; Chen, Qiang
2014-11-01
The goal of no-reference/blind image quality assessment (NR-IQA) is to devise a perceptual model that can accurately predict the quality of a distorted image as human opinions, in which feature extraction is an important issue. However, the features used in the state-of-the-art "general purpose" NR-IQA algorithms are usually natural scene statistics (NSS) based or are perceptually relevant; therefore, the performance of these models is limited. To further improve the performance of NR-IQA, we propose a general purpose NR-IQA algorithm which combines NSS-based features with perceptually relevant features. The new method extracts features in both the spatial and gradient domains. In the spatial domain, we extract the point-wise statistics for single pixel values which are characterized by a generalized Gaussian distribution model to form the underlying features. In the gradient domain, statistical features based on neighboring gradient magnitude similarity are extracted. Then a mapping is learned to predict quality scores using a support vector regression. The experimental results on the benchmark image databases demonstrate that the proposed algorithm correlates highly with human judgments of quality and leads to significant performance improvements over state-of-the-art methods.
Multi-label spacecraft electrical signal classification method based on DBN and random forest
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
In spacecraft electrical signal characteristic data, there exists a large amount of data with high-dimensional features, a high computational complexity degree, and a low rate of identification problems, which causes great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method that is based on deep belief networks (DBN) and a classification method that is based on the random forest (RF) algorithm; The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then, classification is applied. Firstly, we use the method of wavelet denoising, which was used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the rate of classification for the electrical characteristics data. Finally, we used the random forest algorithm to classify the data and comparing it with other algorithms. The experimental results show that compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data. PMID:28486479
Multi-label spacecraft electrical signal classification method based on DBN and random forest.
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
In spacecraft electrical signal characteristic data, there exists a large amount of data with high-dimensional features, a high computational complexity degree, and a low rate of identification problems, which causes great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method that is based on deep belief networks (DBN) and a classification method that is based on the random forest (RF) algorithm; The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then, classification is applied. Firstly, we use the method of wavelet denoising, which was used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the rate of classification for the electrical characteristics data. Finally, we used the random forest algorithm to classify the data and comparing it with other algorithms. The experimental results show that compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data.
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
PDC-SGB: Prediction of effective drug combinations using a stochastic gradient boosting algorithm.
Xu, Qian; Xiong, Yi; Dai, Hao; Kumari, Kotni Meena; Xu, Qin; Ou, Hong-Yu; Wei, Dong-Qing
2017-03-21
Combinatorial therapy is a promising strategy for combating complex diseases by improving the efficacy and reducing the side effects. To facilitate the identification of drug combinations in pharmacology, we proposed a new computational model, termed PDC-SGB, to predict effective drug combinations by integrating biological, chemical and pharmacological information based on a stochastic gradient boosting algorithm. To begin with, a set of 352 golden positive samples were collected from the public drug combination database. Then, a set of 732 dimensional feature vector involving biological, chemical and pharmaceutical information was constructed for each drug combination to describe its properties. To avoid overfitting, the maximum relevance & minimum redundancy (mRMR) method was performed to extract useful ones by removing redundant subsets. Based on the selected features, the three different type of classification algorithms were employed to build the drug combination prediction models. Our results demonstrated that the model based on the stochastic gradient boosting algorithm yield out the best performance. Furthermore, it is indicated that the feature patterns of therapy had powerful ability to discriminate effective drug combinations from non-effective ones. By analyzing various features, it is shown that the enriched features occurred frequently in golden positive samples can help predict novel drug combinations. Copyright © 2017 Elsevier Ltd. All rights reserved.
GridMass: a fast two-dimensional feature detection method for LC/MS.
Treviño, Victor; Yañez-Garza, Irma-Luz; Rodriguez-López, Carlos E; Urrea-López, Rafael; Garza-Rodriguez, Maria-Lourdes; Barrera-Saldaña, Hugo-Alberto; Tamez-Peña, José G; Winkler, Robert; Díaz de-la-Garza, Rocío-Isabel
2015-01-01
One of the initial and critical procedures for the analysis of metabolomics data using liquid chromatography and mass spectrometry is feature detection. Feature detection is the process to detect boundaries of the mass surface from raw data. It consists of detected abundances arranged in a two-dimensional (2D) matrix of mass/charge and elution time. MZmine 2 is one of the leading software environments that provide a full analysis pipeline for these data. However, the feature detection algorithms provided in MZmine 2 are based mainly on the analysis of one-dimension at a time. We propose GridMass, an efficient algorithm for 2D feature detection. The algorithm is based on landing probes across the chromatographic space that are moved to find local maxima providing accurate boundary estimations. We tested GridMass on a controlled marker experiment, on plasma samples, on plant fruits, and in a proteome sample. Compared with other algorithms, GridMass is faster and may achieve comparable or better sensitivity and specificity. As a proof of concept, GridMass has been implemented in Java under the MZmine 2 environment and is available at http://www.bioinformatica.mty.itesm.mx/GridMass and MASSyPup. It has also been submitted to the MZmine 2 developing community. Copyright © 2015 John Wiley & Sons, Ltd.
Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline
Zhang, Jie; Li, Qingyang; Caselli, Richard J.; Thompson, Paul M.; Ye, Jieping; Wang, Yalin
2017-01-01
Alzheimer’s Disease (AD) is the most common type of dementia. Identifying correct biomarkers may determine pre-symptomatic AD subjects and enable early intervention. Recently, Multi-task sparse feature learning has been successfully applied to many computer vision and biomedical informatics researches. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as a supervised learning scheme. Its drawback is with either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms. PMID:28943731
An Integrated Ransac and Graph Based Mismatch Elimination Approach for Wide-Baseline Image Matching
NASA Astrophysics Data System (ADS)
Hasheminasab, M.; Ebadi, H.; Sedaghat, A.
2015-12-01
In this paper we propose an integrated approach in order to increase the precision of feature point matching. Many different algorithms have been developed as to optimizing the short-baseline image matching while because of illumination differences and viewpoints changes, wide-baseline image matching is so difficult to handle. Fortunately, the recent developments in the automatic extraction of local invariant features make wide-baseline image matching possible. The matching algorithms which are based on local feature similarity principle, using feature descriptor as to establish correspondence between feature point sets. To date, the most remarkable descriptor is the scale-invariant feature transform (SIFT) descriptor , which is invariant to image rotation and scale, and it remains robust across a substantial range of affine distortion, presence of noise, and changes in illumination. The epipolar constraint based on RANSAC (random sample consensus) method is a conventional model for mismatch elimination, particularly in computer vision. Because only the distance from the epipolar line is considered, there are a few false matches in the selected matching results based on epipolar geometry and RANSAC. Aguilariu et al. proposed Graph Transformation Matching (GTM) algorithm to remove outliers which has some difficulties when the mismatched points surrounded by the same local neighbor structure. In this study to overcome these limitations, which mentioned above, a new three step matching scheme is presented where the SIFT algorithm is used to obtain initial corresponding point sets. In the second step, in order to reduce the outliers, RANSAC algorithm is applied. Finally, to remove the remained mismatches, based on the adjacent K-NN graph, the GTM is implemented. Four different close range image datasets with changes in viewpoint are utilized to evaluate the performance of the proposed method and the experimental results indicate its robustness and capability.
Automated Detection of Epileptic Biomarkers in Resting-State Interictal MEG Data
Soriano, Miguel C.; Niso, Guiomar; Clements, Jillian; Ortín, Silvia; Carrasco, Sira; Gudín, María; Mirasso, Claudio R.; Pereda, Ernesto
2017-01-01
Certain differences between brain networks of healthy and epilectic subjects have been reported even during the interictal activity, in which no epileptic seizures occur. Here, magnetoencephalography (MEG) data recorded in the resting state is used to discriminate between healthy subjects and patients with either idiopathic generalized epilepsy or frontal focal epilepsy. Signal features extracted from interictal periods without any epileptiform activity are used to train a machine learning algorithm to draw a diagnosis. This is potentially relevant to patients without frequent or easily detectable spikes. To analyze the data, we use an up-to-date machine learning algorithm and explore the benefits of including different features obtained from the MEG data as inputs to the algorithm. We find that the relative power spectral density of the MEG time-series is sufficient to distinguish between healthy and epileptic subjects with a high prediction accuracy. We also find that a combination of features such as the phase-locked value and the relative power spectral density allow to discriminate generalized and focal epilepsy, when these features are calculated over a filtered version of the signals in certain frequency bands. Machine learning algorithms are currently being applied to the analysis and classification of brain signals. It is, however, less evident to identify the proper features of these signals that are prone to be used in such machine learning algorithms. Here, we evaluate the influence of the input feature selection on a clinical scenario to distinguish between healthy and epileptic subjects. Our results indicate that such distinction is possible with a high accuracy (86%), allowing the discrimination between idiopathic generalized and frontal focal epilepsy types. PMID:28713260
Boon, K H; Khalil-Hani, M; Malarvili, M B
2018-01-01
This paper presents a method that able to predict the paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals when compared to existing methods, and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinical important because it increases the possibility to electrically stabilize and prevent the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, that consists of the stages of pre-processing, HRV feature extraction, and support vector machine (SVM) model. The pre-processing stage comprises of heart rate correction, interpolation, and signal detrending. After that, time-domain, frequency-domain, non-linear HRV features are extracted from the pre-processed data in feature extraction stage. Then, these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most of the previous works. This accuracy rate is achieved even with the HRV signal length being reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, another significant result is the sensitivity rate, which is considered more important that other performance metrics in this paper, can be improved with the trade-off of lower specificity. Copyright © 2017 Elsevier B.V. All rights reserved.
Face recognition algorithm based on Gabor wavelet and locality preserving projections
NASA Astrophysics Data System (ADS)
Liu, Xiaojie; Shen, Lin; Fan, Honghui
2017-07-01
In order to solve the effects of illumination changes and differences of personal features on the face recognition rate, this paper presents a new face recognition algorithm based on Gabor wavelet and Locality Preserving Projections (LPP). The problem of the Gabor filter banks with high dimensions was solved effectively, and also the shortcoming of the LPP on the light illumination changes was overcome. Firstly, the features of global image information were achieved, which used the good spatial locality and orientation selectivity of Gabor wavelet filters. Then the dimensions were reduced by utilizing the LPP, which well-preserved the local information of the image. The experimental results shown that this algorithm can effectively extract the features relating to facial expressions, attitude and other information. Besides, it can reduce influence of the illumination changes and the differences in personal features effectively, which improves the face recognition rate to 99.2%.
NASA Astrophysics Data System (ADS)
Xu, Lili; Luo, Shuqian
2010-11-01
Microaneurysms (MAs) are the first manifestations of the diabetic retinopathy (DR) as well as an indicator for its progression. Their automatic detection plays a key role for both mass screening and monitoring and is therefore in the core of any system for computer-assisted diagnosis of DR. The algorithm basically comprises the following stages: candidate detection aiming at extracting the patterns possibly corresponding to MAs based on mathematical morphological black top hat, feature extraction to characterize these candidates, and classification based on support vector machine (SVM), to validate MAs. Feature vector and kernel function of SVM selection is very important to the algorithm. We use the receiver operating characteristic (ROC) curve to evaluate the distinguishing performance of different feature vectors and different kernel functions of SVM. The ROC analysis indicates the quadratic polynomial SVM with a combination of features as the input shows the best discriminating performance.
Xu, Lili; Luo, Shuqian
2010-01-01
Microaneurysms (MAs) are the first manifestations of the diabetic retinopathy (DR) as well as an indicator for its progression. Their automatic detection plays a key role for both mass screening and monitoring and is therefore in the core of any system for computer-assisted diagnosis of DR. The algorithm basically comprises the following stages: candidate detection aiming at extracting the patterns possibly corresponding to MAs based on mathematical morphological black top hat, feature extraction to characterize these candidates, and classification based on support vector machine (SVM), to validate MAs. Feature vector and kernel function of SVM selection is very important to the algorithm. We use the receiver operating characteristic (ROC) curve to evaluate the distinguishing performance of different feature vectors and different kernel functions of SVM. The ROC analysis indicates the quadratic polynomial SVM with a combination of features as the input shows the best discriminating performance.
Feature and contrast enhancement of mammographic image based on multiscale analysis and morphology.
Wu, Shibin; Yu, Shaode; Yang, Yuhan; Xie, Yaoqin
2013-01-01
A new algorithm for feature and contrast enhancement of mammographic images is proposed in this paper. The approach bases on multiscale transform and mathematical morphology. First of all, the Laplacian Gaussian pyramid operator is applied to transform the mammography into different scale subband images. In addition, the detail or high frequency subimages are equalized by contrast limited adaptive histogram equalization (CLAHE) and low-pass subimages are processed by mathematical morphology. Finally, the enhanced image of feature and contrast is reconstructed from the Laplacian Gaussian pyramid coefficients modified at one or more levels by contrast limited adaptive histogram equalization and mathematical morphology, respectively. The enhanced image is processed by global nonlinear operator. The experimental results show that the presented algorithm is effective for feature and contrast enhancement of mammogram. The performance evaluation of the proposed algorithm is measured by contrast evaluation criterion for image, signal-noise-ratio (SNR), and contrast improvement index (CII).
Feature and Contrast Enhancement of Mammographic Image Based on Multiscale Analysis and Morphology
Wu, Shibin; Xie, Yaoqin
2013-01-01
A new algorithm for feature and contrast enhancement of mammographic images is proposed in this paper. The approach bases on multiscale transform and mathematical morphology. First of all, the Laplacian Gaussian pyramid operator is applied to transform the mammography into different scale subband images. In addition, the detail or high frequency subimages are equalized by contrast limited adaptive histogram equalization (CLAHE) and low-pass subimages are processed by mathematical morphology. Finally, the enhanced image of feature and contrast is reconstructed from the Laplacian Gaussian pyramid coefficients modified at one or more levels by contrast limited adaptive histogram equalization and mathematical morphology, respectively. The enhanced image is processed by global nonlinear operator. The experimental results show that the presented algorithm is effective for feature and contrast enhancement of mammogram. The performance evaluation of the proposed algorithm is measured by contrast evaluation criterion for image, signal-noise-ratio (SNR), and contrast improvement index (CII). PMID:24416072
Wang, Jie-sheng; Han, Shuang; Shen, Na-na; Li, Shu-xia
2014-01-01
For meeting the forecasting target of key technology indicators in the flotation process, a BP neural network soft-sensor model based on features extraction of flotation froth images and optimized by shuffled cuckoo search algorithm is proposed. Based on the digital image processing technique, the color features in HSI color space, the visual features based on the gray level cooccurrence matrix, and the shape characteristics based on the geometric theory of flotation froth images are extracted, respectively, as the input variables of the proposed soft-sensor model. Then the isometric mapping method is used to reduce the input dimension, the network size, and learning time of BP neural network. Finally, a shuffled cuckoo search algorithm is adopted to optimize the BP neural network soft-sensor model. Simulation results show that the model has better generalization results and prediction accuracy. PMID:25133210
Image fusion algorithm based on energy of Laplacian and PCNN
NASA Astrophysics Data System (ADS)
Li, Meili; Wang, Hongmei; Li, Yanjun; Zhang, Ke
2009-12-01
Owing to the global coupling and pulse synchronization characteristic of pulse coupled neural networks (PCNN), it has been proved to be suitable for image processing and successfully employed in image fusion. However, in almost all the literatures of image processing about PCNN, linking strength of each neuron is assigned the same value which is chosen by experiments. This is not consistent with the human vision system in which the responses to the region with notable features are stronger than that to the region with nonnotable features. It is more reasonable that notable features, rather than the same value, are employed to linking strength of each neuron. As notable feature, energy of Laplacian (EOL) is used to obtain the value of linking strength in PCNN in this paper. Experimental results demonstrate that the proposed algorithm outperforms Laplacian-based, wavelet-based, PCNN -based fusion algorithms.
NASA Astrophysics Data System (ADS)
Wantuch, Andrew C.; Vita, Joshua A.; Jimenez, Edward S.; Bray, Iliana E.
2016-10-01
Despite object detection, recognition, and identification being very active areas of computer vision research, many of the available tools to aid in these processes are designed with only photographs in mind. Although some algorithms used specifically for feature detection and identification may not take explicit advantage of the colors available in the image, they still under-perform on radiographs, which are grayscale images. We are especially interested in the robustness of these algorithms, specifically their performance on a preexisting database of X-ray radiographs in compressed JPEG form, with multiple ways of describing pixel information. We will review various aspects of the performance of available feature detection and identification systems, including MATLABs Computer Vision toolbox, VLFeat, and OpenCV on our non-ideal database. In the process, we will explore possible reasons for the algorithms' lessened ability to detect and identify features from the X-ray radiographs.
Image quality classification for DR screening using deep learning.
FengLi Yu; Jing Sun; Annan Li; Jun Cheng; Cheng Wan; Jiang Liu
2017-07-01
The quality of input images significantly affects the outcome of automated diabetic retinopathy (DR) screening systems. Unlike the previous methods that only consider simple low-level features such as hand-crafted geometric and structural features, in this paper we propose a novel method for retinal image quality classification (IQC) that performs computational algorithms imitating the working of the human visual system. The proposed algorithm combines unsupervised features from saliency map and supervised features coming from convolutional neural networks (CNN), which are fed to an SVM to automatically detect high quality vs poor quality retinal fundus images. We demonstrate the superior performance of our proposed algorithm on a large retinal fundus image dataset and the method could achieve higher accuracy than other methods. Although retinal images are used in this study, the methodology is applicable to the image quality assessment and enhancement of other types of medical images.
NASA Technical Reports Server (NTRS)
Smith, Michael D.; Bandfield, Joshua L.; Christensen, Philip R.
2000-01-01
We present two algorithms for the separation of spectral features caused by atmospheric and surface components in Thermal Emission Spectrometer (TES) data. One algorithm uses radiative transfer and successive least squares fitting to find spectral shapes first for atmospheric dust, then for water-ice aerosols, and then, finally, for surface emissivity. A second independent algorithm uses a combination of factor analysis, target transformation, and deconvolution to simultaneously find dust, water ice, and surface emissivity spectral shapes. Both algorithms have been applied to TES spectra, and both find very similar atmospheric and surface spectral shapes. For TES spectra taken during aerobraking and science phasing periods in nadir-geometry these two algorithms give meaningful and usable surface emissivity spectra that can be used for mineralogical identification.
Detection of unmanned aerial vehicles using a visible camera system.
Hu, Shuowen; Goldman, Geoffrey H; Borel-Donohue, Christoph C
2017-01-20
Unmanned aerial vehicles (UAVs) flown by adversaries are an emerging asymmetric threat to homeland security and the military. To help address this threat, we developed and tested a computationally efficient UAV detection algorithm consisting of horizon finding, motion feature extraction, blob analysis, and coherence analysis. We compare the performance of this algorithm against two variants, one using the difference image intensity as the motion features and another using higher-order moments. The proposed algorithm and its variants are tested using field test data of a group 3 UAV acquired with a panoramic video camera in the visible spectrum. The performance of the algorithms was evaluated using receiver operating characteristic curves. The results show that the proposed approach had the best performance compared to the two algorithmic variants.
Integrating image quality in 2nu-SVM biometric match score fusion.
Vatsa, Mayank; Singh, Richa; Noore, Afzel
2007-10-01
This paper proposes an intelligent 2nu-support vector machine based match score fusion algorithm to improve the performance of face and iris recognition by integrating the quality of images. The proposed algorithm applies redundant discrete wavelet transform to evaluate the underlying linear and non-linear features present in the image. A composite quality score is computed to determine the extent of smoothness, sharpness, noise, and other pertinent features present in each subband of the image. The match score and the corresponding quality score of an image are fused using 2nu-support vector machine to improve the verification performance. The proposed algorithm is experimentally validated using the FERET face database and the CASIA iris database. The verification performance and statistical evaluation show that the proposed algorithm outperforms existing fusion algorithms.
SPITZER IRAC PHOTOMETRY FOR TIME SERIES IN CROWDED FIELDS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Novati, S. Calchi; Beichman, C.; Gould, A.
We develop a new photometry algorithm that is optimized for the Infrared Array Camera (IRAC) Spitzer time series in crowded fields and that is particularly adapted to faint or heavily blended targets. We apply this to the 170 targets from the 2015 Spitzer microlensing campaign and present the results of three variants of this algorithm in an online catalog. We present detailed accounts of the application of this algorithm to two difficult cases, one very faint and the other very crowded. Several of Spitzer's instrumental characteristics that drive the specific features of this algorithm are shared by Kepler and WFIRST,more » implying that these features may prove to be a useful starting point for algorithms designed for microlensing campaigns by these other missions.« less
Adaboost multi-view face detection based on YCgCr skin color model
NASA Astrophysics Data System (ADS)
Lan, Qi; Xu, Zhiyong
2016-09-01
Traditional Adaboost face detection algorithm uses Haar-like features training face classifiers, whose detection error rate is low in the face region. While under the complex background, the classifiers will make wrong detection easily to the background regions with the similar faces gray level distribution, which leads to the error detection rate of traditional Adaboost algorithm is high. As one of the most important features of a face, skin in YCgCr color space has good clustering. We can fast exclude the non-face areas through the skin color model. Therefore, combining with the advantages of the Adaboost algorithm and skin color detection algorithm, this paper proposes Adaboost face detection algorithm method that bases on YCgCr skin color model. Experiments show that, compared with traditional algorithm, the method we proposed has improved significantly in the detection accuracy and errors.
Karayiannis, Nicolaos B; Mukherjee, Amit; Glover, John R; Ktonas, Periklis Y; Frost, James D; Hrachovy, Richard A; Mizrahi, Eli M
2006-04-01
This paper presents an approach to detect epileptic seizure segments in the neonatal electroencephalogram (EEG) by characterizing the spectral features of the EEG waveform using a rule-based algorithm cascaded with a neural network. A rule-based algorithm screens out short segments of pseudosinusoidal EEG patterns as epileptic based on features in the power spectrum. The output of the rule-based algorithm is used to train and compare the performance of conventional feedforward neural networks and quantum neural networks. The results indicate that the trained neural networks, cascaded with the rule-based algorithm, improved the performance of the rule-based algorithm acting by itself. The evaluation of the proposed cascaded scheme for the detection of pseudosinusoidal seizure segments reveals its potential as a building block of the automated seizure detection system under development.
Estimating Position of Mobile Robots From Omnidirectional Vision Using an Adaptive Algorithm.
Li, Luyang; Liu, Yun-Hui; Wang, Kai; Fang, Mu
2015-08-01
This paper presents a novel and simple adaptive algorithm for estimating the position of a mobile robot with high accuracy in an unknown and unstructured environment by fusing images of an omnidirectional vision system with measurements of odometry and inertial sensors. Based on a new derivation where the omnidirectional projection can be linearly parameterized by the positions of the robot and natural feature points, we propose a novel adaptive algorithm, which is similar to the Slotine-Li algorithm in model-based adaptive control, to estimate the robot's position by using the tracked feature points in image sequence, the robot's velocity, and orientation angles measured by odometry and inertial sensors. It is proved that the adaptive algorithm leads to global exponential convergence of the position estimation errors to zero. Simulations and real-world experiments are performed to demonstrate the performance of the proposed algorithm.
Vargas-Rodriguez, Everardo; Guzman-Chavez, Ana Dinora; Baeza-Serrato, Roberto
2018-06-04
In this work, a novel tailored algorithm to enhance the overall sensitivity of gas concentration sensors based on the Direct Absorption Tunable Laser Absorption Spectroscopy (DA-ATLAS) method is presented. By using this algorithm, the sensor sensitivity can be custom-designed to be quasi constant over a much larger dynamic range compared with that obtained by typical methods based on a single statistics feature of the sensor signal output (peak amplitude, area under the curve, mean or RMS). Additionally, it is shown that with our algorithm, an optimal function can be tailored to get a quasi linear relationship between the concentration and some specific statistics features over a wider dynamic range. In order to test the viability of our algorithm, a basic C 2 H 2 sensor based on DA-ATLAS was implemented, and its experimental measurements support the simulated results provided by our algorithm.
Fast Solution in Sparse LDA for Binary Classification
NASA Technical Reports Server (NTRS)
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable- selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of its combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add/ delete to/ from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g. 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g. selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (fullrank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together this enables the use of highly-efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
Signature Verification Using N-tuple Learning Machine.
Maneechot, Thanin; Kitjaidure, Yuttana
2005-01-01
This research presents new algorithm for signature verification using N-tuple learning machine. The features are taken from handwritten signature on Digital Tablet (On-line). This research develops recognition algorithm using four features extraction, namely horizontal and vertical pen tip position(x-y position), pen tip pressure, and pen altitude angles. Verification uses N-tuple technique with Gaussian thresholding.
Learning in Structured Connectionist Networks
1988-04-01
the structure is too rigid and learning too difficult for cognitive modeling. Two algorithms for learning simple, feature-based concept descriptions...and learning too difficult for cognitive model- ing. Two algorithms for learning simple, feature-based concept descriptions were also implemented. The...Term Goals Recent progress in connectionist research has been encouraging; networks have success- fully modeled human performance for various cognitive
NASA Astrophysics Data System (ADS)
Muckenhuber, Stefan; Sandven, Stein
2017-04-01
An open-source sea ice drift algorithm for Sentinel-1 SAR imagery is introduced based on the combination of feature-tracking and pattern-matching. A computational efficient feature-tracking algorithm produces an initial drift estimate and limits the search area for the pattern-matching, that provides small to medium scale drift adjustments and normalised cross correlation values as quality measure. The algorithm is designed to utilise the respective advantages of the two approaches and allows drift calculation at user defined locations. The pre-processing of the Sentinel-1 data has been optimised to retrieve a feature distribution that depends less on SAR backscatter peak values. A recommended parameter set for the algorithm has been found using a representative image pair over Fram Strait and 350 manually derived drift vectors as validation. Applying the algorithm with this parameter setting, sea ice drift retrieval with a vector spacing of 8 km on Sentinel-1 images covering 400 km x 400 km, takes less than 3.5 minutes on a standard 2.7 GHz processor with 8 GB memory. For validation, buoy GPS data, collected in 2015 between 15th January and 22nd April and covering an area from 81° N to 83.5° N and 12° E to 27° E, have been compared to calculated drift results from 261 corresponding Sentinel-1 image pairs. We found a logarithmic distribution of the error with a peak at 300 m. All software requirements necessary for applying the presented sea ice drift algorithm are open-source to ensure free implementation and easy distribution.
NASA Astrophysics Data System (ADS)
Kurugol, Sila; Dy, Jennifer G.; Rajadhyaksha, Milind; Gossage, Kirk W.; Weissmann, Jesse; Brooks, Dana H.
2011-03-01
The examination of the dermis/epidermis junction (DEJ) is clinically important for skin cancer diagnosis. Reflectance confocal microscopy (RCM) is an emerging tool for detection of skin cancers in vivo. However, visual localization of the DEJ in RCM images, with high accuracy and repeatability, is challenging, especially in fair skin, due to low contrast, heterogeneous structure and high inter- and intra-subject variability. We recently proposed a semi-automated algorithm to localize the DEJ in z-stacks of RCM images of fair skin, based on feature segmentation and classification. Here we extend the algorithm to dark skin. The extended algorithm first decides the skin type and then applies the appropriate DEJ localization method. In dark skin, strong backscatter from the pigment melanin causes the basal cells above the DEJ to appear with high contrast. To locate those high contrast regions, the algorithm operates on small tiles (regions) and finds the peaks of the smoothed average intensity depth profile of each tile. However, for some tiles, due to heterogeneity, multiple peaks in the depth profile exist and the strongest peak might not be the basal layer peak. To select the correct peak, basal cells are represented with a vector of texture features. The peak with most similar features to this feature vector is selected. The results show that the algorithm detected the skin types correctly for all 17 stacks tested (8 fair, 9 dark). The DEJ detection algorithm achieved an average distance from the ground truth DEJ surface of around 4.7μm for dark skin and around 7-14μm for fair skin.
A novel image retrieval algorithm based on PHOG and LSH
NASA Astrophysics Data System (ADS)
Wu, Hongliang; Wu, Weimin; Peng, Jiajin; Zhang, Junyuan
2017-08-01
PHOG can describe the local shape of the image and its relationship between the spaces. The using of PHOG algorithm to extract image features in image recognition and retrieval and other aspects have achieved good results. In recent years, locality sensitive hashing (LSH) algorithm has been superior to large-scale data in solving near-nearest neighbor problems compared with traditional algorithms. This paper presents a novel image retrieval algorithm based on PHOG and LSH. First, we use PHOG to extract the feature vector of the image, then use L different LSH hash table to reduce the dimension of PHOG texture to index values and map to different bucket, and finally extract the corresponding value of the image in the bucket for second image retrieval using Manhattan distance. This algorithm can adapt to the massive image retrieval, which ensures the high accuracy of the image retrieval and reduces the time complexity of the retrieval. This algorithm is of great significance.
Evaluation of security algorithms used for security processing on DICOM images
NASA Astrophysics Data System (ADS)
Chen, Xiaomeng; Shuai, Jie; Zhang, Jianguo; Huang, H. K.
2005-04-01
In this paper, we developed security approach to provide security measures and features in PACS image acquisition and Tele-radiology image transmission. The security processing on medical images was based on public key infrastructure (PKI) and including digital signature and data encryption to achieve the security features of confidentiality, privacy, authenticity, integrity, and non-repudiation. There are many algorithms which can be used in PKI for data encryption and digital signature. In this research, we select several algorithms to perform security processing on different DICOM images in PACS environment, evaluate the security processing performance of these algorithms, and find the relationship between performance with image types, sizes and the implementation methods.
Document localization algorithms based on feature points and straight lines
NASA Astrophysics Data System (ADS)
Skoryukina, Natalya; Shemiakina, Julia; Arlazarov, Vladimir L.; Faradjev, Igor
2018-04-01
The important part of the system of a planar rectangular object analysis is the localization: the estimation of projective transform from template image of an object to its photograph. The system also includes such subsystems as the selection and recognition of text fields, the usage of contexts etc. In this paper three localization algorithms are described. All algorithms use feature points and two of them also analyze near-horizontal and near- vertical lines on the photograph. The algorithms and their combinations are tested on a dataset of real document photographs. Also the method of localization quality estimation is proposed that allows configuring the localization subsystem independently of the other subsystems quality.
Ye, Fei; Lou, Xin Yuan; Sun, Lin Fu
2017-01-01
This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter setting turning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in a SVM to perform both parameter setting turning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem.
A systematic investigation of computation models for predicting Adverse Drug Reactions (ADRs).
Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong
2014-01-01
Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been earlier used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, final formulas of these algorithms were all converted to linear model in form, based on this finding we propose a new algorithm called the general weighted profile method and it yielded the best overall performance among the algorithms investigated in this paper. Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms.
An improved ASIFT algorithm for indoor panorama image matching
NASA Astrophysics Data System (ADS)
Fu, Han; Xie, Donghai; Zhong, Ruofei; Wu, Yu; Wu, Qiong
2017-07-01
The generation of 3D models for indoor objects and scenes is an attractive tool for digital city, virtual reality and SLAM purposes. Panoramic images are becoming increasingly more common in such applications due to their advantages to capture the complete environment in one single image with large field of view. The extraction and matching of image feature points are important and difficult steps in three-dimensional reconstruction, and ASIFT is a state-of-the-art algorithm to implement these functions. Compared with the SIFT algorithm, more feature points can be generated and the matching accuracy of ASIFT algorithm is higher, even for the panoramic images with obvious distortions. However, the algorithm is really time-consuming because of complex operations and performs not very well for some indoor scenes under poor light or without rich textures. To solve this problem, this paper proposes an improved ASIFT algorithm for indoor panoramic images: firstly, the panoramic images are projected into multiple normal perspective images. Secondly, the original ASIFT algorithm is simplified from the affine transformation of tilt and rotation with the images to the only tilt affine transformation. Finally, the results are re-projected to the panoramic image space. Experiments in different environments show that this method can not only ensure the precision of feature points extraction and matching, but also greatly reduce the computing time.
An extension of the QZ algorithm for solving the generalized matrix eigenvalue problem
NASA Technical Reports Server (NTRS)
Ward, R. C.
1973-01-01
This algorithm is an extension of Moler and Stewart's QZ algorithm with some added features for saving time and operations. Also, some additional properties of the QR algorithm which were not practical to implement in the QZ algorithm can be generalized with the combination shift QZ algorithm. Numerous test cases are presented to give practical application tests for algorithm. Based on results, this algorithm should be preferred over existing algorithms which attempt to solve the class of generalized eigenproblems where both matrices are singular or nearly singular.
A universal deep learning approach for modeling the flow of patients under different severities.
Jiang, Shancheng; Chin, Kwai-Sang; Tsui, Kwok L
2018-02-01
The Accident and Emergency Department (A&ED) is the frontline for providing emergency care in hospitals. Unfortunately, relative A&ED resources have failed to keep up with continuously increasing demand in recent years, which leads to overcrowding in A&ED. Knowing the fluctuation of patient arrival volume in advance is a significant premise to relieve this pressure. Based on this motivation, the objective of this study is to explore an integrated framework with high accuracy for predicting A&ED patient flow under different triage levels, by combining a novel feature selection process with deep neural networks. Administrative data is collected from an actual A&ED and categorized into five groups based on different triage levels. A genetic algorithm (GA)-based feature selection algorithm is improved and implemented as a pre-processing step for this time-series prediction problem, in order to explore key features affecting patient flow. In our improved GA, a fitness-based crossover is proposed to maintain the joint information of multiple features during iterative process, instead of traditional point-based crossover. Deep neural networks (DNN) is employed as the prediction model to utilize their universal adaptability and high flexibility. In the model-training process, the learning algorithm is well-configured based on a parallel stochastic gradient descent algorithm. Two effective regularization strategies are integrated in one DNN framework to avoid overfitting. All introduced hyper-parameters are optimized efficiently by grid-search in one pass. As for feature selection, our improved GA-based feature selection algorithm has outperformed a typical GA and four state-of-the-art feature selection algorithms (mRMR, SAFS, VIFR, and CFR). As for the prediction accuracy of proposed integrated framework, compared with other frequently used statistical models (GLM, seasonal-ARIMA, ARIMAX, and ANN) and modern machine models (SVM-RBF, SVM-linear, RF, and R-LASSO), the proposed integrated "DNN-I-GA" framework achieves higher prediction accuracy on both MAPE and RMSE metrics in pairwise comparisons. The contribution of our study is two-fold. Theoretically, the traditional GA-based feature selection process is improved to have less hyper-parameters and higher efficiency, and the joint information of multiple features is maintained by fitness-based crossover operator. The universal property of DNN is further enhanced by merging different regularization strategies. Practically, features selected by our improved GA can be used to acquire an underlying relationship between patient flows and input features. Predictive values are significant indicators of patients' demand and can be used by A&ED managers to make resource planning and allocation. High accuracy achieved by the present framework in different cases enhances the reliability of downstream decision makings. Copyright © 2017 Elsevier B.V. All rights reserved.
Remote sensing image denoising application by generalized morphological component analysis
NASA Astrophysics Data System (ADS)
Yu, Chong; Chen, Xiong
2014-12-01
In this paper, we introduced a remote sensing image denoising method based on generalized morphological component analysis (GMCA). This novel algorithm is the further extension of morphological component analysis (MCA) algorithm to the blind source separation framework. The iterative thresholding strategy adopted by GMCA algorithm firstly works on the most significant features in the image, and then progressively incorporates smaller features to finely tune the parameters of whole model. Mathematical analysis of the computational complexity of GMCA algorithm is provided. Several comparison experiments with state-of-the-art denoising algorithms are reported. In order to make quantitative assessment of algorithms in experiments, Peak Signal to Noise Ratio (PSNR) index and Structural Similarity (SSIM) index are calculated to assess the denoising effect from the gray-level fidelity aspect and the structure-level fidelity aspect, respectively. Quantitative analysis on experiment results, which is consistent with the visual effect illustrated by denoised images, has proven that the introduced GMCA algorithm possesses a marvelous remote sensing image denoising effectiveness and ability. It is even hard to distinguish the original noiseless image from the recovered image by adopting GMCA algorithm through visual effect.
NASA Technical Reports Server (NTRS)
Wang, Xu; Shi, Fang; Sigrist, Norbert; Seo, Byoung-Joon; Tang, Hong; Bikkannavar, Siddarayappa; Basinger, Scott; Lay, Oliver
2012-01-01
Large aperture telescope commonly features segment mirrors and a coarse phasing step is needed to bring these individual segments into the fine phasing capture range. Dispersed Fringe Sensing (DFS) is a powerful coarse phasing technique and its alteration is currently being used for JWST.An Advanced Dispersed Fringe Sensing (ADFS) algorithm is recently developed to improve the performance and robustness of previous DFS algorithms with better accuracy and unique solution. The first part of the paper introduces the basic ideas and the essential features of the ADFS algorithm and presents the some algorithm sensitivity study results. The second part of the paper describes the full details of algorithm validation process through the advanced wavefront sensing and correction testbed (AWCT): first, the optimization of the DFS hardware of AWCT to ensure the data accuracy and reliability is illustrated. Then, a few carefully designed algorithm validation experiments are implemented, and the corresponding data analysis results are shown. Finally the fiducial calibration using Range-Gate-Metrology technique is carried out and a <10nm or <1% algorithm accuracy is demonstrated.
Pose estimation for augmented reality applications using genetic algorithm.
Yu, Ying Kin; Wong, Kin Hong; Chang, Michael Ming Yuen
2005-12-01
This paper describes a genetic algorithm that tackles the pose-estimation problem in computer vision. Our genetic algorithm can find the rotation and translation of an object accurately when the three-dimensional structure of the object is given. In our implementation, each chromosome encodes both the pose and the indexes to the selected point features of the object. Instead of only searching for the pose as in the existing work, our algorithm, at the same time, searches for a set containing the most reliable feature points in the process. This mismatch filtering strategy successfully makes the algorithm more robust under the presence of point mismatches and outliers in the images. Our algorithm has been tested with both synthetic and real data with good results. The accuracy of the recovered pose is compared to the existing algorithms. Our approach outperformed the Lowe's method and the other two genetic algorithms under the presence of point mismatches and outliers. In addition, it has been used to estimate the pose of a real object. It is shown that the proposed method is applicable to augmented reality applications.
Flexible methods for segmentation evaluation: results from CT-based luggage screening.
Karimi, Seemeen; Jiang, Xiaoqian; Cosman, Pamela; Martz, Harry
2014-01-01
Imaging systems used in aviation security include segmentation algorithms in an automatic threat recognition pipeline. The segmentation algorithms evolve in response to emerging threats and changing performance requirements. Analysis of segmentation algorithms' behavior, including the nature of errors and feature recovery, facilitates their development. However, evaluation methods from the literature provide limited characterization of the segmentation algorithms. To develop segmentation evaluation methods that measure systematic errors such as oversegmentation and undersegmentation, outliers, and overall errors. The methods must measure feature recovery and allow us to prioritize segments. We developed two complementary evaluation methods using statistical techniques and information theory. We also created a semi-automatic method to define ground truth from 3D images. We applied our methods to evaluate five segmentation algorithms developed for CT luggage screening. We validated our methods with synthetic problems and an observer evaluation. Both methods selected the same best segmentation algorithm. Human evaluation confirmed the findings. The measurement of systematic errors and prioritization helped in understanding the behavior of each segmentation algorithm. Our evaluation methods allow us to measure and explain the accuracy of segmentation algorithms.
Kherfi, Mohammed Lamine; Ziou, Djemel
2006-04-01
In content-based image retrieval, understanding the user's needs is a challenging task that requires integrating him in the process of retrieval. Relevance feedback (RF) has proven to be an effective tool for taking the user's judgement into account. In this paper, we present a new RF framework based on a feature selection algorithm that nicely combines the advantages of a probabilistic formulation with those of using both the positive example (PE) and the negative example (NE). Through interaction with the user, our algorithm learns the importance he assigns to image features, and then applies the results obtained to define similarity measures that correspond better to his judgement. The use of the NE allows images undesired by the user to be discarded, thereby improving retrieval accuracy. As for the probabilistic formulation of the problem, it presents a multitude of advantages and opens the door to more modeling possibilities that achieve a good feature selection. It makes it possible to cluster the query data into classes, choose the probability law that best models each class, model missing data, and support queries with multiple PE and/or NE classes. The basic principle of our algorithm is to assign more importance to features with a high likelihood and those which distinguish well between PE classes and NE classes. The proposed algorithm was validated separately and in image retrieval context, and the experiments show that it performs a good feature selection and contributes to improving retrieval effectiveness.
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement
Hao, Yansong; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-01-01
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency. PMID:29597280
A Sparsity-Promoted Method Based on Majorization-Minimization for Weak Fault Feature Enhancement.
Ren, Bangyue; Hao, Yansong; Wang, Huaqing; Song, Liuyang; Tang, Gang; Yuan, Hongfang
2018-03-28
Fault transient impulses induced by faulty components in rotating machinery usually contain substantial interference. Fault features are comparatively weak in the initial fault stage, which renders fault diagnosis more difficult. In this case, a sparse representation method based on the Majorzation-Minimization (MM) algorithm is proposed to enhance weak fault features and extract the features from strong background noise. However, the traditional MM algorithm suffers from two issues, which are the choice of sparse basis and complicated calculations. To address these challenges, a modified MM algorithm is proposed in which a sparse optimization objective function is designed firstly. Inspired by the Basis Pursuit (BP) model, the optimization function integrates an impulsive feature-preserving factor and a penalty function factor. Second, a modified Majorization iterative method is applied to address the convex optimization problem of the designed function. A series of sparse coefficients can be achieved through iterating, which only contain transient components. It is noteworthy that there is no need to select the sparse basis in the proposed iterative method because it is fixed as a unit matrix. Then the reconstruction step is omitted, which can significantly increase detection efficiency. Eventually, envelope analysis of the sparse coefficients is performed to extract weak fault features. Simulated and experimental signals including bearings and gearboxes are employed to validate the effectiveness of the proposed method. In addition, comparisons are made to prove that the proposed method outperforms the traditional MM algorithm in terms of detection results and efficiency.
Chaotic Particle Swarm Optimization with Mutation for Classification
Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza
2015-01-01
In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allows us to overcome the problem of early convergence into a local minima associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and it tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is checked out with three sets of data classifications namely, Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, particle swarm-classifier, genetic algorithm, and Imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms. PMID:25709937
Crater Identification Algorithm for the Lost in Low Lunar Orbit Scenario
NASA Technical Reports Server (NTRS)
Hanak, Chad; Crain, TImothy
2010-01-01
Recent emphasis by NASA on returning astronauts to the Moon has placed attention on the subject of lunar surface feature tracking. Although many algorithms have been proposed for lunar surface feature tracking navigation, much less attention has been paid to the issue of navigational state initialization from lunar craters in a lost in low lunar orbit (LLO) scenario. That is, a scenario in which lunar surface feature tracking must begin, but current navigation state knowledge is either unavailable or too poor to initiate a tracking algorithm. The situation is analogous to the lost in space scenario for star trackers. A new crater identification algorithm is developed herein that allows for navigation state initialization from as few as one image of the lunar surface with no a priori state knowledge. The algorithm takes as inputs the locations and diameters of craters that have been detected in an image, and uses the information to match the craters to entries in the USGS lunar crater catalog via non-dimensional crater triangle parameters. Due to the large number of uncataloged craters that exist on the lunar surface, a probability-based check was developed to reject false identifications. The algorithm was tested on craters detected in four revolutions of Apollo 16 LLO images, and shown to perform well.
NASA Astrophysics Data System (ADS)
Hortos, William S.
2008-04-01
Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at participating nodes. Therefore, the feature-extraction method based on the Haar DWT is presented that employs a maximum-entropy measure to determine significant wavelet coefficients. Features are formed by calculating the energy of coefficients grouped around the competing clusters. A DWT-based feature extraction algorithm used for vehicle classification in WSNs can be enhanced by an added rule for selecting the optimal number of resolution levels to improve the correct classification rate and reduce energy consumption expended in local algorithm computations. Published field trial data for vehicular ground targets, measured with multiple sensor types, are used to evaluate the wavelet-assisted algorithms. Extracted features are used in established target recognition routines, e.g., the Bayesian minimum-error-rate classifier, to compare the effects on the classification performance of the wavelet compression. Simulations of feature sets and recognition routines at different resolution levels in target scenarios indicate the impact on classification rates, while formulas are provided to estimate reduction in resource use due to distributed compression.
DOE Office of Scientific and Technical Information (OSTI.GOV)
This algorithm processes high-rate 3-phase signals to identify the start time of each signal and estimate its envelope as data features. The start time and magnitude of each signal during the steady state is also extracted. The features can be used to detect abnormal signals. This algorithm is developed to analyze Exxeno's 3-phase voltage and current data recorded from refrigeration systems to detect device failure or degradation.
Kernel-based discriminant feature extraction using a representative dataset
NASA Astrophysics Data System (ADS)
Li, Honglin; Sancho Gomez, Jose-Luis; Ahalt, Stanley C.
2002-07-01
Discriminant Feature Extraction (DFE) is widely recognized as an important pre-processing step in classification applications. Most DFE algorithms are linear and thus can only explore the linear discriminant information among the different classes. Recently, there has been several promising attempts to develop nonlinear DFE algorithms, among which is Kernel-based Feature Extraction (KFE). The efficacy of KFE has been experimentally verified by both synthetic data and real problems. However, KFE has some known limitations. First, KFE does not work well for strongly overlapped data. Second, KFE employs all of the training set samples during the feature extraction phase, which can result in significant computation when applied to very large datasets. Finally, KFE can result in overfitting. In this paper, we propose a substantial improvement to KFE that overcomes the above limitations by using a representative dataset, which consists of critical points that are generated from data-editing techniques and centroid points that are determined by using the Frequency Sensitive Competitive Learning (FSCL) algorithm. Experiments show that this new KFE algorithm performs well on significantly overlapped datasets, and it also reduces computational complexity. Further, by controlling the number of centroids, the overfitting problem can be effectively alleviated.
Automated detection of diabetic retinopathy on digital fundus images.
Sinthanayothin, C; Boyce, J F; Williamson, T H; Cook, H L; Mensah, E; Lal, S; Usher, D
2002-02-01
The aim was to develop an automated screening system to analyse digital colour retinal images for important features of non-proliferative diabetic retinopathy (NPDR). High performance pre-processing of the colour images was performed. Previously described automated image analysis systems were used to detect major landmarks of the retinal image (optic disc, blood vessels and fovea). Recursive region growing segmentation algorithms combined with the use of a new technique, termed a 'Moat Operator', were used to automatically detect features of NPDR. These features included haemorrhages and microaneurysms (HMA), which were treated as one group, and hard exudates as another group. Sensitivity and specificity data were calculated by comparison with an experienced fundoscopist. The algorithm for exudate recognition was applied to 30 retinal images of which 21 contained exudates and nine were without pathology. The sensitivity and specificity for exudate detection were 88.5% and 99.7%, respectively, when compared with the ophthalmologist. HMA were present in 14 retinal images. The algorithm achieved a sensitivity of 77.5% and specificity of 88.7% for detection of HMA. Fully automated computer algorithms were able to detect hard exudates and HMA. This paper presents encouraging results in automatic identification of important features of NPDR.
Robust spike classification based on frequency domain neural waveform features.
Yang, Chenhui; Yuan, Yuan; Si, Jennie
2013-12-01
We introduce a new spike classification algorithm based on frequency domain features of the spike snippets. The goal for the algorithm is to provide high classification accuracy, low false misclassification, ease of implementation, robustness to signal degradation, and objectivity in classification outcomes. In this paper, we propose a spike classification algorithm based on frequency domain features (CFDF). It makes use of frequency domain contents of the recorded neural waveforms for spike classification. The self-organizing map (SOM) is used as a tool to determine the cluster number intuitively and directly by viewing the SOM output map. After that, spike classification can be easily performed using clustering algorithms such as the k-Means. In conjunction with our previously developed multiscale correlation of wavelet coefficient (MCWC) spike detection algorithm, we show that the MCWC and CFDF detection and classification system is robust when tested on several sets of artificial and real neural waveforms. The CFDF is comparable to or outperforms some popular automatic spike classification algorithms with artificial and real neural data. The detection and classification of neural action potentials or neural spikes is an important step in single-unit-based neuroscientific studies and applications. After the detection of neural snippets potentially containing neural spikes, a robust classification algorithm is applied for the analysis of the snippets to (1) extract similar waveforms into one class for them to be considered coming from one unit, and to (2) remove noise snippets if they do not contain any features of an action potential. Usually, a snippet is a small 2 or 3 ms segment of the recorded waveform, and differences in neural action potentials can be subtle from one unit to another. Therefore, a robust, high performance classification system like the CFDF is necessary. In addition, the proposed algorithm does not require any assumptions on statistical properties of the noise and proves to be robust under noise contamination.
Shilov, Ignat V; Seymour, Sean L; Patel, Alpesh A; Loboda, Alex; Tang, Wilfred H; Keating, Sean P; Hunter, Christie L; Nuwaysir, Lydia M; Schaeffer, Daniel A
2007-09-01
The Paragon Algorithm, a novel database search engine for the identification of peptides from tandem mass spectrometry data, is presented. Sequence Temperature Values are computed using a sequence tag algorithm, allowing the degree of implication by an MS/MS spectrum of each region of a database to be determined on a continuum. Counter to conventional approaches, features such as modifications, substitutions, and cleavage events are modeled with probabilities rather than by discrete user-controlled settings to consider or not consider a feature. The use of feature probabilities in conjunction with Sequence Temperature Values allows for a very large increase in the effective search space with only a very small increase in the actual number of hypotheses that must be scored. The algorithm has a new kind of user interface that removes the user expertise requirement, presenting control settings in the language of the laboratory that are translated to optimal algorithmic settings. To validate this new algorithm, a comparison with Mascot is presented for a series of analogous searches to explore the relative impact of increasing search space probed with Mascot by relaxing the tryptic digestion conformance requirements from trypsin to semitrypsin to no enzyme and with the Paragon Algorithm using its Rapid mode and Thorough mode with and without tryptic specificity. Although they performed similarly for small search space, dramatic differences were observed in large search space. With the Paragon Algorithm, hundreds of biological and artifact modifications, all possible substitutions, and all levels of conformance to the expected digestion pattern can be searched in a single search step, yet the typical cost in search time is only 2-5 times that of conventional small search space. Despite this large increase in effective search space, there is no drastic loss of discrimination that typically accompanies the exploration of large search space.
Blessy, S A Praylin Selva; Sulochana, C Helen
2015-01-01
Segmentation of brain tumor from Magnetic Resonance Imaging (MRI) becomes very complicated due to the structural complexities of human brain and the presence of intensity inhomogeneities. To propose a method that effectively segments brain tumor from MR images and to evaluate the performance of unsupervised optimal fuzzy clustering (UOFC) algorithm for segmentation of brain tumor from MR images. Segmentation is done by preprocessing the MR image to standardize intensity inhomogeneities followed by feature extraction, feature fusion and clustering. Different validation measures are used to evaluate the performance of the proposed method using different clustering algorithms. The proposed method using UOFC algorithm produces high sensitivity (96%) and low specificity (4%) compared to other clustering methods. Validation results clearly show that the proposed method with UOFC algorithm effectively segments brain tumor from MR images.
Research on aviation unsafe incidents classification with improved TF-IDF algorithm
NASA Astrophysics Data System (ADS)
Wang, Yanhua; Zhang, Zhiyuan; Huo, Weigang
2016-05-01
The text content of Aviation Safety Confidential Reports contains a large number of valuable information. Term frequency-inverse document frequency algorithm is commonly used in text analysis, but it does not take into account the sequential relationship of the words in the text and its role in semantic expression. According to the seven category labels of civil aviation unsafe incidents, aiming at solving the problems of TF-IDF algorithm, this paper improved TF-IDF algorithm based on co-occurrence network; established feature words extraction and words sequential relations for classified incidents. Aviation domain lexicon was used to improve the accuracy rate of classification. Feature words network model was designed for multi-documents unsafe incidents classification, and it was used in the experiment. Finally, the classification accuracy of improved algorithm was verified by the experiments.
NASA Astrophysics Data System (ADS)
Janaki Sathya, D.; Geetha, K.
2017-12-01
Automatic mass or lesion classification systems are developed to aid in distinguishing between malignant and benign lesions present in the breast DCE-MR images, the systems need to improve both the sensitivity and specificity of DCE-MR image interpretation in order to be successful for clinical use. A new classifier (a set of features together with a classification method) based on artificial neural networks trained using artificial fish swarm optimization (AFSO) algorithm is proposed in this paper. The basic idea behind the proposed classifier is to use AFSO algorithm for searching the best combination of synaptic weights for the neural network. An optimal set of features based on the statistical textural features is presented. The investigational outcomes of the proposed suspicious lesion classifier algorithm therefore confirm that the resulting classifier performs better than other such classifiers reported in the literature. Therefore this classifier demonstrates that the improvement in both the sensitivity and specificity are possible through automated image analysis.
Using learning automata to determine proper subset size in high-dimensional spaces
NASA Astrophysics Data System (ADS)
Seyyedi, Seyyed Hossein; Minaei-Bidgoli, Behrouz
2017-03-01
In this paper, we offer a new method called FSLA (Finding the best candidate Subset using Learning Automata), which combines the filter and wrapper approaches for feature selection in high-dimensional spaces. Considering the difficulties of dimension reduction in high-dimensional spaces, FSLA's multi-objective functionality is to determine, in an efficient manner, a feature subset that leads to an appropriate tradeoff between the learning algorithm's accuracy and efficiency. First, using an existing weighting function, the feature list is sorted and selected subsets of the list of different sizes are considered. Then, a learning automaton verifies the performance of each subset when it is used as the input space of the learning algorithm and estimates its fitness upon the algorithm's accuracy and the subset size, which determines the algorithm's efficiency. Finally, FSLA introduces the fittest subset as the best choice. We tested FSLA in the framework of text classification. The results confirm its promising performance of attaining the identified goal.
Unsupervised spike sorting based on discriminative subspace learning.
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-01-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. In this paper, we present two unsupervised spike sorting algorithms based on discriminative subspace learning. The first algorithm simultaneously learns the discriminative feature subspace and performs clustering. It uses histogram of features in the most discriminative projection to detect the number of neurons. The second algorithm performs hierarchical divisive clustering that learns a discriminative 1-dimensional subspace for clustering in each level of the hierarchy until achieving almost unimodal distribution in the subspace. The algorithms are tested on synthetic and in-vivo data, and are compared against two widely used spike sorting methods. The comparative results demonstrate that our spike sorting methods can achieve substantially higher accuracy in lower dimensional feature space, and they are highly robust to noise. Moreover, they provide significantly better cluster separability in the learned subspace than in the subspace obtained by principal component analysis or wavelet transform.
Sinha, S K; Karray, F
2002-01-01
Pipeline surface defects such as holes and cracks cause major problems for utility managers, particularly when the pipeline is buried under the ground. Manual inspection for surface defects in the pipeline has a number of drawbacks, including subjectivity, varying standards, and high costs. Automatic inspection system using image processing and artificial intelligence techniques can overcome many of these disadvantages and offer utility managers an opportunity to significantly improve quality and reduce costs. A recognition and classification of pipe cracks using images analysis and neuro-fuzzy algorithm is proposed. In the preprocessing step the scanned images of pipe are analyzed and crack features are extracted. In the classification step the neuro-fuzzy algorithm is developed that employs a fuzzy membership function and error backpropagation algorithm. The idea behind the proposed approach is that the fuzzy membership function will absorb variation of feature values and the backpropagation network, with its learning ability, will show good classification efficiency.
Automatic Correction Algorithm of Hyfrology Feature Attribute in National Geographic Census
NASA Astrophysics Data System (ADS)
Li, C.; Guo, P.; Liu, X.
2017-09-01
A subset of the attributes of hydrologic features data in national geographic census are not clear, the current solution to this problem was through manual filling which is inefficient and liable to mistakes. So this paper proposes an automatic correction algorithm of hydrologic features attribute. Based on the analysis of the structure characteristics and topological relation, we put forward three basic principles of correction which include network proximity, structure robustness and topology ductility. Based on the WJ-III map workstation, we realize the automatic correction of hydrologic features. Finally, practical data is used to validate the method. The results show that our method is highly reasonable and efficient.
Parametric classification of handvein patterns based on texture features
NASA Astrophysics Data System (ADS)
Al Mahafzah, Harbi; Imran, Mohammad; Supreetha Gowda H., D.
2018-04-01
In this paper, we have developed Biometric recognition system adopting hand based modality Handvein,which has the unique pattern for each individual and it is impossible to counterfeit and fabricate as it is an internal feature. We have opted in choosing feature extraction algorithms such as LBP-visual descriptor, LPQ-blur insensitive texture operator, Log-Gabor-Texture descriptor. We have chosen well known classifiers such as KNN and SVM for classification. We have experimented and tabulated results of single algorithm recognition rate for Handvein under different distance measures and kernel options. The feature level fusion is carried out which increased the performance level.
NASA Astrophysics Data System (ADS)
Li, Jing; Xie, Weixin; Pei, Jihong
2018-03-01
Sea-land segmentation is one of the key technologies of sea target detection in remote sensing images. At present, the existing algorithms have the problems of low accuracy, low universality and poor automatic performance. This paper puts forward a sea-land segmentation algorithm based on multi-feature fusion for a large-field remote sensing image removing island. Firstly, the coastline data is extracted and all of land area is labeled by using the geographic information in large-field remote sensing image. Secondly, three features (local entropy, local texture and local gradient mean) is extracted in the sea-land border area, and the three features combine a 3D feature vector. And then the MultiGaussian model is adopted to describe 3D feature vectors of sea background in the edge of the coastline. Based on this multi-gaussian sea background model, the sea pixels and land pixels near coastline are classified more precise. Finally, the coarse segmentation result and the fine segmentation result are fused to obtain the accurate sea-land segmentation. Comparing and analyzing the experimental results by subjective vision, it shows that the proposed method has high segmentation accuracy, wide applicability and strong anti-disturbance ability.
Jiang, Feng; Han, Ji-zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods. PMID:29623088
Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong
2018-01-01
Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
Identification of informative features for predicting proinflammatory potentials of engine exhausts.
Wang, Chia-Chi; Lin, Ying-Chi; Lin, Yuan-Chung; Jhang, Syu-Ruei; Tung, Chun-Wei
2017-08-18
The immunotoxicity of engine exhausts is of high concern to human health due to the increasing prevalence of immune-related diseases. However, the evaluation of immunotoxicity of engine exhausts is currently based on expensive and time-consuming experiments. It is desirable to develop efficient methods for immunotoxicity assessment. To accelerate the development of safe alternative fuels, this study proposed a computational method for identifying informative features for predicting proinflammatory potentials of engine exhausts. A principal component regression (PCR) algorithm was applied to develop prediction models. The informative features were identified by a sequential backward feature elimination (SBFE) algorithm. A total of 19 informative chemical and biological features were successfully identified by SBFE algorithm. The informative features were utilized to develop a computational method named FS-CBM for predicting proinflammatory potentials of engine exhausts. FS-CBM model achieved a high performance with correlation coefficient values of 0.997 and 0.943 obtained from training and independent test sets, respectively. The FS-CBM model was developed for predicting proinflammatory potentials of engine exhausts with a large improvement on prediction performance compared with our previous CBM model. The proposed method could be further applied to construct models for bioactivities of mixtures.
A novel framework for feature extraction in multi-sensor action potential sorting.
Wu, Shun-Chi; Swindlehurst, A Lee; Nenadic, Zoran
2015-09-30
Extracellular recordings of multi-unit neural activity have become indispensable in neuroscience research. The analysis of the recordings begins with the detection of the action potentials (APs), followed by a classification step where each AP is associated with a given neural source. A feature extraction step is required prior to classification in order to reduce the dimensionality of the data and the impact of noise, allowing source clustering algorithms to work more efficiently. In this paper, we propose a novel framework for multi-sensor AP feature extraction based on the so-called Matched Subspace Detector (MSD), which is shown to be a natural generalization of standard single-sensor algorithms. Clustering using both simulated data and real AP recordings taken in the locust antennal lobe demonstrates that the proposed approach yields features that are discriminatory and lead to promising results. Unlike existing methods, the proposed algorithm finds joint spatio-temporal feature vectors that match the dominant subspace observed in the two-dimensional data without needs for a forward propagation model and AP templates. The proposed MSD approach provides more discriminatory features for unsupervised AP sorting applications. Copyright © 2015 Elsevier B.V. All rights reserved.
A stereo remote sensing feature selection method based on artificial bee colony algorithm
NASA Astrophysics Data System (ADS)
Yan, Yiming; Liu, Pigang; Zhang, Ye; Su, Nan; Tian, Shu; Gao, Fengjiao; Shen, Yi
2014-05-01
To improve the efficiency of stereo information for remote sensing classification, a stereo remote sensing feature selection method is proposed in this paper presents, which is based on artificial bee colony algorithm. Remote sensing stereo information could be described by digital surface model (DSM) and optical image, which contain information of the three-dimensional structure and optical characteristics, respectively. Firstly, three-dimensional structure characteristic could be analyzed by 3D-Zernike descriptors (3DZD). However, different parameters of 3DZD could descript different complexity of three-dimensional structure, and it needs to be better optimized selected for various objects on the ground. Secondly, features for representing optical characteristic also need to be optimized. If not properly handled, when a stereo feature vector composed of 3DZD and image features, that would be a lot of redundant information, and the redundant information may not improve the classification accuracy, even cause adverse effects. To reduce information redundancy while maintaining or improving the classification accuracy, an optimized frame for this stereo feature selection problem is created, and artificial bee colony algorithm is introduced for solving this optimization problem. Experimental results show that the proposed method can effectively improve the computational efficiency, improve the classification accuracy.
Woldegebriel, Michael; Derks, Eduard
2017-01-17
In this work, a novel probabilistic untargeted feature detection algorithm for liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) using artificial neural network (ANN) is presented. The feature detection process is approached as a pattern recognition problem, and thus, ANN was utilized as an efficient feature recognition tool. Unlike most existing feature detection algorithms, with this approach, any suspected chromatographic profile (i.e., shape of a peak) can easily be incorporated by training the network, avoiding the need to perform computationally expensive regression methods with specific mathematical models. In addition, with this method, we have shown that the high-resolution raw data can be fully utilized without applying any arbitrary thresholds or data reduction, therefore improving the sensitivity of the method for compound identification purposes. Furthermore, opposed to existing deterministic (binary) approaches, this method rather estimates the probability of a feature being present/absent at a given point of interest, thus giving chance for all data points to be propagated down the data analysis pipeline, weighed with their probability. The algorithm was tested with data sets generated from spiked samples in forensic and food safety context and has shown promising results by detecting features for all compounds in a computationally reasonable time.
Poor textural image tie point matching via graph theory
NASA Astrophysics Data System (ADS)
Yuan, Xiuxiao; Chen, Shiyu; Yuan, Wei; Cai, Yang
2017-07-01
Feature matching aims to find corresponding points to serve as tie points between images. Robust matching is still a challenging task when input images are characterized by low contrast or contain repetitive patterns, occlusions, or homogeneous textures. In this paper, a novel feature matching algorithm based on graph theory is proposed. This algorithm integrates both geometric and radiometric constraints into an edge-weighted (EW) affinity tensor. Tie points are then obtained by high-order graph matching. Four pairs of poor textural images covering forests, deserts, bare lands, and urban areas are tested. For comparison, three state-of-the-art matching techniques, namely, scale-invariant feature transform (SIFT), speeded up robust features (SURF), and features from accelerated segment test (FAST), are also used. The experimental results show that the matching recall obtained by SIFT, SURF, and FAST varies from 0 to 35% in different types of poor textures. However, through the integration of both geometry and radiometry and the EW strategy, the recall obtained by the proposed algorithm is better than 50% in all four image pairs. The better matching recall improves the number of correct matches, dispersion, and positional accuracy.
Intrusion detection using rough set classification.
Zhang, Lian-hua; Zhang, Guan-hua; Zhang, Jie; Bai, Ying-cai
2004-09-01
Recently machine learning-based intrusion detection approaches have been subjected to extensive researches because they can detect both misuse and anomaly. In this paper, rough set classification (RSC), a modern learning algorithm, is used to rank the features extracted for detecting intrusions and generate intrusion detection models. Feature ranking is a very critical step when building the model. RSC performs feature ranking before generating rules, and converts the feature ranking to minimal hitting set problem addressed by using genetic algorithm (GA). This is done in classical approaches using Support Vector Machine (SVM) by executing many iterations, each of which removes one useless feature. Compared with those methods, our method can avoid many iterations. In addition, a hybrid genetic algorithm is proposed to increase the convergence speed and decrease the training time of RSC. The models generated by RSC take the form of "IF-THEN" rules, which have the advantage of explication. Tests and comparison of RSC with SVM on DARPA benchmark data showed that for Probe and DoS attacks both RSC and SVM yielded highly accurate results (greater than 99% accuracy on testing set).
Targeted Feature Detection for Data-Dependent Shotgun Proteomics
2017-01-01
Label-free quantification of shotgun LC–MS/MS data is the prevailing approach in quantitative proteomics but remains computationally nontrivial. The central data analysis step is the detection of peptide-specific signal patterns, called features. Peptide quantification is facilitated by associating signal intensities in features with peptide sequences derived from MS2 spectra; however, missing values due to imperfect feature detection are a common problem. A feature detection approach that directly targets identified peptides (minimizing missing values) but also offers robustness against false-positive features (by assigning meaningful confidence scores) would thus be highly desirable. We developed a new feature detection algorithm within the OpenMS software framework, leveraging ideas and algorithms from the OpenSWATH toolset for DIA/SRM data analysis. Our software, FeatureFinderIdentification (“FFId”), implements a targeted approach to feature detection based on information from identified peptides. This information is encoded in an MS1 assay library, based on which ion chromatogram extraction and detection of feature candidates are carried out. Significantly, when analyzing data from experiments comprising multiple samples, our approach distinguishes between “internal” and “external” (inferred) peptide identifications (IDs) for each sample. On the basis of internal IDs, two sets of positive (true) and negative (decoy) feature candidates are defined. A support vector machine (SVM) classifier is then trained to discriminate between the sets and is subsequently applied to the “uncertain” feature candidates from external IDs, facilitating selection and confidence scoring of the best feature candidate for each peptide. This approach also enables our algorithm to estimate the false discovery rate (FDR) of the feature selection step. We validated FFId based on a public benchmark data set, comprising a yeast cell lysate spiked with protein standards that provide a known ground-truth. The algorithm reached almost complete (>99%) quantification coverage for the full set of peptides identified at 1% FDR (PSM level). Compared with other software solutions for label-free quantification, this is an outstanding result, which was achieved at competitive quantification accuracy and reproducibility across replicates. The FDR for the feature selection was estimated at a low 1.5% on average per sample (3% for features inferred from external peptide IDs). The FFId software is open-source and freely available as part of OpenMS (www.openms.org). PMID:28673088
Targeted Feature Detection for Data-Dependent Shotgun Proteomics.
Weisser, Hendrik; Choudhary, Jyoti S
2017-08-04
Label-free quantification of shotgun LC-MS/MS data is the prevailing approach in quantitative proteomics but remains computationally nontrivial. The central data analysis step is the detection of peptide-specific signal patterns, called features. Peptide quantification is facilitated by associating signal intensities in features with peptide sequences derived from MS2 spectra; however, missing values due to imperfect feature detection are a common problem. A feature detection approach that directly targets identified peptides (minimizing missing values) but also offers robustness against false-positive features (by assigning meaningful confidence scores) would thus be highly desirable. We developed a new feature detection algorithm within the OpenMS software framework, leveraging ideas and algorithms from the OpenSWATH toolset for DIA/SRM data analysis. Our software, FeatureFinderIdentification ("FFId"), implements a targeted approach to feature detection based on information from identified peptides. This information is encoded in an MS1 assay library, based on which ion chromatogram extraction and detection of feature candidates are carried out. Significantly, when analyzing data from experiments comprising multiple samples, our approach distinguishes between "internal" and "external" (inferred) peptide identifications (IDs) for each sample. On the basis of internal IDs, two sets of positive (true) and negative (decoy) feature candidates are defined. A support vector machine (SVM) classifier is then trained to discriminate between the sets and is subsequently applied to the "uncertain" feature candidates from external IDs, facilitating selection and confidence scoring of the best feature candidate for each peptide. This approach also enables our algorithm to estimate the false discovery rate (FDR) of the feature selection step. We validated FFId based on a public benchmark data set, comprising a yeast cell lysate spiked with protein standards that provide a known ground-truth. The algorithm reached almost complete (>99%) quantification coverage for the full set of peptides identified at 1% FDR (PSM level). Compared with other software solutions for label-free quantification, this is an outstanding result, which was achieved at competitive quantification accuracy and reproducibility across replicates. The FDR for the feature selection was estimated at a low 1.5% on average per sample (3% for features inferred from external peptide IDs). The FFId software is open-source and freely available as part of OpenMS ( www.openms.org ).
Automatic parameter selection for feature-based multi-sensor image registration
NASA Astrophysics Data System (ADS)
DelMarco, Stephen; Tom, Victor; Webb, Helen; Chao, Alan
2006-05-01
Accurate image registration is critical for applications such as precision targeting, geo-location, change-detection, surveillance, and remote sensing. However, the increasing volume of image data is exceeding the current capacity of human analysts to perform manual registration. This image data glut necessitates the development of automated approaches to image registration, including algorithm parameter value selection. Proper parameter value selection is crucial to the success of registration techniques. The appropriate algorithm parameters can be highly scene and sensor dependent. Therefore, robust algorithm parameter value selection approaches are a critical component of an end-to-end image registration algorithm. In previous work, we developed a general framework for multisensor image registration which includes feature-based registration approaches. In this work we examine the problem of automated parameter selection. We apply the automated parameter selection approach of Yitzhaky and Peli to select parameters for feature-based registration of multisensor image data. The approach consists of generating multiple feature-detected images by sweeping over parameter combinations and using these images to generate estimated ground truth. The feature-detected images are compared to the estimated ground truth images to generate ROC points associated with each parameter combination. We develop a strategy for selecting the optimal parameter set by choosing the parameter combination corresponding to the optimal ROC point. We present numerical results showing the effectiveness of the approach using registration of collected SAR data to reference EO data.
Wang, Shuaiqun; Aorigele; Kong, Wei; Zeng, Weiming; Hong, Xiaomin
2016-01-01
Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features over a large number of gene expression data. Lately, many researchers devote themselves to feature selection using diverse computational intelligence methods. However, in the progress of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm HICATS incorporating imperialist competition algorithm (ICA) which performs global search and tabu search (TS) that conducts fine-tuned search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works including the conventional version of binary optimization algorithm in terms of classification accuracy and the number of selected genes.
Aorigele; Zeng, Weiming; Hong, Xiaomin
2016-01-01
Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features over a large number of gene expression data. Lately, many researchers devote themselves to feature selection using diverse computational intelligence methods. However, in the progress of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm HICATS incorporating imperialist competition algorithm (ICA) which performs global search and tabu search (TS) that conducts fine-tuned search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works including the conventional version of binary optimization algorithm in terms of classification accuracy and the number of selected genes. PMID:27579323
Real-time image annotation by manifold-based biased Fisher discriminant analysis
NASA Astrophysics Data System (ADS)
Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming
2008-01-01
Automatic Linguistic Annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-art annotation algorithms: 1. The Small Sample Size (3S) problem in keyword classifier/model learning; 2. Most of annotation algorithms can not extend to real-time online usage due to their low computational efficiencies. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based Manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Bias Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-art annotation methods (1.K-NN Expansion; 2.One-to-All SVM; 3.PWC-SVM) in both computational time and annotation accuracy with a large margin.
A real negative selection algorithm with evolutionary preference for anomaly detection
NASA Astrophysics Data System (ADS)
Yang, Tao; Chen, Wen; Li, Tao
2017-04-01
Traditional real negative selection algorithms (RNSAs) adopt the estimated coverage (c0) as the algorithm termination threshold, and generate detectors randomly. With increasing dimensions, the data samples could reside in the low-dimensional subspace, so that the traditional detectors cannot effectively distinguish these samples. Furthermore, in high-dimensional feature space, c0 cannot exactly reflect the detectors set coverage rate for the nonself space, and it could lead the algorithm to be terminated unexpectedly when the number of detectors is insufficient. These shortcomings make the traditional RNSAs to perform poorly in high-dimensional feature space. Based upon "evolutionary preference" theory in immunology, this paper presents a real negative selection algorithm with evolutionary preference (RNSAP). RNSAP utilizes the "unknown nonself space", "low-dimensional target subspace" and "known nonself feature" as the evolutionary preference to guide the generation of detectors, thus ensuring the detectors can cover the nonself space more effectively. Besides, RNSAP uses redundancy to replace c0 as the termination threshold, in this way RNSAP can generate adequate detectors under a proper convergence rate. The theoretical analysis and experimental result demonstrate that, compared to the classical RNSA (V-detector), RNSAP can achieve a higher detection rate, but with less detectors and computing cost.
Batool, Nazre; Chellappa, Rama
2014-09-01
Facial retouching is widely used in media and entertainment industry. Professional software usually require a minimum level of user expertise to achieve the desirable results. In this paper, we present an algorithm to detect facial wrinkles/imperfection. We believe that any such algorithm would be amenable to facial retouching applications. The detection of wrinkles/imperfections can allow these skin features to be processed differently than the surrounding skin without much user interaction. For detection, Gabor filter responses along with texture orientation field are used as image features. A bimodal Gaussian mixture model (GMM) represents distributions of Gabor features of normal skin versus skin imperfections. Then, a Markov random field model is used to incorporate the spatial relationships among neighboring pixels for their GMM distributions and texture orientations. An expectation-maximization algorithm then classifies skin versus skin wrinkles/imperfections. Once detected automatically, wrinkles/imperfections are removed completely instead of being blended or blurred. We propose an exemplar-based constrained texture synthesis algorithm to inpaint irregularly shaped gaps left by the removal of detected wrinkles/imperfections. We present results conducted on images downloaded from the Internet to show the efficacy of our algorithms.
Laser vibrometry exploitation for vehicle identification
NASA Astrophysics Data System (ADS)
Nolan, Adam; Lingg, Andrew; Goley, Steve; Sigmund, Kevin; Kangas, Scott
2014-06-01
Vibration signatures sensed from distant vehicles using laser vibrometry systems provide valuable information that may be used to help identify key vehicle features such as engine type, engine speed, and number of cylinders. Through the use of physics models of the vibration phenomenology, features are chosen to support classification algorithms. Various individual exploitation algorithms were developed using these models to classify vibration signatures into engine type (piston vs. turbine), engine configuration (Inline 4 vs. Inline 6 vs. V6 vs. V8 vs. V12) and vehicle type. The results of these algorithms will be presented for an 8 class problem. Finally, the benefits of using a factor graph representation to link these independent algorithms together will be presented which constructs a classification hierarchy for the vibration exploitation problem.
Chitalia, Rhea; Mueller, Jenna; Fu, Henry L; Whitley, Melodi Javid; Kirsch, David G; Brown, J Quincy; Willett, Rebecca; Ramanujam, Nimmi
2016-09-01
Fluorescence microscopy can be used to acquire real-time images of tissue morphology and with appropriate algorithms can rapidly quantify features associated with disease. The objective of this study was to assess the ability of various segmentation algorithms to isolate fluorescent positive features (FPFs) in heterogeneous images and identify an approach that can be used across multiple fluorescence microscopes with minimal tuning between systems. Specifically, we show a variety of image segmentation algorithms applied to images of stained tumor and muscle tissue acquired with 3 different fluorescence microscopes. Results indicate that a technique called maximally stable extremal regions followed by thresholding (MSER + Binary) yielded the greatest contrast in FPF density between tumor and muscle images across multiple microscopy systems.
Computer-aided US diagnosis of breast lesions by using cell-based contour grouping.
Cheng, Jie-Zhi; Chou, Yi-Hong; Huang, Chiun-Sheng; Chang, Yeun-Chung; Tiu, Chui-Mei; Chen, Kuei-Wu; Chen, Chung-Ming
2010-06-01
To develop a computer-aided diagnostic algorithm with automatic boundary delineation for differential diagnosis of benign and malignant breast lesions at ultrasonography (US) and investigate the effect of boundary quality on the performance of a computer-aided diagnostic algorithm. This was an institutional review board-approved retrospective study with waiver of informed consent. A cell-based contour grouping (CBCG) segmentation algorithm was used to delineate the lesion boundaries automatically. Seven morphologic features were extracted. The classifier was a logistic regression function. Five hundred twenty breast US scans were obtained from 520 subjects (age range, 15-89 years), including 275 benign (mean size, 15 mm; range, 5-35 mm) and 245 malignant (mean size, 18 mm; range, 8-29 mm) lesions. The newly developed computer-aided diagnostic algorithm was evaluated on the basis of boundary quality and differentiation performance. The segmentation algorithms and features in two conventional computer-aided diagnostic algorithms were used for comparative study. The CBCG-generated boundaries were shown to be comparable with the manually delineated boundaries. The area under the receiver operating characteristic curve (AUC) and differentiation accuracy were 0.968 +/- 0.010 and 93.1% +/- 0.7, respectively, for all 520 breast lesions. At the 5% significance level, the newly developed algorithm was shown to be superior to the use of the boundaries and features of the two conventional computer-aided diagnostic algorithms in terms of AUC (0.974 +/- 0.007 versus 0.890 +/- 0.008 and 0.788 +/- 0.024, respectively). The newly developed computer-aided diagnostic algorithm that used a CBCG segmentation method to measure boundaries achieved a high differentiation performance. Copyright RSNA, 2010
Logsdon, Benjamin A.; Mezey, Jason
2010-01-01
Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1), and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data. PMID:21152011
Object-oriented feature-tracking algorithms for SAR images of the marginal ice zone
NASA Technical Reports Server (NTRS)
Daida, Jason; Samadani, Ramin; Vesecky, John F.
1990-01-01
An unsupervised method that chooses and applies the most appropriate tracking algorithm from among different sea-ice tracking algorithms is reported. In contrast to current unsupervised methods, this method chooses and applies an algorithm by partially examining a sequential image pair to draw inferences about what was examined. Based on these inferences the reported method subsequently chooses which algorithm to apply to specific areas of the image pair where that algorithm should work best.
NASA Technical Reports Server (NTRS)
Arduini, R. F.; Aherron, R. M.; Samms, R. W.
1984-01-01
A computational model of the deterministic and stochastic processes involved in multispectral remote sensing was designed to evaluate the performance of sensor systems and data processing algorithms for spectral feature classification. Accuracy in distinguishing between categories of surfaces or between specific types is developed as a means to compare sensor systems and data processing algorithms. The model allows studies to be made of the effects of variability of the atmosphere and of surface reflectance, as well as the effects of channel selection and sensor noise. Examples of these effects are shown.
Textual and shape-based feature extraction and neuro-fuzzy classifier for nuclear track recognition
NASA Astrophysics Data System (ADS)
Khayat, Omid; Afarideh, Hossein
2013-04-01
Track counting algorithms as one of the fundamental principles of nuclear science have been emphasized in the recent years. Accurate measurement of nuclear tracks on solid-state nuclear track detectors is the aim of track counting systems. Commonly track counting systems comprise a hardware system for the task of imaging and software for analysing the track images. In this paper, a track recognition algorithm based on 12 defined textual and shape-based features and a neuro-fuzzy classifier is proposed. Features are defined so as to discern the tracks from the background and small objects. Then, according to the defined features, tracks are detected using a trained neuro-fuzzy system. Features and the classifier are finally validated via 100 Alpha track images and 40 training samples. It is shown that principle textual and shape-based features concomitantly yield a high rate of track detection compared with the single-feature based methods.
Multi-Sensor Registration of Earth Remotely Sensed Imagery
NASA Technical Reports Server (NTRS)
LeMoigne, Jacqueline; Cole-Rhodes, Arlene; Eastman, Roger; Johnson, Kisha; Morisette, Jeffrey; Netanyahu, Nathan S.; Stone, Harold S.; Zavorin, Ilya; Zukor, Dorothy (Technical Monitor)
2001-01-01
Assuming that approximate registration is given within a few pixels by a systematic correction system, we develop automatic image registration methods for multi-sensor data with the goal of achieving sub-pixel accuracy. Automatic image registration is usually defined by three steps; feature extraction, feature matching, and data resampling or fusion. Our previous work focused on image correlation methods based on the use of different features. In this paper, we study different feature matching techniques and present five algorithms where the features are either original gray levels or wavelet-like features, and the feature matching is based on gradient descent optimization, statistical robust matching, and mutual information. These algorithms are tested and compared on several multi-sensor datasets covering one of the EOS Core Sites, the Konza Prairie in Kansas, from four different sensors: IKONOS (4m), Landsat-7/ETM+ (30m), MODIS (500m), and SeaWIFS (1000m).
ECG Based Heart Arrhythmia Detection Using Wavelet Coherence and Bat Algorithm
NASA Astrophysics Data System (ADS)
Kora, Padmavathi; Sri Rama Krishna, K.
2016-12-01
Atrial fibrillation (AF) is a type of heart abnormality, during the AF electrical discharges in the atrium are rapid, results in abnormal heart beat. The morphology of ECG changes due to the abnormalities in the heart. This paper consists of three major steps for the detection of heart diseases: signal pre-processing, feature extraction and classification. Feature extraction is the key process in detecting the heart abnormality. Most of the ECG detection systems depend on the time domain features for cardiac signal classification. In this paper we proposed a wavelet coherence (WTC) technique for ECG signal analysis. The WTC calculates the similarity between two waveforms in frequency domain. Parameters extracted from WTC function is used as the features of the ECG signal. These features are optimized using Bat algorithm. The Levenberg Marquardt neural network classifier is used to classify the optimized features. The performance of the classifier can be improved with the optimized features.
Morgan, R; Gallagher, M
2012-01-01
In this paper we extend a previously proposed randomized landscape generator in combination with a comparative experimental methodology to study the behavior of continuous metaheuristic optimization algorithms. In particular, we generate two-dimensional landscapes with parameterized, linear ridge structure, and perform pairwise comparisons of algorithms to gain insight into what kind of problems are easy and difficult for one algorithm instance relative to another. We apply this methodology to investigate the specific issue of explicit dependency modeling in simple continuous estimation of distribution algorithms. Experimental results reveal specific examples of landscapes (with certain identifiable features) where dependency modeling is useful, harmful, or has little impact on mean algorithm performance. Heat maps are used to compare algorithm performance over a large number of landscape instances and algorithm trials. Finally, we perform a meta-search in the landscape parameter space to find landscapes which maximize the performance between algorithms. The results are related to some previous intuition about the behavior of these algorithms, but at the same time lead to new insights into the relationship between dependency modeling in EDAs and the structure of the problem landscape. The landscape generator and overall methodology are quite general and extendable and can be used to examine specific features of other algorithms.
A comparative intelligibility study of single-microphone noise reduction algorithms.
Hu, Yi; Loizou, Philipos C
2007-09-01
The evaluation of intelligibility of noise reduction algorithms is reported. IEEE sentences and consonants were corrupted by four types of noise including babble, car, street and train at two signal-to-noise ratio levels (0 and 5 dB), and then processed by eight speech enhancement methods encompassing four classes of algorithms: spectral subtractive, sub-space, statistical model based and Wiener-type algorithms. The enhanced speech was presented to normal-hearing listeners for identification. With the exception of a single noise condition, no algorithm produced significant improvements in speech intelligibility. Information transmission analysis of the consonant confusion matrices indicated that no algorithm improved significantly the place feature score, significantly, which is critically important for speech recognition. The algorithms which were found in previous studies to perform the best in terms of overall quality, were not the same algorithms that performed the best in terms of speech intelligibility. The subspace algorithm, for instance, was previously found to perform the worst in terms of overall quality, but performed well in the present study in terms of preserving speech intelligibility. Overall, the analysis of consonant confusion matrices suggests that in order for noise reduction algorithms to improve speech intelligibility, they need to improve the place and manner feature scores.
Stratiform/convective rain delineation for TRMM microwave imager
NASA Astrophysics Data System (ADS)
Islam, Tanvir; Srivastava, Prashant K.; Dai, Qiang; Gupta, Manika; Wan Jaafar, Wan Zurina
2015-10-01
This article investigates the potential for using machine learning algorithms to delineate stratiform/convective (S/C) rain regimes for passive microwave imager taking calibrated brightness temperatures as only spectral parameters. The algorithms have been implemented for the Tropical Rainfall Measuring Mission (TRMM) microwave imager (TMI), and calibrated as well as validated taking the Precipitation Radar (PR) S/C information as the target class variables. Two different algorithms are particularly explored for the delineation. The first one is metaheuristic adaptive boosting algorithm that includes the real, gentle, and modest versions of the AdaBoost. The second one is the classical linear discriminant analysis that includes the Fisher's and penalized versions of the linear discriminant analysis. Furthermore, prior to the development of the delineation algorithms, a feature selection analysis has been conducted for a total of 85 features, which contains the combinations of brightness temperatures from 10 GHz to 85 GHz and some derived indexes, such as scattering index, polarization corrected temperature, and polarization difference with the help of mutual information aided minimal redundancy maximal relevance criterion (mRMR). It has been found that the polarization corrected temperature at 85 GHz and the features derived from the "addition" operator associated with the 85 GHz channels have good statistical dependency to the S/C target class variables. Further, it has been shown how the mRMR feature selection technique helps to reduce the number of features without deteriorating the results when applying through the machine learning algorithms. The proposed scheme is able to delineate the S/C rain regimes with reasonable accuracy. Based on the statistical validation experience from the validation period, the Matthews correlation coefficients are in the range of 0.60-0.70. Since, the proposed method does not rely on any a priori information, this makes it very suitable for other microwave sensors having similar channels to the TMI. The method could possibly benefit the constellation sensors in the Global Precipitation Measurement (GPM) mission era.
Onboard Robust Visual Tracking for UAVs Using a Reliable Global-Local Object Model
Fu, Changhong; Duan, Ran; Kircali, Dogan; Kayacan, Erdal
2016-01-01
In this paper, we present a novel onboard robust visual algorithm for long-term arbitrary 2D and 3D object tracking using a reliable global-local object model for unmanned aerial vehicle (UAV) applications, e.g., autonomous tracking and chasing a moving target. The first main approach in this novel algorithm is the use of a global matching and local tracking approach. In other words, the algorithm initially finds feature correspondences in a way that an improved binary descriptor is developed for global feature matching and an iterative Lucas–Kanade optical flow algorithm is employed for local feature tracking. The second main module is the use of an efficient local geometric filter (LGF), which handles outlier feature correspondences based on a new forward-backward pairwise dissimilarity measure, thereby maintaining pairwise geometric consistency. In the proposed LGF module, a hierarchical agglomerative clustering, i.e., bottom-up aggregation, is applied using an effective single-link method. The third proposed module is a heuristic local outlier factor (to the best of our knowledge, it is utilized for the first time to deal with outlier features in a visual tracking application), which further maximizes the representation of the target object in which we formulate outlier feature detection as a binary classification problem with the output features of the LGF module. Extensive UAV flight experiments show that the proposed visual tracker achieves real-time frame rates of more than thirty-five frames per second on an i7 processor with 640 × 512 image resolution and outperforms the most popular state-of-the-art trackers favorably in terms of robustness, efficiency and accuracy. PMID:27589769
A Systematic Investigation of Computation Models for Predicting Adverse Drug Reactions (ADRs)
Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong
2014-01-01
Background Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. Principal Findings In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been earlier used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, final formulas of these algorithms were all converted to linear model in form, based on this finding we propose a new algorithm called the general weighted profile method and it yielded the best overall performance among the algorithms investigated in this paper. Conclusion Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms. PMID:25180585
Multifeature-based high-resolution palmprint recognition.
Dai, Jifeng; Zhou, Jie
2011-05-01
Palmprint is a promising biometric feature for use in access control and forensic applications. Previous research on palmprint recognition mainly concentrates on low-resolution (about 100 ppi) palmprints. But for high-security applications (e.g., forensic usage), high-resolution palmprints (500 ppi or higher) are required from which more useful information can be extracted. In this paper, we propose a novel recognition algorithm for high-resolution palmprint. The main contributions of the proposed algorithm include the following: 1) use of multiple features, namely, minutiae, density, orientation, and principal lines, for palmprint recognition to significantly improve the matching performance of the conventional algorithm. 2) Design of a quality-based and adaptive orientation field estimation algorithm which performs better than the existing algorithm in case of regions with a large number of creases. 3) Use of a novel fusion scheme for an identification application which performs better than conventional fusion methods, e.g., weighted sum rule, SVMs, or Neyman-Pearson rule. Besides, we analyze the discriminative power of different feature combinations and find that density is very useful for palmprint recognition. Experimental results on the database containing 14,576 full palmprints show that the proposed algorithm has achieved a good performance. In the case of verification, the recognition system's False Rejection Rate (FRR) is 16 percent, which is 17 percent lower than the best existing algorithm at a False Acceptance Rate (FAR) of 10(-5), while in the identification experiment, the rank-1 live-scan partial palmprint recognition rate is improved from 82.0 to 91.7 percent.
O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John Bo
2008-10-29
We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024-1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581-590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6 degrees C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, epsilon of 0.21) and an RMSE of 45.1 degrees C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3 degrees C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5 degrees C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
NASA Astrophysics Data System (ADS)
Lestari, A. W.; Rustam, Z.
2017-07-01
In the last decade, breast cancer has become the focus of world attention as this disease is one of the primary leading cause of death for women. Therefore, it is necessary to have the correct precautions and treatment. In previous studies, Fuzzy Kennel K-Medoid algorithm has been used for multi-class data. This paper proposes an algorithm to classify the high dimensional data of breast cancer using Fuzzy Possibilistic C-means (FPCM) and a new method based on clustering analysis using Normed Kernel Function-Based Fuzzy Possibilistic C-Means (NKFPCM). The objective of this paper is to obtain the best accuracy in classification of breast cancer data. In order to improve the accuracy of the two methods, the features candidates are evaluated using feature selection, where Laplacian Score is used. The results show the comparison accuracy and running time of FPCM and NKFPCM with and without feature selection.
Statistical process control using optimized neural networks: a case study.
Addeh, Jalil; Ebrahimzadeh, Ata; Azarbad, Milad; Ranaee, Vahid
2014-09-01
The most common statistical process control (SPC) tools employed for monitoring process changes are control charts. A control chart demonstrates that the process has altered by generating an out-of-control signal. This study investigates the design of an accurate system for the control chart patterns (CCPs) recognition in two aspects. First, an efficient system is introduced that includes two main modules: feature extraction module and classifier module. In the feature extraction module, a proper set of shape features and statistical feature are proposed as the efficient characteristics of the patterns. In the classifier module, several neural networks, such as multilayer perceptron, probabilistic neural network and radial basis function are investigated. Based on an experimental study, the best classifier is chosen in order to recognize the CCPs. Second, a hybrid heuristic recognition system is introduced based on cuckoo optimization algorithm (COA) algorithm to improve the generalization performance of the classifier. The simulation results show that the proposed algorithm has high recognition accuracy. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
Detection of faults in rotating machinery using periodic time-frequency sparsity
NASA Astrophysics Data System (ADS)
Ding, Yin; He, Wangpeng; Chen, Binqiang; Zi, Yanyang; Selesnick, Ivan W.
2016-11-01
This paper addresses the problem of extracting periodic oscillatory features in vibration signals for detecting faults in rotating machinery. To extract the feature, we propose an approach in the short-time Fourier transform (STFT) domain where the periodic oscillatory feature manifests itself as a relatively sparse grid. To estimate the sparse grid, we formulate an optimization problem using customized binary weights in the regularizer, where the weights are formulated to promote periodicity. In order to solve the proposed optimization problem, we develop an algorithm called augmented Lagrangian majorization-minimization algorithm, which combines the split augmented Lagrangian shrinkage algorithm (SALSA) with majorization-minimization (MM), and is guaranteed to converge for both convex and non-convex formulation. As examples, the proposed approach is applied to simulated data, and used as a tool for diagnosing faults in bearings and gearboxes for real data, and compared to some state-of-the-art methods. The results show that the proposed approach can effectively detect and extract the periodical oscillatory features.
NASA Astrophysics Data System (ADS)
Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing
2018-02-01
For the problems of missing details and performance of the colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm can achieve a natural colorized effect for a gray-scale image, and it is consistent with the human vision. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy rate of the classification, the corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement based on Laplacian Pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only realizes the colorization of the visual gray-scale image, but also can be applied to the other areas, such as color transfer between color images, colorizing gray fusion images, and infrared images.
Luo, Junhai; Fu, Liang
2017-06-09
With the development of communication technology, the demand for location-based services is growing rapidly. This paper presents an algorithm for indoor localization based on Received Signal Strength (RSS), which is collected from Access Points (APs). The proposed localization algorithm contains the offline information acquisition phase and online positioning phase. Firstly, the AP selection algorithm is reviewed and improved based on the stability of signals to remove useless AP; secondly, Kernel Principal Component Analysis (KPCA) is analyzed and used to remove the data redundancy and maintain useful characteristics for nonlinear feature extraction; thirdly, the Affinity Propagation Clustering (APC) algorithm utilizes RSS values to classify data samples and narrow the positioning range. In the online positioning phase, the classified data will be matched with the testing data to determine the position area, and the Maximum Likelihood (ML) estimate will be employed for precise positioning. Eventually, the proposed algorithm is implemented in a real-world environment for performance evaluation. Experimental results demonstrate that the proposed algorithm improves the accuracy and computational complexity.
Quasi-Epipolar Resampling of High Resolution Satellite Stereo Imagery for Semi Global Matching
NASA Astrophysics Data System (ADS)
Tatar, N.; Saadatseresht, M.; Arefi, H.; Hadavand, A.
2015-12-01
Semi-global matching is a well-known stereo matching algorithm in photogrammetric and computer vision society. Epipolar images are supposed as input of this algorithm. Epipolar geometry of linear array scanners is not a straight line as in case of frame camera. Traditional epipolar resampling algorithms demands for rational polynomial coefficients (RPCs), physical sensor model or ground control points. In this paper we propose a new solution for epipolar resampling method which works without the need for these information. In proposed method, automatic feature extraction algorithms are employed to generate corresponding features for registering stereo pairs. Also original images are divided into small tiles. In this way by omitting the need for extra information, the speed of matching algorithm increased and the need for high temporal memory decreased. Our experiments on GeoEye-1 stereo pair captured over Qom city in Iran demonstrates that the epipolar images are generated with sub-pixel accuracy.
An Automatic Registration Algorithm for 3D Maxillofacial Model
NASA Astrophysics Data System (ADS)
Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng
2016-09-01
3D image registration aims at aligning two 3D data sets in a common coordinate system, which has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown apriori. In this work, we develop an automatic algorithm for 3D maxillofacial models registration including facial surface model and skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial model in spite of ambiguous matching, which has a potential application in the oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
Image based book cover recognition and retrieval
NASA Astrophysics Data System (ADS)
Sukhadan, Kalyani; Vijayarajan, V.; Krishnamoorthi, A.; Bessie Amali, D. Geraldine
2017-11-01
In this we are developing a graphical user interface using MATLAB for the users to check the information related to books in real time. We are taking the photos of the book cover using GUI, then by using MSER algorithm it will automatically detect all the features from the input image, after this it will filter bifurcate non-text features which will be based on morphological difference between text and non-text regions. We implemented a text character alignment algorithm which will improve the accuracy of the original text detection. We will also have a look upon the built in MATLAB OCR recognition algorithm and an open source OCR which is commonly used to perform better detection results, post detection algorithm is implemented and natural language processing to perform word correction and false detection inhibition. Finally, the detection result will be linked to internet to perform online matching. More than 86% accuracy can be obtained by this algorithm.
Zhu, Bohui; Ding, Yongsheng; Hao, Kuangrong
2013-01-01
This paper presents a novel maximum margin clustering method with immune evolution (IEMMC) for automatic diagnosis of electrocardiogram (ECG) arrhythmias. This diagnostic system consists of signal processing, feature extraction, and the IEMMC algorithm for clustering of ECG arrhythmias. First, raw ECG signal is processed by an adaptive ECG filter based on wavelet transforms, and waveform of the ECG signal is detected; then, features are extracted from ECG signal to cluster different types of arrhythmias by the IEMMC algorithm. Three types of performance evaluation indicators are used to assess the effect of the IEMMC method for ECG arrhythmias, such as sensitivity, specificity, and accuracy. Compared with K-means and iterSVR algorithms, the IEMMC algorithm reflects better performance not only in clustering result but also in terms of global search ability and convergence ability, which proves its effectiveness for the detection of ECG arrhythmias. PMID:23690875
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches leading to a poor sorting accuracy especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. The proposed algorithm uses discriminative subspace learning to extract low dimensional and most discriminative features from the spike waveforms and perform clustering with automatic detection of the number of the clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of the clusters. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. By providing more accurate information about the activity of more number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.
NASA Astrophysics Data System (ADS)
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Objective. Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches leading to a poor sorting accuracy especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. Approach. The proposed algorithm uses discriminative subspace learning to extract low dimensional and most discriminative features from the spike waveforms and perform clustering with automatic detection of the number of the clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of the clusters. Main results. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. Significance. By providing more accurate information about the activity of more number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.
Crabtree, Nathaniel M; Moore, Jason H; Bowyer, John F; George, Nysia I
2017-01-01
A computational evolution system (CES) is a knowledge discovery engine that can identify subtle, synergistic relationships in large datasets. Pareto optimization allows CESs to balance accuracy with model complexity when evolving classifiers. Using Pareto optimization, a CES is able to identify a very small number of features while maintaining high classification accuracy. A CES can be designed for various types of data, and the user can exploit expert knowledge about the classification problem in order to improve discrimination between classes. These characteristics give CES an advantage over other classification and feature selection algorithms, particularly when the goal is to identify a small number of highly relevant, non-redundant biomarkers. Previously, CESs have been developed only for binary class datasets. In this study, we developed a multi-class CES. The multi-class CES was compared to three common feature selection and classification algorithms: support vector machine (SVM), random k-nearest neighbor (RKNN), and random forest (RF). The algorithms were evaluated on three distinct multi-class RNA sequencing datasets. The comparison criteria were run-time, classification accuracy, number of selected features, and stability of selected feature set (as measured by the Tanimoto distance). The performance of each algorithm was data-dependent. CES performed best on the dataset with the smallest sample size, indicating that CES has a unique advantage since the accuracy of most classification methods suffer when sample size is small. The multi-class extension of CES increases the appeal of its application to complex, multi-class datasets in order to identify important biomarkers and features.
A method for fast automated microscope image stitching.
Yang, Fan; Deng, Zhen-Sheng; Fan, Qiu-Hong
2013-05-01
Image stitching is an important technology to produce a panorama or larger image by combining several images with overlapped areas. In many biomedical researches, image stitching is highly desirable to acquire a panoramic image which represents large areas of certain structures or whole sections, while retaining microscopic resolution. In this study, we develop a fast normal light microscope image stitching algorithm based on feature extraction. At first, an algorithm of scale-space reconstruction of speeded-up robust features (SURF) was proposed to extract features from the images to be stitched with a short time and higher repeatability. Then, the histogram equalization (HE) method was employed to preprocess the images to enhance their contrast for extracting more features. Thirdly, the rough overlapping zones of the images preprocessed were calculated by phase correlation, and the improved SURF was used to extract the image features in the rough overlapping areas. Fourthly, the features were corresponded by matching algorithm and the transformation parameters were estimated, then the images were blended seamlessly. Finally, this procedure was applied to stitch normal light microscope images to verify its validity. Our experimental results demonstrate that the improved SURF algorithm is very robust to viewpoint, illumination, blur, rotation and zoom of the images and our method is able to stitch microscope images automatically with high precision and high speed. Also, the method proposed in this paper is applicable to registration and stitching of common images as well as stitching the microscope images in the field of virtual microscope for the purpose of observing, exchanging, saving, and establishing a database of microscope images. Copyright © 2013 Elsevier Ltd. All rights reserved.
Photometric Supernova Classification with Machine Learning
NASA Astrophysics Data System (ADS)
Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.
2016-08-01
Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
Selka, F; Nicolau, S; Agnus, V; Bessaid, A; Marescaux, J; Soler, L
2015-03-01
In minimally invasive surgery, the tracking of deformable tissue is a critical component for image-guided applications. Deformation of the tissue can be recovered by tracking features using tissue surface information (texture, color,...). Recent work in this field has shown success in acquiring tissue motion. However, the performance evaluation of detection and tracking algorithms on such images are still difficult and are not standardized. This is mainly due to the lack of ground truth data on real data. Moreover, in order to avoid supplementary techniques to remove outliers, no quantitative work has been undertaken to evaluate the benefit of a pre-process based on image filtering, which can improve feature tracking robustness. In this paper, we propose a methodology to validate detection and feature tracking algorithms, using a trick based on forward-backward tracking that provides an artificial ground truth data. We describe a clear and complete methodology to evaluate and compare different detection and tracking algorithms. In addition, we extend our framework to propose a strategy to identify the best combinations from a set of detector, tracker and pre-process algorithms, according to the live intra-operative data. Experimental results have been performed on in vivo datasets and show that pre-process can have a strong influence on tracking performance and that our strategy to find the best combinations is relevant for a reasonable computation cost. Copyright © 2014 Elsevier Ltd. All rights reserved.
Algorithms for Spectral Decomposition with Applications to Optical Plume Anomaly Detection
NASA Technical Reports Server (NTRS)
Srivastava, Askok N.; Matthews, Bryan; Das, Santanu
2008-01-01
The analysis of spectral signals for features that represent physical phenomenon is ubiquitous in the science and engineering communities. There are two main approaches that can be taken to extract relevant features from these high-dimensional data streams. The first set of approaches relies on extracting features using a physics-based paradigm where the underlying physical mechanism that generates the spectra is used to infer the most important features in the data stream. We focus on a complementary methodology that uses a data-driven technique that is informed by the underlying physics but also has the ability to adapt to unmodeled system attributes and dynamics. We discuss the following four algorithms: Spectral Decomposition Algorithm (SDA), Non-Negative Matrix Factorization (NMF), Independent Component Analysis (ICA) and Principal Components Analysis (PCA) and compare their performance on a spectral emulator which we use to generate artificial data with known statistical properties. This spectral emulator mimics the real-world phenomena arising from the plume of the space shuttle main engine and can be used to validate the results that arise from various spectral decomposition algorithms and is very useful for situations where real-world systems have very low probabilities of fault or failure. Our results indicate that methods like SDA and NMF provide a straightforward way of incorporating prior physical knowledge while NMF with a tuning mechanism can give superior performance on some tests. We demonstrate these algorithms to detect potential system-health issues on data from a spectral emulator with tunable health parameters.
Zhang, Kaihua; Zhang, Lei; Yang, Ming-Hsuan
2014-10-01
It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. Despite much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there does not exist sufficient amount of data for online algorithms to learn at the outset. Second, online tracking algorithms often encounter the drift problems. As a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from a multiscale image feature space with data-independent basis. The proposed appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is constructed to efficiently extract the features for the appearance model. We compress sample images of the foreground target and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. A coarse-to-fine search strategy is adopted to further reduce the computational complexity in the detection procedure. The proposed compressive tracking algorithm runs in real-time and performs favorably against state-of-the-art methods on challenging sequences in terms of efficiency, accuracy and robustness.
Road marking features extraction using the VIAPIX® system
NASA Astrophysics Data System (ADS)
Kaddah, W.; Ouerhani, Y.; Alfalou, A.; Desthieux, M.; Brosseau, C.; Gutierrez, C.
2016-07-01
Precise extraction of road marking features is a critical task for autonomous urban driving, augmented driver assistance, and robotics technologies. In this study, we consider an autonomous system allowing us lane detection for marked urban roads and analysis of their features. The task is to relate the georeferencing of road markings from images obtained using the VIAPIX® system. Based on inverse perspective mapping and color segmentation to detect all white objects existing on this road, the present algorithm enables us to examine these images automatically and rapidly and also to get information on road marks, their surface conditions, and their georeferencing. This algorithm allows detecting all road markings and identifying some of them by making use of a phase-only correlation filter (POF). We illustrate this algorithm and its robustness by applying it to a variety of relevant scenarios.
Automatic morphological classification of galaxy images
Shamir, Lior
2009-01-01
We describe an image analysis supervised learning algorithm that can automatically classify galaxy images. The algorithm is first trained using a manually classified images of elliptical, spiral, and edge-on galaxies. A large set of image features is extracted from each image, and the most informative features are selected using Fisher scores. Test images can then be classified using a simple Weighted Nearest Neighbor rule such that the Fisher scores are used as the feature weights. Experimental results show that galaxy images from Galaxy Zoo can be classified automatically to spiral, elliptical and edge-on galaxies with accuracy of ~90% compared to classifications carried out by the author. Full compilable source code of the algorithm is available for free download, and its general-purpose nature makes it suitable for other uses that involve automatic image analysis of celestial objects. PMID:20161594
Matching CCD images to a stellar catalog using locality-sensitive hashing
NASA Astrophysics Data System (ADS)
Liu, Bo; Yu, Jia-Zong; Peng, Qing-Yu
2018-02-01
The usage of a subset of observed stars in a CCD image to find their corresponding matched stars in a stellar catalog is an important issue in astronomical research. Subgraph isomorphic-based algorithms are the most widely used methods in star catalog matching. When more subgraph features are provided, the CCD images are recognized better. However, when the navigation feature database is large, the method requires more time to match the observing model. To solve this problem, this study investigates further and improves subgraph isomorphic matching algorithms. We present an algorithm based on a locality-sensitive hashing technique, which allocates quadrilateral models in the navigation feature database into different hash buckets and reduces the search range to the bucket in which the observed quadrilateral model is located. Experimental results indicate the effectivity of our method.
Limitations and requirements of content-based multimedia authentication systems
NASA Astrophysics Data System (ADS)
Wu, Chai W.
2001-08-01
Recently, a number of authentication schemes have been proposed for multimedia data such as images and sound data. They include both label based systems and semifragile watermarks. The main requirement for such authentication systems is that minor modifications such as lossy compression which do not alter the content of the data preserve the authenticity of the data, whereas modifications which do modify the content render the data not authentic. These schemes can be classified into two main classes depending on the model of image authentication they are based on. One of the purposes of this paper is to look at some of the advantages and disadvantages of these image authentication schemes and their relationship with fundamental limitations of the underlying model of image authentication. In particular, we study feature-based algorithms which generate an authentication tag based on some inherent features in the image such as the location of edges. The main disadvantage of most proposed feature-based algorithms is that similar images generate similar features, and therefore it is possible for a forger to generate dissimilar images that have the same features. On the other hand, the class of hash-based algorithms utilizes a cryptographic hash function or a digital signature scheme to reduce the data and generate an authentication tag. It inherits the security of digital signatures to thwart forgery attacks. The main disadvantage of hash-based algorithms is that the image needs to be modified in order to be made authenticatable. The amount of modification is on the order of the noise the image can tolerate before it is rendered inauthentic. The other purpose of this paper is to propose a multimedia authentication scheme which combines some of the best features of both classes of algorithms. The proposed scheme utilizes cryptographic hash functions and digital signature schemes and the data does not need to be modified in order to be made authenticatable. Several applications including the authentication of images on CD-ROM and handwritten documents will be discussed.
Lou, Xin Yuan; Sun, Lin Fu
2017-01-01
This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter setting turning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm’s performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in a SVM to perform both parameter setting turning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem. PMID:28369096
Advanced biologically plausible algorithms for low-level image processing
NASA Astrophysics Data System (ADS)
Gusakova, Valentina I.; Podladchikova, Lubov N.; Shaposhnikov, Dmitry G.; Markin, Sergey N.; Golovan, Alexander V.; Lee, Seong-Whan
1999-08-01
At present, in computer vision, the approach based on modeling the biological vision mechanisms is extensively developed. However, up to now, real world image processing has no effective solution in frameworks of both biologically inspired and conventional approaches. Evidently, new algorithms and system architectures based on advanced biological motivation should be developed for solution of computational problems related to this visual task. Basic problems that should be solved for creation of effective artificial visual system to process real world imags are a search for new algorithms of low-level image processing that, in a great extent, determine system performance. In the present paper, the result of psychophysical experiments and several advanced biologically motivated algorithms for low-level processing are presented. These algorithms are based on local space-variant filter, context encoding visual information presented in the center of input window, and automatic detection of perceptually important image fragments. The core of latter algorithm are using local feature conjunctions such as noncolinear oriented segment and composite feature map formation. Developed algorithms were integrated into foveal active vision model, the MARR. It is supposed that proposed algorithms may significantly improve model performance while real world image processing during memorizing, search, and recognition.
NASA Astrophysics Data System (ADS)
Weller, Andrew F.; Harris, Anthony J.; Ware, J. Andrew; Jarvis, Paul S.
2006-11-01
The classification of sedimentary organic matter (OM) images can be improved by determining the saliency of image analysis (IA) features measured from them. Knowing the saliency of IA feature measurements means that only the most significant discriminating features need be used in the classification process. This is an important consideration for classification techniques such as artificial neural networks (ANNs), where too many features can lead to the 'curse of dimensionality'. The classification scheme adopted in this work is a hybrid of morphologically and texturally descriptive features from previous manual classification schemes. Some of these descriptive features are assigned to IA features, along with several others built into the IA software (Halcon) to ensure that a valid cross-section is available. After an image is captured and segmented, a total of 194 features are measured for each particle. To reduce this number to a more manageable magnitude, the SPSS AnswerTree Exhaustive CHAID (χ 2 automatic interaction detector) classification tree algorithm is used to establish each measurement's saliency as a classification discriminator. In the case of continuous data as used here, the F-test is used as opposed to the published algorithm. The F-test checks various statistical hypotheses about the variance of groups of IA feature measurements obtained from the particles to be classified. The aim is to reduce the number of features required to perform the classification without reducing its accuracy. In the best-case scenario, 194 inputs are reduced to 8, with a subsequent multi-layer back-propagation ANN recognition rate of 98.65%. This paper demonstrates the ability of the algorithm to reduce noise, help overcome the curse of dimensionality, and facilitate an understanding of the saliency of IA features as discriminators for sedimentary OM classification.
Motion Cueing Algorithm Development: Human-Centered Linear and Nonlinear Approaches
NASA Technical Reports Server (NTRS)
Houck, Jacob A. (Technical Monitor); Telban, Robert J.; Cardullo, Frank M.
2005-01-01
While the performance of flight simulator motion system hardware has advanced substantially, the development of the motion cueing algorithm, the software that transforms simulated aircraft dynamics into realizable motion commands, has not kept pace. Prior research identified viable features from two algorithms: the nonlinear "adaptive algorithm", and the "optimal algorithm" that incorporates human vestibular models. A novel approach to motion cueing, the "nonlinear algorithm" is introduced that combines features from both approaches. This algorithm is formulated by optimal control, and incorporates a new integrated perception model that includes both visual and vestibular sensation and the interaction between the stimuli. Using a time-varying control law, the matrix Riccati equation is updated in real time by a neurocomputing approach. Preliminary pilot testing resulted in the optimal algorithm incorporating a new otolith model, producing improved motion cues. The nonlinear algorithm vertical mode produced a motion cue with a time-varying washout, sustaining small cues for longer durations and washing out large cues more quickly compared to the optimal algorithm. The inclusion of the integrated perception model improved the responses to longitudinal and lateral cues. False cues observed with the NASA adaptive algorithm were absent. The neurocomputing approach was crucial in that the number of presentations of an input vector could be reduced to meet the real time requirement without degrading the quality of the motion cues.
A method of depth image based human action recognition
NASA Astrophysics Data System (ADS)
Li, Pei; Cheng, Wanli
2017-05-01
In this paper, we propose an action recognition algorithm framework based on human skeleton joint information. In order to extract the feature of human motion, we use the information of body posture, speed and acceleration of movement to construct spatial motion feature that can describe and reflect the joint. On the other hand, we use the classical temporal pyramid matching algorithm to construct temporal feature and describe the motion sequence variation from different time scales. Then, we use bag of words to represent these actions, which is to present every action in the histogram by clustering these extracted feature. Finally, we employ Hidden Markov Model to train and test the extracted motion features. In the experimental part, the correctness and effectiveness of the proposed model are comprehensively verified on two well-known datasets.
Research of facial feature extraction based on MMC
NASA Astrophysics Data System (ADS)
Xue, Donglin; Zhao, Jiufen; Tang, Qinhong; Shi, Shaokun
2017-07-01
Based on the maximum margin criterion (MMC), a new algorithm of statistically uncorrelated optimal discriminant vectors and a new algorithm of orthogonal optimal discriminant vectors for feature extraction were proposed. The purpose of the maximum margin criterion is to maximize the inter-class scatter while simultaneously minimizing the intra-class scatter after the projection. Compared with original MMC method and principal component analysis (PCA) method, the proposed methods are better in terms of reducing or eliminating the statistically correlation between features and improving recognition rate. The experiment results on Olivetti Research Laboratory (ORL) face database shows that the new feature extraction method of statistically uncorrelated maximum margin criterion (SUMMC) are better in terms of recognition rate and stability. Besides, the relations between maximum margin criterion and Fisher criterion for feature extraction were revealed.
Multiscale Anomaly Detection and Image Registration Algorithms for Airborne Landmine Detection
2008-05-01
with the sensed image. The two- dimensional correlation coefficient r for two matrices A and B both of size M ×N is given by r = ∑ m ∑ n (Amn...correlation based method by matching features in a high- dimensional feature- space . The current implementation of the SIFT algorithm uses a brute-force...by repeatedly convolving the image with a Guassian kernel. Each plane of the scale
An on-board pedestrian detection and warning system with features of side pedestrian
NASA Astrophysics Data System (ADS)
Cheng, Ruzhong; Zhao, Yong; Wong, ChupChung; Chan, KwokPo; Xu, Jiayao; Wang, Xin'an
2012-01-01
Automotive Active Safety(AAS) is the main branch of intelligence automobile study and pedestrian detection is the key problem of AAS, because it is related with the casualties of most vehicle accidents. For on-board pedestrian detection algorithms, the main problem is to balance efficiency and accuracy to make the on-board system available in real scenes, so an on-board pedestrian detection and warning system with the algorithm considered the features of side pedestrian is proposed. The system includes two modules, pedestrian detecting and warning module. Haar feature and a cascade of stage classifiers trained by Adaboost are first applied, and then HOG feature and SVM classifier are used to refine false positives. To make these time-consuming algorithms available in real-time use, a divide-window method together with operator context scanning(OCS) method are applied to increase efficiency. To merge the velocity information of the automotive, the distance of the detected pedestrian is also obtained, so the system could judge if there is a potential danger for the pedestrian in the front. With a new dataset captured in urban environment with side pedestrians on zebra, the embedded system and its algorithm perform an on-board available result on side pedestrian detection.
A Robust Linear Feature-Based Procedure for Automated Registration of Point Clouds
Poreba, Martyna; Goulette, François
2015-01-01
With the variety of measurement techniques available on the market today, fusing multi-source complementary information into one dataset is a matter of great interest. Target-based, point-based and feature-based methods are some of the approaches used to place data in a common reference frame by estimating its corresponding transformation parameters. This paper proposes a new linear feature-based method to perform accurate registration of point clouds, either in 2D or 3D. A two-step fast algorithm called Robust Line Matching and Registration (RLMR), which combines coarse and fine registration, was developed. The initial estimate is found from a triplet of conjugate line pairs, selected by a RANSAC algorithm. Then, this transformation is refined using an iterative optimization algorithm. Conjugates of linear features are identified with respect to a similarity metric representing a line-to-line distance. The efficiency and robustness to noise of the proposed method are evaluated and discussed. The algorithm is valid and ensures valuable results when pre-aligned point clouds with the same scale are used. The studies show that the matching accuracy is at least 99.5%. The transformation parameters are also estimated correctly. The error in rotation is better than 2.8% full scale, while the translation error is less than 12.7%. PMID:25594589
Prediction of Effective Drug Combinations by an Improved Naïve Bayesian Algorithm.
Bai, Li-Yue; Dai, Hao; Xu, Qin; Junaid, Muhammad; Peng, Shao-Liang; Zhu, Xiaolei; Xiong, Yi; Wei, Dong-Qing
2018-02-05
Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs in the market, since the experimental methods for identification of effective drug combinations are both labor- and time-consuming. In this study, we conducted systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathway in which the target protein of a drug was involved in, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) were related to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models to predict effective drug combinations by using the individual types of features mentioned above. Our results indicated that the performance of our proposed method was indeed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.
The Performance of Short-Term Heart Rate Variability in the Detection of Congestive Heart Failure
Barros, Allan Kardec; Ohnishi, Noboru
2016-01-01
Congestive heart failure (CHF) is a cardiac disease associated with the decreasing capacity of the cardiac output. It has been shown that the CHF is the main cause of the cardiac death around the world. Some works proposed to discriminate CHF subjects from healthy subjects using either electrocardiogram (ECG) or heart rate variability (HRV) from long-term recordings. In this work, we propose an alternative framework to discriminate CHF from healthy subjects by using HRV short-term intervals based on 256 RR continuous samples. Our framework uses a matching pursuit algorithm based on Gabor functions. From the selected Gabor functions, we derived a set of features that are inputted into a hybrid framework which uses a genetic algorithm and k-nearest neighbour classifier to select a subset of features that has the best classification performance. The performance of the framework is analyzed using both Fantasia and CHF database from Physionet archives which are, respectively, composed of 40 healthy volunteers and 29 subjects. From a set of nonstandard 16 features, the proposed framework reaches an overall accuracy of 100% with five features. Our results suggest that the application of hybrid frameworks whose classifier algorithms are based on genetic algorithms has outperformed well-known classifier methods. PMID:27891509
Weakly supervised classification in high energy physics
Dery, Lucio Mwinmaarong; Nachman, Benjamin; Rubbo, Francesco; ...
2017-05-01
As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. Here, this paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics $-$ quark versus gluon tagging $-$ we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervisedmore » classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.« less
Multistage classification of multispectral Earth observational data: The design approach
NASA Technical Reports Server (NTRS)
Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.
1981-01-01
An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.
Weakly supervised classification in high energy physics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dery, Lucio Mwinmaarong; Nachman, Benjamin; Rubbo, Francesco
As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. Here, this paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics $-$ quark versus gluon tagging $-$ we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervisedmore » classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.« less
Design and implementation of a vision-based hovering and feature tracking algorithm for a quadrotor
NASA Astrophysics Data System (ADS)
Lee, Y. H.; Chahl, J. S.
2016-10-01
This paper demonstrates an approach to the vision-based control of the unmanned quadrotors for hover and object tracking. The algorithms used the Speed Up Robust Features (SURF) algorithm to detect objects. The pose of the object in the image was then calculated in order to pass the pose information to the flight controller. Finally, the flight controller steered the quadrotor to approach the object based on the calculated pose data. The above processes was run using standard onboard resources found in the 3DR Solo quadrotor in an embedded computing environment. The obtained results showed that the algorithm behaved well during its missions, tracking and hovering, although there were significant latencies due to low CPU performance of the onboard image processing system.
Reproducibility of radiomics for deciphering tumor phenotype with imaging
NASA Astrophysics Data System (ADS)
Zhao, Binsheng; Tan, Yongqiang; Tsai, Wei-Yann; Qi, Jing; Xie, Chuanmiao; Lu, Lin; Schwartz, Lawrence H.
2016-03-01
Radiomics (radiogenomics) characterizes tumor phenotypes based on quantitative image features derived from routine radiologic imaging to improve cancer diagnosis, prognosis, prediction and response to therapy. Although radiomic features must be reproducible to qualify as biomarkers for clinical care, little is known about how routine imaging acquisition techniques/parameters affect reproducibility. To begin to fill this knowledge gap, we assessed the reproducibility of a comprehensive, commonly-used set of radiomic features using a unique, same-day repeat computed tomography data set from lung cancer patients. Each scan was reconstructed at 6 imaging settings, varying slice thicknesses (1.25 mm, 2.5 mm and 5 mm) and reconstruction algorithms (sharp, smooth). Reproducibility was assessed using the repeat scans reconstructed at identical imaging setting (6 settings in total). In separate analyses, we explored differences in radiomic features due to different imaging parameters by assessing the agreement of these radiomic features extracted from the repeat scans reconstructed at the same slice thickness but different algorithms (3 settings in total). Our data suggest that radiomic features are reproducible over a wide range of imaging settings. However, smooth and sharp reconstruction algorithms should not be used interchangeably. These findings will raise awareness of the importance of properly setting imaging acquisition parameters in radiomics/radiogenomics research.
Van Landeghem, Sofie; Abeel, Thomas; Saeys, Yvan; Van de Peer, Yves
2010-09-15
In the field of biomolecular text mining, black box behavior of machine learning systems currently limits understanding of the true nature of the predictions. However, feature selection (FS) is capable of identifying the most relevant features in any supervised learning setting, providing insight into the specific properties of the classification algorithm. This allows us to build more accurate classifiers while at the same time bridging the gap between the black box behavior and the end-user who has to interpret the results. We show that our FS methodology successfully discards a large fraction of machine-generated features, improving classification performance of state-of-the-art text mining algorithms. Furthermore, we illustrate how FS can be applied to gain understanding in the predictions of a framework for biomolecular event extraction from text. We include numerous examples of highly discriminative features that model either biological reality or common linguistic constructs. Finally, we discuss a number of insights from our FS analyses that will provide the opportunity to considerably improve upon current text mining tools. The FS algorithms and classifiers are available in Java-ML (http://java-ml.sf.net). The datasets are publicly available from the BioNLP'09 Shared Task web site (http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/SharedTask/).
node2vec: Scalable Feature Learning for Networks
Grover, Aditya; Leskovec, Jure
2016-01-01
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node’s network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks. PMID:27853626
NASA Astrophysics Data System (ADS)
Jusman, Yessi; Ng, Siew-Cheok; Hasikin, Khairunnisa; Kurnia, Rahmadi; Osman, Noor Azuan Bin Abu; Teoh, Kean Hooi
2016-10-01
The capability of field emission scanning electron microscopy and energy dispersive x-ray spectroscopy (FE-SEM/EDX) to scan material structures at the microlevel and characterize the material with its elemental properties has inspired this research, which has developed an FE-SEM/EDX-based cervical cancer screening system. The developed computer-aided screening system consisted of two parts, which were the automatic features of extraction and classification. For the automatic features extraction algorithm, the image and spectra of cervical cells features extraction algorithm for extracting the discriminant features of FE-SEM/EDX data was introduced. The system automatically extracted two types of features based on FE-SEM/EDX images and FE-SEM/EDX spectra. Textural features were extracted from the FE-SEM/EDX image using a gray level co-occurrence matrix technique, while the FE-SEM/EDX spectra features were calculated based on peak heights and corrected area under the peaks using an algorithm. A discriminant analysis technique was employed to predict the cervical precancerous stage into three classes: normal, low-grade intraepithelial squamous lesion (LSIL), and high-grade intraepithelial squamous lesion (HSIL). The capability of the developed screening system was tested using 700 FE-SEM/EDX spectra (300 normal, 200 LSIL, and 200 HSIL cases). The accuracy, sensitivity, and specificity performances were 98.2%, 99.0%, and 98.0%, respectively.
Demonstration of a 3D vision algorithm for space applications
NASA Technical Reports Server (NTRS)
Defigueiredo, Rui J. P. (Editor)
1987-01-01
This paper reports an extension of the MIAG algorithm for recognition and motion parameter determination of general 3-D polyhedral objects based on model matching techniques and using movement invariants as features of object representation. Results of tests conducted on the algorithm under conditions simulating space conditions are presented.
The Porter Stemming Algorithm: Then and Now
ERIC Educational Resources Information Center
Willett, Peter
2006-01-01
Purpose: In 1980, Porter presented a simple algorithm for stemming English language words. This paper summarises the main features of the algorithm, and highlights its role not just in modern information retrieval research, but also in a range of related subject domains. Design/methodology/approach: Review of literature and research involving use…
Cui, Zaixu; Gong, Gaolang
2018-06-02
Individualized behavioral/cognitive prediction using machine learning (ML) regression approaches is becoming increasingly applied. The specific ML regression algorithm and sample size are two key factors that non-trivially influence prediction accuracies. However, the effects of the ML regression algorithm and sample size on individualized behavioral/cognitive prediction performance have not been comprehensively assessed. To address this issue, the present study included six commonly used ML regression algorithms: ordinary least squares (OLS) regression, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic-net regression, linear support vector regression (LSVR), and relevance vector regression (RVR), to perform specific behavioral/cognitive predictions based on different sample sizes. Specifically, the publicly available resting-state functional MRI (rs-fMRI) dataset from the Human Connectome Project (HCP) was used, and whole-brain resting-state functional connectivity (rsFC) or rsFC strength (rsFCS) were extracted as prediction features. Twenty-five sample sizes (ranged from 20 to 700) were studied by sub-sampling from the entire HCP cohort. The analyses showed that rsFC-based LASSO regression performed remarkably worse than the other algorithms, and rsFCS-based OLS regression performed markedly worse than the other algorithms. Regardless of the algorithm and feature type, both the prediction accuracy and its stability exponentially increased with increasing sample size. The specific patterns of the observed algorithm and sample size effects were well replicated in the prediction using re-testing fMRI data, data processed by different imaging preprocessing schemes, and different behavioral/cognitive scores, thus indicating excellent robustness/generalization of the effects. The current findings provide critical insight into how the selected ML regression algorithm and sample size influence individualized predictions of behavior/cognition and offer important guidance for choosing the ML regression algorithm or sample size in relevant investigations. Copyright © 2018 Elsevier Inc. All rights reserved.
Optimized feature-detection for on-board vision-based surveillance
NASA Astrophysics Data System (ADS)
Gond, Laetitia; Monnin, David; Schneider, Armin
2012-06-01
The detection and matching of robust features in images is an important step in many computer vision applications. In this paper, the importance of the keypoint detection algorithms and their inherent parameters in the particular context of an image-based change detection system for IED detection is studied. Through extensive application-oriented experiments, we draw an evaluation and comparison of the most popular feature detectors proposed by the computer vision community. We analyze how to automatically adjust these algorithms to changing imaging conditions and suggest improvements in order to achieve more exibility and robustness in their practical implementation.
Automatic detection and classification of artifacts in single-channel EEG.
Olund, Thomas; Duun-Henriksen, Jonas; Kjaer, Troels W; Sorensen, Helge B D
2014-01-01
Ambulatory EEG monitoring can provide medical doctors important diagnostic information, without hospitalizing the patient. These recordings are however more exposed to noise and artifacts compared to clinically recorded EEG. An automatic artifact detection and classification algorithm for single-channel EEG is proposed to help identifying these artifacts. Features are extracted from the EEG signal and wavelet subbands. Subsequently a selection algorithm is applied in order to identify the best discriminating features. A non-linear support vector machine is used to discriminate among different artifact classes using the selected features. Single-channel (Fp1-F7) EEG recordings are obtained from experiments with 12 healthy subjects performing artifact inducing movements. The dataset was used to construct and validate the model. Both subject-specific and generic implementation, are investigated. The detection algorithm yield an average sensitivity and specificity above 95% for both the subject-specific and generic models. The classification algorithm show a mean accuracy of 78 and 64% for the subject-specific and generic model, respectively. The classification model was additionally validated on a reference dataset with similar results.
A Fault Recognition System for Gearboxes of Wind Turbines
NASA Astrophysics Data System (ADS)
Yang, Zhiling; Huang, Haiyue; Yin, Zidong
2017-12-01
Costs of maintenance and loss of power generation caused by the faults of wind turbines gearboxes are the main components of operation costs for a wind farm. Therefore, the technology of condition monitoring and fault recognition for wind turbines gearboxes is becoming a hot topic. A condition monitoring and fault recognition system (CMFRS) is presented for CBM of wind turbines gearboxes in this paper. The vibration signals from acceleration sensors at different locations of gearbox and the data from supervisory control and data acquisition (SCADA) system are collected to CMFRS. Then the feature extraction and optimization algorithm is applied to these operational data. Furthermore, to recognize the fault of gearboxes, the GSO-LSSVR algorithm is proposed, combining the least squares support vector regression machine (LSSVR) with the Glowworm Swarm Optimization (GSO) algorithm. Finally, the results show that the fault recognition system used in this paper has a high rate for identifying three states of wind turbines’ gears; besides, the combination of date features can affect the identifying rate and the selection optimization algorithm presented in this paper can get a pretty good date feature subset for the fault recognition.
Feature-based three-dimensional registration for repetitive geometry in machine vision
Gong, Yuanzheng; Seibel, Eric J.
2016-01-01
As an important step in three-dimensional (3D) machine vision, 3D registration is a process of aligning two or multiple 3D point clouds that are collected from different perspectives together into a complete one. The most popular approach to register point clouds is to minimize the difference between these point clouds iteratively by Iterative Closest Point (ICP) algorithm. However, ICP does not work well for repetitive geometries. To solve this problem, a feature-based 3D registration algorithm is proposed to align the point clouds that are generated by vision-based 3D reconstruction. By utilizing texture information of the object and the robustness of image features, 3D correspondences can be retrieved so that the 3D registration of two point clouds is to solve a rigid transformation. The comparison of our method and different ICP algorithms demonstrates that our proposed algorithm is more accurate, efficient and robust for repetitive geometry registration. Moreover, this method can also be used to solve high depth uncertainty problem caused by little camera baseline in vision-based 3D reconstruction. PMID:28286703
Cheng, Jun-Hu; Sun, Da-Wen; Pu, Hongbin
2016-04-15
The potential use of feature wavelengths for predicting drip loss in grass carp fish, as affected by being frozen at -20°C for 24 h and thawed at 4°C for 1, 2, 4, and 6 days, was investigated. Hyperspectral images of frozen-thawed fish were obtained and their corresponding spectra were extracted. Least-squares support vector machine and multiple linear regression (MLR) models were established using five key wavelengths, selected by combining a genetic algorithm and successive projections algorithm, and this showed satisfactory performance in drip loss prediction. The MLR model with a determination coefficient of prediction (R(2)P) of 0.9258, and lower root mean square error estimated by a prediction (RMSEP) of 1.12%, was applied to transfer each pixel of the image and generate the distribution maps of exudation changes. The results confirmed that it is feasible to identify the feature wavelengths using variable selection methods and chemometric analysis for developing on-line multispectral imaging. Copyright © 2015 Elsevier Ltd. All rights reserved.
A regularized approach for geodesic-based semisupervised multimanifold learning.
Fan, Mingyu; Zhang, Xiaoqin; Lin, Zhouchen; Zhang, Zhongfei; Bao, Hujun
2014-05-01
Geodesic distance, as an essential measurement for data dissimilarity, has been successfully used in manifold learning. However, most geodesic distance-based manifold learning algorithms have two limitations when applied to classification: 1) class information is rarely used in computing the geodesic distances between data points on manifolds and 2) little attention has been paid to building an explicit dimension reduction mapping for extracting the discriminative information hidden in the geodesic distances. In this paper, we regard geodesic distance as a kind of kernel, which maps data from linearly inseparable space to linear separable distance space. In doing this, a new semisupervised manifold learning algorithm, namely regularized geodesic feature learning algorithm, is proposed. The method consists of three techniques: a semisupervised graph construction method, replacement of original data points with feature vectors which are built by geodesic distances, and a new semisupervised dimension reduction method for feature vectors. Experiments on the MNIST, USPS handwritten digit data sets, MIT CBCL face versus nonface data set, and an intelligent traffic data set show the effectiveness of the proposed algorithm.
Rezaee, Kh.; Azizi, E.; Haddadnia, J.
2016-01-01
Background Epilepsy is a severe disorder of the central nervous system that predisposes the person to recurrent seizures. Fifty million people worldwide suffer from epilepsy; after Alzheimer’s and stroke, it is the third widespread nervous disorder. Objective In this paper, an algorithm to detect the onset of epileptic seizures based on the analysis of brain electrical signals (EEG) has been proposed. 844 hours of EEG were recorded form 23 pediatric patients consecutively with 163 occurrences of seizures. Signals had been collected from Children’s Hospital Boston with a sampling frequency of 256 Hz through 18 channels in order to assess epilepsy surgery. By selecting effective features from seizure and non-seizure signals of each individual and putting them into two categories, the proposed algorithm detects the onset of seizures quickly and with high sensitivity. Method In this algorithm, L-sec epochs of signals are displayed in form of a third-order tensor in spatial, spectral and temporal spaces by applying wavelet transform. Then, after applying general tensor discriminant analysis (GTDA) on tensors and calculating mapping matrix, feature vectors are extracted. GTDA increases the sensitivity of the algorithm by storing data without deleting them. Finally, K-Nearest neighbors (KNN) is used to classify the selected features. Results The results of simulating algorithm on algorithm standard dataset shows that the algorithm is capable of detecting 98 percent of seizures with an average delay of 4.7 seconds and the average error rate detection of three errors in 24 hours. Conclusion Today, the lack of an automated system to detect or predict the seizure onset is strongly felt. PMID:27672628
NASA Astrophysics Data System (ADS)
Wang, G. H.; Wang, H. B.; Fan, W. F.; Liu, Y.; Chen, C.
2018-04-01
In view of the traditional change detection algorithm mainly depends on the spectral information image spot, failed to effectively mining and fusion of multi-image feature detection advantage, the article borrows the ideas of object oriented analysis proposed a multi feature fusion of remote sensing image change detection algorithm. First by the multi-scale segmentation of image objects based; then calculate the various objects of color histogram and linear gradient histogram; utilizes the color distance and edge line feature distance between EMD statistical operator in different periods of the object, using the adaptive weighted method, the color feature distance and edge in a straight line distance of combination is constructed object heterogeneity. Finally, the curvature histogram analysis image spot change detection results. The experimental results show that the method can fully fuse the color and edge line features, thus improving the accuracy of the change detection.
Indirect Correspondence-Based Robust Extrinsic Calibration of LiDAR and Camera
Sim, Sungdae; Sock, Juil; Kwak, Kiho
2016-01-01
LiDAR and cameras have been broadly utilized in computer vision and autonomous vehicle applications. However, in order to convert data between the local coordinate systems, we must estimate the rigid body transformation between the sensors. In this paper, we propose a robust extrinsic calibration algorithm that can be implemented easily and has small calibration error. The extrinsic calibration parameters are estimated by minimizing the distance between corresponding features projected onto the image plane. The features are edge and centerline features on a v-shaped calibration target. The proposed algorithm contributes two ways to improve the calibration accuracy. First, we use different weights to distance between a point and a line feature according to the correspondence accuracy of the features. Second, we apply a penalizing function to exclude the influence of outliers in the calibration datasets. Additionally, based on our robust calibration approach for a single LiDAR-camera pair, we introduce a joint calibration that estimates the extrinsic parameters of multiple sensors at once by minimizing one objective function with loop closing constraints. We conduct several experiments to evaluate the performance of our extrinsic calibration algorithm. The experimental results show that our calibration method has better performance than the other approaches. PMID:27338416
An iris recognition algorithm based on DCT and GLCM
NASA Astrophysics Data System (ADS)
Feng, G.; Wu, Ye-qing
2008-04-01
With the enlargement of mankind's activity range, the significance for person's status identity is becoming more and more important. So many different techniques for person's status identity were proposed for this practical usage. Conventional person's status identity methods like password and identification card are not always reliable. A wide variety of biometrics has been developed for this challenge. Among those biologic characteristics, iris pattern gains increasing attention for its stability, reliability, uniqueness, noninvasiveness and difficult to counterfeit. The distinct merits of the iris lead to its high reliability for personal identification. So the iris identification technique had become hot research point in the past several years. This paper presents an efficient algorithm for iris recognition using gray-level co-occurrence matrix(GLCM) and Discrete Cosine transform(DCT). To obtain more representative iris features, features from space and DCT transformation domain are extracted. Both GLCM and DCT are applied on the iris image to form the feature sequence in this paper. The combination of GLCM and DCT makes the iris feature more distinct. Upon GLCM and DCT the eigenvector of iris extracted, which reflects features of spatial transformation and frequency transformation. Experimental results show that the algorithm is effective and feasible with iris recognition.
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.
Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin
2017-04-01
As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depend on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the Chronic Kidney Disease, two essential types of feature selection methods namely, wrapper and filter approaches were chosen to reduce the dimension of Chronic Kidney Disease dataset. In wrapper approach, classifier subset evaluator with greedy stepwise search engine and wrapper subset evaluator with the Best First search engine were used. In filter approach, correlation feature selection subset evaluator with greedy stepwise search engine and filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier by using filtered subset evaluator with the Best First search engine feature selection method has higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to other selected methods.
Agile Multi-Scale Decompositions for Automatic Image Registration
NASA Technical Reports Server (NTRS)
Murphy, James M.; Leija, Omar Navarro; Le Moigne, Jacqueline
2016-01-01
In recent works, the first and third authors developed an automatic image registration algorithm based on a multiscale hybrid image decomposition with anisotropic shearlets and isotropic wavelets. This prototype showed strong performance, improving robustness over registration with wavelets alone. However, this method imposed a strict hierarchy on the order in which shearlet and wavelet features were used in the registration process, and also involved an unintegrated mixture of MATLAB and C code. In this paper, we introduce a more agile model for generating features, in which a flexible and user-guided mix of shearlet and wavelet features are computed. Compared to the previous prototype, this method introduces a flexibility to the order in which shearlet and wavelet features are used in the registration process. Moreover, the present algorithm is now fully coded in C, making it more efficient and portable than the MATLAB and C prototype. We demonstrate the versatility and computational efficiency of this approach by performing registration experiments with the fully-integrated C algorithm. In particular, meaningful timing studies can now be performed, to give a concrete analysis of the computational costs of the flexible feature extraction. Examples of synthetically warped and real multi-modal images are analyzed.
Learning relevant features of data with multi-scale tensor networks
NASA Astrophysics Data System (ADS)
Miles Stoudenmire, E.
2018-07-01
Inspired by coarse-graining approaches used in physics, we show how similar algorithms can be adapted for data. The resulting algorithms are based on layered tree tensor networks and scale linearly with both the dimension of the input and the training set size. Computing most of the layers with an unsupervised algorithm, then optimizing just the top layer for supervised classification of the MNIST and fashion MNIST data sets gives very good results. We also discuss mixing a prior guess for supervised weights together with an unsupervised representation of the data, yielding a smaller number of features nevertheless able to give good performance.
Common spaceborne multicomputer operating system and development environment
NASA Technical Reports Server (NTRS)
Craymer, L. G.; Lewis, B. F.; Hayes, P. J.; Jones, R. L.
1994-01-01
A preliminary technical specification for a multicomputer operating system is developed. The operating system is targeted for spaceborne flight missions and provides a broad range of real-time functionality, dynamic remote code-patching capability, and system fault tolerance and long-term survivability features. Dataflow concepts are used for representing application algorithms. Functional features are included to ensure real-time predictability for a class of algorithms which require data-driven execution on an iterative steady state basis. The development environment supports the development of algorithm code, design of control parameters, performance analysis, simulation of real-time dataflow applications, and compiling and downloading of the resulting application.
A Sustainable City Planning Algorithm Based on TLBO and Local Search
NASA Astrophysics Data System (ADS)
Zhang, Ke; Lin, Li; Huang, Xuanxuan; Liu, Yiming; Zhang, Yonggang
2017-09-01
Nowadays, how to design a city with more sustainable features has become a center problem in the field of social development, meanwhile it has provided a broad stage for the application of artificial intelligence theories and methods. Because the design of sustainable city is essentially a constraint optimization problem, the swarm intelligence algorithm of extensive research has become a natural candidate for solving the problem. TLBO (Teaching-Learning-Based Optimization) algorithm is a new swarm intelligence algorithm. Its inspiration comes from the “teaching” and “learning” behavior of teaching class in the life. The evolution of the population is realized by simulating the “teaching” of the teacher and the student “learning” from each other, with features of less parameters, efficient, simple thinking, easy to achieve and so on. It has been successfully applied to scheduling, planning, configuration and other fields, which achieved a good effect and has been paid more and more attention by artificial intelligence researchers. Based on the classical TLBO algorithm, we propose a TLBO_LS algorithm combined with local search. We design and implement the random generation algorithm and evaluation model of urban planning problem. The experiments on the small and medium-sized random generation problem showed that our proposed algorithm has obvious advantages over DE algorithm and classical TLBO algorithm in terms of convergence speed and solution quality.
Scalable Nearest Neighbor Algorithms for High Dimensional Data.
Muja, Marius; Lowe, David G
2014-11-01
For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.
NASA Astrophysics Data System (ADS)
Wisesty, Untari N.; Warastri, Riris S.; Puspitasari, Shinta Y.
2018-03-01
Cancer is one of the major causes of mordibility and mortality problems in the worldwide. Therefore, the need of a system that can analyze and identify a person suffering from a cancer by using microarray data derived from the patient’s Deoxyribonucleic Acid (DNA). But on microarray data has thousands of attributes, thus making the challenges in data processing. This is often referred to as the curse of dimensionality. Therefore, in this study built a system capable of detecting a patient whether contracted cancer or not. The algorithm used is Genetic Algorithm as feature selection and Momentum Backpropagation Neural Network as a classification method, with data used from the Kent Ridge Bio-medical Dataset. Based on system testing that has been done, the system can detect Leukemia and Colon Tumor with best accuracy equal to 98.33% for colon tumor data and 100% for leukimia data. Genetic Algorithm as feature selection algorithm can improve system accuracy, which is from 64.52% to 98.33% for colon tumor data and 65.28% to 100% for leukemia data, and the use of momentum parameters can accelerate the convergence of the system in the training process of Neural Network.
Processing LiDAR Data to Predict Natural Hazards
NASA Technical Reports Server (NTRS)
Fairweather, Ian; Crabtree, Robert; Hager, Stacey
2008-01-01
ELF-Base and ELF-Hazards (wherein 'ELF' signifies 'Extract LiDAR Features' and 'LiDAR' signifies 'light detection and ranging') are developmental software modules for processing remote-sensing LiDAR data to identify past natural hazards (principally, landslides) and predict future ones. ELF-Base processes raw LiDAR data, including LiDAR intensity data that are often ignored in other software, to create digital terrain models (DTMs) and digital feature models (DFMs) with sub-meter accuracy. ELF-Hazards fuses raw LiDAR data, data from multispectral and hyperspectral optical images, and DTMs and DFMs generated by ELF-Base to generate hazard risk maps. Advanced algorithms in these software modules include line-enhancement and edge-detection algorithms, surface-characterization algorithms, and algorithms that implement innovative data-fusion techniques. The line-extraction and edge-detection algorithms enable users to locate such features as faults and landslide headwall scarps. Also implemented in this software are improved methodologies for identification and mapping of past landslide events by use of (1) accurate, ELF-derived surface characterizations and (2) three LiDAR/optical-data-fusion techniques: post-classification data fusion, maximum-likelihood estimation modeling, and hierarchical within-class discrimination. This software is expected to enable faster, more accurate forecasting of natural hazards than has previously been possible.
Separation of pulsar signals from noise using supervised machine learning algorithms
NASA Astrophysics Data System (ADS)
Bethapudi, S.; Desai, S.
2018-04-01
We evaluate the performance of four different machine learning (ML) algorithms: an Artificial Neural Network Multi-Layer Perceptron (ANN MLP), Adaboost, Gradient Boosting Classifier (GBC), and XGBoost, for the separation of pulsars from radio frequency interference (RFI) and other sources of noise, using a dataset obtained from the post-processing of a pulsar search pipeline. This dataset was previously used for the cross-validation of the SPINN-based machine learning engine, obtained from the reprocessing of the HTRU-S survey data (Morello et al., 2014). We have used the Synthetic Minority Over-sampling Technique (SMOTE) to deal with high-class imbalance in the dataset. We report a variety of quality scores from all four of these algorithms on both the non-SMOTE and SMOTE datasets. For all the above ML methods, we report high accuracy and G-mean for both the non-SMOTE and SMOTE cases. We study the feature importances using Adaboost, GBC, and XGBoost and also from the minimum Redundancy Maximum Relevance approach to report algorithm-agnostic feature ranking. From these methods, we find that the signal to noise of the folded profile to be the best feature. We find that all the ML algorithms report FPRs about an order of magnitude lower than the corresponding FPRs obtained in Morello et al. (2014), for the same recall value.
Wen, Tingxi; Zhang, Zhongnan
2017-01-01
Abstract In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789
Wen, Tingxi; Zhang, Zhongnan
2017-05-01
In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.
Cross-indexing of binary SIFT codes for large-scale image search.
Liu, Zhen; Li, Houqiang; Zhang, Liyan; Zhou, Wengang; Tian, Qi
2014-05-01
In recent years, there has been growing interest in mapping visual features into compact binary codes for applications on large-scale image collections. Encoding high-dimensional data as compact binary codes reduces the memory cost for storage. Besides, it benefits the computational efficiency since the computation of similarity can be efficiently measured by Hamming distance. In this paper, we propose a novel flexible scale invariant feature transform (SIFT) binarization (FSB) algorithm for large-scale image search. The FSB algorithm explores the magnitude patterns of SIFT descriptor. It is unsupervised and the generated binary codes are demonstrated to be dispreserving. Besides, we propose a new searching strategy to find target features based on the cross-indexing in the binary SIFT space and original SIFT space. We evaluate our approach on two publicly released data sets. The experiments on large-scale partial duplicate image retrieval system demonstrate the effectiveness and efficiency of the proposed algorithm.
High-dimensional cluster analysis with the Masked EM Algorithm
Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.
2014-01-01
Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster member-ship of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694
Automated thematic mapping and change detection of ERTS-A images
NASA Technical Reports Server (NTRS)
Gramenopoulos, N. (Principal Investigator)
1975-01-01
The author has identified the following significant results. In the first part of the investigation, spatial and spectral features were developed which were employed to automatically recognize terrain features through a clustering algorithm. In this part of the investigation, the size of the cell which is the number of digital picture elements used for computing the spatial and spectral features was varied. It was determined that the accuracy of terrain recognition decreases slowly as the cell size is reduced and coincides with increased cluster diffuseness. It was also proven that a cell size of 17 x 17 pixels when used with the clustering algorithm results in high recognition rates for major terrain classes. ERTS-1 data from five diverse geographic regions of the United States were processed through the clustering algorithm with 17 x 17 pixel cells. Simple land use maps were produced and the average terrain recognition accuracy was 82 percent.
Discriminative region extraction and feature selection based on the combination of SURF and saliency
NASA Astrophysics Data System (ADS)
Deng, Li; Wang, Chunhong; Rao, Changhui
2011-08-01
The objective of this paper is to provide a possible optimization on salient region algorithm, which is extensively used in recognizing and learning object categories. Salient region algorithm owns the superiority of intra-class tolerance, global score of features and automatically prominent scale selection under certain range. However, the major limitation behaves on performance, and that is what we attempt to improve. By reducing the number of pixels involved in saliency calculation, it can be accelerated. We use interest points detected by fast-Hessian, the detector of SURF, as the candidate feature for saliency operation, rather than the whole set in image. This implementation is thereby called Saliency based Optimization over SURF (SOSU for short). Experiment shows that bringing in of such a fast detector significantly speeds up the algorithm. Meanwhile, Robustness of intra-class diversity ensures object recognition accuracy.
Natural image statistics and low-complexity feature selection.
Vasconcelos, Manuela; Vasconcelos, Nuno
2009-02-01
Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.
Improving permafrost distribution modelling using feature selection algorithms
NASA Astrophysics Data System (ADS)
Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail
2016-04-01
The availability of an increasing number of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional dataset is the number of input features (variables) involved. Application of ML classification algorithms to this large number of variables leads to the risk of overfitting, with the consequence of a poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps simplifying the amount of factors required and improves the knowledge on adopted features and their relation with the studied phenomenon. Moreover, taking away irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidences (geophysical and thermal data and rock glacier inventories) that serve as training permafrost data. Used FS algorithms informed about variables that appeared less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its overall operation. It operates by constructing a large collection of decorrelated classification trees, and then predicts the permafrost occurrence through a majority vote. With the so-called out-of-bag (OOB) error estimate, the classification of permafrost data can be validated as well as the contribution of each predictor can be assessed. The performances of compared permafrost distribution models (computed on independent testing sets) increased with the application of FS algorithms on the original dataset and irrelevant or redundant variables were removed. As a consequence, the process provided faster and more cost-effective predictors and a better understanding of the underlying structures residing in permafrost data. Our work demonstrates the usefulness of a feature selection step prior to applying a machine learning algorithm. In fact, permafrost predictors could be ranked not only based on their heuristic and subjective importance (expert knowledge), but also based on their statistical relevance in relation of the permafrost distribution.
On Feature Extraction from Large Scale Linear LiDAR Data
NASA Astrophysics Data System (ADS)
Acharjee, Partha Pratim
Airborne light detection and ranging (LiDAR) can generate co-registered elevation and intensity map over large terrain. The co-registered 3D map and intensity information can be used efficiently for different feature extraction application. In this dissertation, we developed two algorithms for feature extraction, and usages of features for practical applications. One of the developed algorithms can map still and flowing waterbody features, and another one can extract building feature and estimate solar potential on rooftops and facades. Remote sensing capabilities, distinguishing characteristics of laser returns from water surface and specific data collection procedures provide LiDAR data an edge in this application domain. Furthermore, water surface mapping solutions must work on extremely large datasets, from a thousand square miles, to hundreds of thousands of square miles. National and state-wide map generation/upgradation and hydro-flattening of LiDAR data for many other applications are two leading needs of water surface mapping. These call for as much automation as possible. Researchers have developed many semi-automated algorithms using multiple semi-automated tools and human interventions. This reported work describes a consolidated algorithm and toolbox developed for large scale, automated water surface mapping. Geometric features such as flatness of water surface, higher elevation change in water-land interface and, optical properties such as dropouts caused by specular reflection, bimodal intensity distributions were some of the linear LiDAR features exploited for water surface mapping. Large-scale data handling capabilities are incorporated by automated and intelligent windowing, by resolving boundary issues and integrating all results to a single output. This whole algorithm is developed as an ArcGIS toolbox using Python libraries. Testing and validation are performed on a large datasets to determine the effectiveness of the toolbox and results are presented. Significant power demand is located in urban areas, where, theoretically, a large amount of building surface area is also available for solar panel installation. Therefore, property owners and power generation companies can benefit from a citywide solar potential map, which can provide available estimated annual solar energy at a given location. An efficient solar potential measurement is a prerequisite for an effective solar energy system in an urban area. In addition, the solar potential calculation from rooftops and building facades could open up a wide variety of options for solar panel installations. However, complex urban scenes make it hard to estimate the solar potential, partly because of shadows cast by the buildings. LiDAR-based 3D city models could possibly be the right technology for solar potential mapping. Although, most of the current LiDAR-based local solar potential assessment algorithms mainly address rooftop potential calculation, whereas building facades can contribute a significant amount of viable surface area for solar panel installation. In this paper, we introduce a new algorithm to calculate solar potential of both rooftop and building facades. Solar potential received by the rooftops and facades over the year are also investigated in the test area.
Face verification system for Android mobile devices using histogram based features
NASA Astrophysics Data System (ADS)
Sato, Sho; Kobayashi, Kazuhiro; Chen, Qiu
2016-07-01
This paper proposes a face verification system that runs on Android mobile devices. In this system, facial image is captured by a built-in camera on the Android device firstly, and then face detection is implemented using Haar-like features and AdaBoost learning algorithm. The proposed system verify the detected face using histogram based features, which are generated by binary Vector Quantization (VQ) histogram using DCT coefficients in low frequency domains, as well as Improved Local Binary Pattern (Improved LBP) histogram in spatial domain. Verification results with different type of histogram based features are first obtained separately and then combined by weighted averaging. We evaluate our proposed algorithm by using publicly available ORL database and facial images captured by an Android tablet.
Shi, Jun; Liu, Xiao; Li, Yan; Zhang, Qi; Li, Yingjie; Ying, Shihui
2015-10-30
Electroencephalography (EEG) based sleep staging is commonly used in clinical routine. Feature extraction and representation plays a crucial role in EEG-based automatic classification of sleep stages. Sparse representation (SR) is a state-of-the-art unsupervised feature learning method suitable for EEG feature representation. Collaborative representation (CR) is an effective data coding method used as a classifier. Here we use CR as a data representation method to learn features from the EEG signal. A joint collaboration model is established to develop a multi-view learning algorithm, and generate joint CR (JCR) codes to fuse and represent multi-channel EEG signals. A two-stage multi-view learning-based sleep staging framework is then constructed, in which JCR and joint sparse representation (JSR) algorithms first fuse and learning the feature representation from multi-channel EEG signals, respectively. Multi-view JCR and JSR features are then integrated and sleep stages recognized by a multiple kernel extreme learning machine (MK-ELM) algorithm with grid search. The proposed two-stage multi-view learning algorithm achieves superior performance for sleep staging. With a K-means clustering based dictionary, the mean classification accuracy, sensitivity and specificity are 81.10 ± 0.15%, 71.42 ± 0.66% and 94.57 ± 0.07%, respectively; while with the dictionary learned using the submodular optimization method, they are 80.29 ± 0.22%, 71.26 ± 0.78% and 94.38 ± 0.10%, respectively. The two-stage multi-view learning based sleep staging framework outperforms all other classification methods compared in this work, while JCR is superior to JSR. The proposed multi-view learning framework has the potential for sleep staging based on multi-channel or multi-modality polysomnography signals. Copyright © 2015 Elsevier B.V. All rights reserved.
Beheshti, Iman; Demirel, Hasan; Matsuda, Hiroshi
2017-04-01
We developed a novel computer-aided diagnosis (CAD) system that uses feature-ranking and a genetic algorithm to analyze structural magnetic resonance imaging data; using this system, we can predict conversion of mild cognitive impairment (MCI)-to-Alzheimer's disease (AD) at between one and three years before clinical diagnosis. The CAD system was developed in four stages. First, we used a voxel-based morphometry technique to investigate global and local gray matter (GM) atrophy in an AD group compared with healthy controls (HCs). Regions with significant GM volume reduction were segmented as volumes of interest (VOIs). Second, these VOIs were used to extract voxel values from the respective atrophy regions in AD, HC, stable MCI (sMCI) and progressive MCI (pMCI) patient groups. The voxel values were then extracted into a feature vector. Third, at the feature-selection stage, all features were ranked according to their respective t-test scores and a genetic algorithm designed to find the optimal feature subset. The Fisher criterion was used as part of the objective function in the genetic algorithm. Finally, the classification was carried out using a support vector machine (SVM) with 10-fold cross validation. We evaluated the proposed automatic CAD system by applying it to baseline values from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset (160 AD, 162 HC, 65 sMCI and 71 pMCI subjects). The experimental results indicated that the proposed system is capable of distinguishing between sMCI and pMCI patients, and would be appropriate for practical use in a clinical setting. Copyright © 2017 Elsevier Ltd. All rights reserved.
Heidari, Morteza; Khuzani, Abolfazl Zargari; Hollingsworth, Alan B; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qiu, Yuchen; Liu, Hong; Zheng, Bin
2018-01-30
In order to automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, this study aims to investigate advantages of applying a machine learning approach embedded with a locally preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. A dataset involving negative mammograms acquired from 500 women was assembled. This dataset was divided into two age-matched classes of 250 high risk cases in which cancer was detected in the next subsequent mammography screening and 250 low risk cases, which remained negative. First, a computer-aided image processing scheme was applied to segment fibro-glandular tissue depicted on mammograms and initially compute 44 features related to the bilateral asymmetry of mammographic tissue density distribution between left and right breasts. Next, a multi-feature fusion based machine learning classifier was built to predict the risk of cancer detection in the next mammography screening. A leave-one-case-out (LOCO) cross-validation method was applied to train and test the machine learning classifier embedded with a LLP algorithm, which generated a new operational vector with 4 features using a maximal variance approach in each LOCO process. Results showed a 9.7% increase in risk prediction accuracy when using this LPP-embedded machine learning approach. An increased trend of adjusted odds ratios was also detected in which odds ratios increased from 1.0 to 11.2. This study demonstrated that applying the LPP algorithm effectively reduced feature dimensionality, and yielded higher and potentially more robust performance in predicting short-term breast cancer risk.
NASA Astrophysics Data System (ADS)
Heidari, Morteza; Zargari Khuzani, Abolfazl; Hollingsworth, Alan B.; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qiu, Yuchen; Liu, Hong; Zheng, Bin
2018-02-01
In order to automatically identify a set of effective mammographic image features and build an optimal breast cancer risk stratification model, this study aims to investigate advantages of applying a machine learning approach embedded with a locally preserving projection (LPP) based feature combination and regeneration algorithm to predict short-term breast cancer risk. A dataset involving negative mammograms acquired from 500 women was assembled. This dataset was divided into two age-matched classes of 250 high risk cases in which cancer was detected in the next subsequent mammography screening and 250 low risk cases, which remained negative. First, a computer-aided image processing scheme was applied to segment fibro-glandular tissue depicted on mammograms and initially compute 44 features related to the bilateral asymmetry of mammographic tissue density distribution between left and right breasts. Next, a multi-feature fusion based machine learning classifier was built to predict the risk of cancer detection in the next mammography screening. A leave-one-case-out (LOCO) cross-validation method was applied to train and test the machine learning classifier embedded with a LLP algorithm, which generated a new operational vector with 4 features using a maximal variance approach in each LOCO process. Results showed a 9.7% increase in risk prediction accuracy when using this LPP-embedded machine learning approach. An increased trend of adjusted odds ratios was also detected in which odds ratios increased from 1.0 to 11.2. This study demonstrated that applying the LPP algorithm effectively reduced feature dimensionality, and yielded higher and potentially more robust performance in predicting short-term breast cancer risk.
Joint Feature Selection and Classification for Multilabel Learning.
Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong
2018-03-01
Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.
Constrained dictionary learning and probabilistic hypergraph ranking for person re-identification
NASA Astrophysics Data System (ADS)
He, You; Wu, Song; Pu, Nan; Qian, Li; Xiao, Guoqiang
2018-04-01
Person re-identification is a fundamental and inevitable task in public security. In this paper, we propose a novel framework to improve the performance of this task. First, two different types of descriptors are extracted to represent a pedestrian: (1) appearance-based superpixel features, which are constituted mainly by conventional color features and extracted from the supepixel rather than a whole picture and (2) due to the limitation of discrimination of appearance features, the deep features extracted by feature fusion Network are also used. Second, a view invariant subspace is learned by dictionary learning constrained by the minimum negative sample (termed as DL-cMN) to reduce the noise in appearance-based superpixel feature domain. Then, we use deep features and sparse codes transformed by appearancebased features to establish the hyperedges respectively by k-nearest neighbor, rather than jointing different features simply. Finally, a final ranking is performed by probabilistic hypergraph ranking algorithm. Extensive experiments on three challenging datasets (VIPeR, PRID450S and CUHK01) demonstrate the advantages and effectiveness of our proposed algorithm.
Developing a radiomics framework for classifying non-small cell lung carcinoma subtypes
NASA Astrophysics Data System (ADS)
Yu, Dongdong; Zang, Yali; Dong, Di; Zhou, Mu; Gevaert, Olivier; Fang, Mengjie; Shi, Jingyun; Tian, Jie
2017-03-01
Patient-targeted treatment of non-small cell lung carcinoma (NSCLC) has been well documented according to the histologic subtypes over the past decade. In parallel, recent development of quantitative image biomarkers has recently been highlighted as important diagnostic tools to facilitate histological subtype classification. In this study, we present a radiomics analysis that classifies the adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). We extract 52-dimensional, CT-based features (7 statistical features and 45 image texture features) to represent each nodule. We evaluate our approach on a clinical dataset including 324 ADCs and 110 SqCCs patients with CT image scans. Classification of these features is performed with four different machine-learning classifiers including Support Vector Machines with Radial Basis Function kernel (RBF-SVM), Random forest (RF), K-nearest neighbor (KNN), and RUSBoost algorithms. To improve the classifiers' performance, optimal feature subset is selected from the original feature set by using an iterative forward inclusion and backward eliminating algorithm. Extensive experimental results demonstrate that radiomics features achieve encouraging classification results on both complete feature set (AUC=0.89) and optimal feature subset (AUC=0.91).
NASA Astrophysics Data System (ADS)
Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.
2017-02-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer
Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models tomore » curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nishizuka, N.; Kubo, Y.; Den, M.
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutralmore » lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.« less
Chen, Ke-ping; Xu, Geng; Wu, Shulin; Tang, Baopeng; Wang, Li; Zhang, Shu
2013-03-01
The present study was to assess the accuracy of automatic atrial and ventricular capture management (ACM and VCM) in determining pacing threshold and the performance of a second-generation automatic atrioventricular (AV) interval extension algorithm for reducing unnecessary ventricular pacing. A total of 398 patients at 32 centres who received an EnPulse dual-chamber pacing/dual-chamber adaptive rate pacing pacemaker (Medtronic, Minneapolis, MN, USA) were enrolled. The last amplitude thresholds as measured by ACM and VCM prior to the 6-month follow-up were compared with manually measured thresholds. Device diagnostics were used to evaluate ACM and VCM and the percentage of ventricular pacing with and without the AV extension algorithm. Modelling was performed to assess longevity gains relating to the use of automaticity features. Atrial and ventricular capture management performed accurately and reliably provided complete capture management in 97% of studied patients. The AV interval extension algorithm reduced the median per cent of right ventricular pacing in patients with sinus node dysfunction from 99.7 to 1.5% at 6-month follow-up and in patients with intermittent AV block (excluding persistent 3° AV block) from 99.9 to 50.2%. On the basis of validated modelling, estimated device longevity could potentially be extended by 1.9 years through the use of the capture management and AV interval extension features. Both ACM and VCM features reliably measured thresholds in nearly all patients; the AV extension algorithm significantly reduced ventricular pacing; and the use of pacemaker automaticity features potentially extends device longevity.
Liu, Chang; Wang, Guofeng; Xie, Qinglu; Zhang, Yanchao
2014-01-01
Effective fault classification of rolling element bearings provides an important basis for ensuring safe operation of rotating machinery. In this paper, a novel vibration sensor-based fault diagnosis method using an Ellipsoid-ARTMAP network (EAM) and a differential evolution (DE) algorithm is proposed. The original features are firstly extracted from vibration signals based on wavelet packet decomposition. Then, a minimum-redundancy maximum-relevancy algorithm is introduced to select the most prominent features so as to decrease feature dimensions. Finally, a DE-based EAM (DE-EAM) classifier is constructed to realize the fault diagnosis. The major characteristic of EAM is that the sample distribution of each category is realized by using a hyper-ellipsoid node and smoothing operation algorithm. Therefore, it can depict the decision boundary of disperse samples accurately and effectively avoid over-fitting phenomena. To optimize EAM network parameters, the DE algorithm is presented and two objectives, including both classification accuracy and nodes number, are simultaneously introduced as the fitness functions. Meanwhile, an exponential criterion is proposed to realize final selection of the optimal parameters. To prove the effectiveness of the proposed method, the vibration signals of four types of rolling element bearings under different loads were collected. Moreover, to improve the robustness of the classifier evaluation, a two-fold cross validation scheme is adopted and the order of feature samples is randomly arranged ten times within each fold. The results show that DE-EAM classifier can recognize the fault categories of the rolling element bearings reliably and accurately. PMID:24936949
Group sparse multiview patch alignment framework with view consistency for image classification.
Gui, Jie; Tao, Dacheng; Sun, Zhenan; Luo, Yong; You, Xinge; Tang, Yuan Yan
2014-07-01
No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of view. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting l(2,1)-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.
Automatically measuring the effect of strategy drawing features on pupils' handwriting and gender
NASA Astrophysics Data System (ADS)
Tabatabaey-Mashadi, Narges; Sudirman, Rubita; Guest, Richard M.; Khalid, Puspa Inayat
2013-12-01
Children's dynamic drawing strategies have been recently recognized as indicators of handwriting ability. However the influence of each feature in predicting handwriting is unknown due to lack of a measuring system. An automated measuring algorithm suitable for psychological assessment and non-subjective scoring is presented here. Using the weight vector and classification rate of a machine learning algorithm, an overall feature's effect is calculated which is comparable in different groupings. In this study thirteen previously detected drawing strategy features are measured for their influence on handwriting and gender. Features are extracted from drawing a triangle, Beery VMI and Bender Gestalt tangent patterns. Samples are related to 203 pupils (77 below average writers, and 101 female). The results show that the number of strokes in drawing the triangle pattern plays a major role in both groupings; however Left Tendency flag feature is affected by children's handwriting about 2.5 times greater than their gender. Experiments indicate that different forms of a feature sometimes show different influences.
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.
Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo
2016-01-25
DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used for consistency test of uniform distribution of DNA sequences, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative control, and an English book was used as positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidences that the algorithm is a promising tool for ab initio building a DNA dictionary. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.
Flexible methods for segmentation evaluation: Results from CT-based luggage screening
Karimi, Seemeen; Jiang, Xiaoqian; Cosman, Pamela; Martz, Harry
2017-01-01
BACKGROUND Imaging systems used in aviation security include segmentation algorithms in an automatic threat recognition pipeline. The segmentation algorithms evolve in response to emerging threats and changing performance requirements. Analysis of segmentation algorithms’ behavior, including the nature of errors and feature recovery, facilitates their development. However, evaluation methods from the literature provide limited characterization of the segmentation algorithms. OBJECTIVE To develop segmentation evaluation methods that measure systematic errors such as oversegmentation and undersegmentation, outliers, and overall errors. The methods must measure feature recovery and allow us to prioritize segments. METHODS We developed two complementary evaluation methods using statistical techniques and information theory. We also created a semi-automatic method to define ground truth from 3D images. We applied our methods to evaluate five segmentation algorithms developed for CT luggage screening. We validated our methods with synthetic problems and an observer evaluation. RESULTS Both methods selected the same best segmentation algorithm. Human evaluation confirmed the findings. The measurement of systematic errors and prioritization helped in understanding the behavior of each segmentation algorithm. CONCLUSIONS Our evaluation methods allow us to measure and explain the accuracy of segmentation algorithms. PMID:24699346
Parallel algorithm for determining motion vectors in ice floe images by matching edge features
NASA Technical Reports Server (NTRS)
Manohar, M.; Ramapriyan, H. K.; Strong, J. P.
1988-01-01
A parallel algorithm is described to determine motion vectors of ice floes using time sequences of images of the Arctic ocean obtained from the Synthetic Aperture Radar (SAR) instrument flown on-board the SEASAT spacecraft. Researchers describe a parallel algorithm which is implemented on the MPP for locating corresponding objects based on their translationally and rotationally invariant features. The algorithm first approximates the edges in the images by polygons or sets of connected straight-line segments. Each such edge structure is then reduced to a seed point. Associated with each seed point are the descriptions (lengths, orientations and sequence numbers) of the lines constituting the corresponding edge structure. A parallel matching algorithm is used to match packed arrays of such descriptions to identify corresponding seed points in the two images. The matching algorithm is designed such that fragmentation and merging of ice floes are taken into account by accepting partial matches. The technique has been demonstrated to work on synthetic test patterns and real image pairs from SEASAT in times ranging from .5 to 0.7 seconds for 128 x 128 images.
Research and implementation of finger-vein recognition algorithm
NASA Astrophysics Data System (ADS)
Pang, Zengyao; Yang, Jie; Chen, Yilei; Liu, Yin
2017-06-01
In finger vein image preprocessing, finger angle correction and ROI extraction are important parts of the system. In this paper, we propose an angle correction algorithm based on the centroid of the vein image, and extract the ROI region according to the bidirectional gray projection method. Inspired by the fact that features in those vein areas have similar appearance as valleys, a novel method was proposed to extract center and width of palm vein based on multi-directional gradients, which is easy-computing, quick and stable. On this basis, an encoding method was designed to determine the gray value distribution of texture image. This algorithm could effectively overcome the edge of the texture extraction error. Finally, the system was equipped with higher robustness and recognition accuracy by utilizing fuzzy threshold determination and global gray value matching algorithm. Experimental results on pairs of matched palm images show that, the proposed method has a EER with 3.21% extracts features at the speed of 27ms per image. It can be concluded that the proposed algorithm has obvious advantages in grain extraction efficiency, matching accuracy and algorithm efficiency.
Jiang, Wen Jun; Wittek, Peter; Zhao, Li; Gao, Shi Chao
2014-01-01
Photoplethysmogram (PPG) signals acquired by smartphone cameras are weaker than those acquired by dedicated pulse oximeters. Furthermore, the signals have lower sampling rates, have notches in the waveform and are more severely affected by baseline drift, leading to specific morphological characteristics. This paper introduces a new feature, the inverted triangular area, to address these specific characteristics. The new feature enables real-time adaptive waveform detection using an algorithm of linear time complexity. It can also recognize notches in the waveform and it is inherently robust to baseline drift. An implementation of the algorithm on Android is available for free download. We collected data from 24 volunteers and compared our algorithm in peak detection with two competing algorithms designed for PPG signals, Incremental-Merge Segmentation (IMS) and Adaptive Thresholding (ADT). A sensitivity of 98.0% and a positive predictive value of 98.8% were obtained, which were 7.7% higher than the IMS algorithm in sensitivity, and 8.3% higher than the ADT algorithm in positive predictive value. The experimental results confirmed the applicability of the proposed method.
NASA Astrophysics Data System (ADS)
Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin
2018-03-01
Both conventional and deep machine learning has been used to develop decision-support tools applied in medical imaging informatics. In order to take advantages of both conventional and deep learning approach, this study aims to investigate feasibility of applying a locally preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, initially computed 44 image features related to the bilateral mammographic tissue density asymmetry were extracted. Then, an LLP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighborhood (KNN) algorithm based machine learning classifier using the LPP-generated new feature vectors was developed to predict breast cancer risk. A testing dataset involving negative mammograms acquired from 500 women was used. Among them, 250 were positive and 250 remained negative in the next subsequent mammography screening. Applying to this dataset, LLP-generated feature vector reduced the number of features from 44 to 4. Using a leave-onecase-out validation method, area under ROC curve produced by the KNN classifier significantly increased from 0.62 to 0.68 (p < 0.05) and odds ratio was 4.60 with a 95% confidence interval of [3.16, 6.70]. Study demonstrated that this new LPP-based feature regeneration approach enabled to produce an optimal feature vector and yield improved performance in assisting to predict risk of women having breast cancer detected in the next subsequent mammography screening.
Gemovic, Branislava; Perovic, Vladimir; Glisic, Sanja; Veljkovic, Nevena
2013-01-01
There are more than 500 amino acid substitutions in each human genome, and bioinformatics tools irreplaceably contribute to determination of their functional effects. We have developed feature-based algorithm for the detection of mutations outside conserved functional domains (CFDs) and compared its classification efficacy with the most commonly used phylogeny-based tools, PolyPhen-2 and SIFT. The new algorithm is based on the informational spectrum method (ISM), a feature-based technique, and statistical analysis. Our dataset contained neutral polymorphisms and mutations associated with myeloid malignancies from epigenetic regulators ASXL1, DNMT3A, EZH2, and TET2. PolyPhen-2 and SIFT had significantly lower accuracies in predicting the effects of amino acid substitutions outside CFDs than expected, with especially low sensitivity. On the other hand, only ISM algorithm showed statistically significant classification of these sequences. It outperformed PolyPhen-2 and SIFT by 15% and 13%, respectively. These results suggest that feature-based methods, like ISM, are more suitable for the classification of amino acid substitutions outside CFDs than phylogeny-based tools.
Fuzzy Performance between Surface Fitting and Energy Distribution in Turbulence Runner
Liang, Zhongwei; Liu, Xiaochu; Ye, Bangyan; Brauwer, Richard Kars
2012-01-01
Because the application of surface fitting algorithms exerts a considerable fuzzy influence on the mathematical features of kinetic energy distribution, their relation mechanism in different external conditional parameters must be quantitatively analyzed. Through determining the kinetic energy value of each selected representative position coordinate point by calculating kinetic energy parameters, several typical algorithms of complicated surface fitting are applied for constructing microkinetic energy distribution surface models in the objective turbulence runner with those obtained kinetic energy values. On the base of calculating the newly proposed mathematical features, we construct fuzzy evaluation data sequence and present a new three-dimensional fuzzy quantitative evaluation method; then the value change tendencies of kinetic energy distribution surface features can be clearly quantified, and the fuzzy performance mechanism discipline between the performance results of surface fitting algorithms, the spatial features of turbulence kinetic energy distribution surface, and their respective environmental parameter conditions can be quantitatively analyzed in detail, which results in the acquirement of final conclusions concerning the inherent turbulence kinetic energy distribution performance mechanism and its mathematical relation. A further turbulence energy quantitative study can be ensured. PMID:23213287
An algorithm of adaptive scale object tracking in occlusion
NASA Astrophysics Data System (ADS)
Zhao, Congmei
2017-05-01
Although the correlation filter-based trackers achieve the competitive results both on accuracy and robustness, there are still some problems in handling scale variations, object occlusion, fast motions and so on. In this paper, a multi-scale kernel correlation filter algorithm based on random fern detector was proposed. The tracking task was decomposed into the target scale estimation and the translation estimation. At the same time, the Color Names features and HOG features were fused in response level to further improve the overall tracking performance of the algorithm. In addition, an online random fern classifier was trained to re-obtain the target after the target was lost. By comparing with some algorithms such as KCF, DSST, TLD, MIL, CT and CSK, experimental results show that the proposed approach could estimate the object state accurately and handle the object occlusion effectively.
A fast and high performance multiple data integration algorithm for identifying human disease genes
2015-01-01
Background Integrating multiple data sources is indispensable in improving disease gene identification. It is not only due to the fact that disease genes associated with similar genetic diseases tend to lie close with each other in various biological networks, but also due to the fact that gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performances and the computational time still should be further improved. Results In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. A posterior probability of each candidate gene associated with individual diseases is calculated by using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. Conclusions The proposed algorithm is not only generated predictions with high AUC scores, but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 by using F2 as feature vectors. The average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment takes only about 12.54 seconds. It is better than many existing algorithms. PMID:26399620
A fuzzy clustering algorithm to detect planar and quadric shapes
NASA Technical Reports Server (NTRS)
Krishnapuram, Raghu; Frigui, Hichem; Nasraoui, Olfa
1992-01-01
In this paper, we introduce a new fuzzy clustering algorithm to detect an unknown number of planar and quadric shapes in noisy data. The proposed algorithm is computationally and implementationally simple, and it overcomes many of the drawbacks of the existing algorithms that have been proposed for similar tasks. Since the clustering is performed in the original image space, and since no features need to be computed, this approach is particularly suited for sparse data. The algorithm may also be used in pattern recognition applications.
NASA Technical Reports Server (NTRS)
Chang, H.
1976-01-01
A computer program using Lemke, Salkin and Spielberg's Set Covering Algorithm (SCA) to optimize a traffic model problem in the Scheduling Algorithm for Mission Planning and Logistics Evaluation (SAMPLE) was documented. SCA forms a submodule of SAMPLE and provides for input and output, subroutines, and an interactive feature for performing the optimization and arranging the results in a readily understandable form for output.
NASA Astrophysics Data System (ADS)
Potlov, A. Yu.; Frolov, S. V.; Proskurin, S. G.
2018-04-01
High-quality OCT structural images reconstruction algorithm for endoscopic optical coherence tomography of biological tissue is described. The key features of the presented algorithm are: (1) raster scanning and averaging of adjacent Ascans and pixels; (2) speckle level minimization. The described algorithm can be used in the gastroenterology, urology, gynecology, otorhinolaryngology for mucous membranes and skin diagnostics in vivo and in situ.
A fast, robust algorithm for power line interference cancellation in neural recording.
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-04-01
Power line interference may severely corrupt neural recordings at 50/60 Hz and harmonic frequencies. The interference is usually non-stationary and can vary in frequency, amplitude and phase. To retrieve the gamma-band oscillations at the contaminated frequencies, it is desired to remove the interference without compromising the actual neural signals at the interference frequency bands. In this paper, we present a robust and computationally efficient algorithm for removing power line interference from neural recordings. The algorithm includes four steps. First, an adaptive notch filter is used to estimate the fundamental frequency of the interference. Subsequently, based on the estimated frequency, harmonics are generated by using discrete-time oscillators, and then the amplitude and phase of each harmonic are estimated by using a modified recursive least squares algorithm. Finally, the estimated interference is subtracted from the recorded data. The algorithm does not require any reference signal, and can track the frequency, phase and amplitude of each harmonic. When benchmarked with other popular approaches, our algorithm performs better in terms of noise immunity, convergence speed and output signal-to-noise ratio (SNR). While minimally affecting the signal bands of interest, the algorithm consistently yields fast convergence (<100 ms) and substantial interference rejection (output SNR >30 dB) in different conditions of interference strengths (input SNR from -30 to 30 dB), power line frequencies (45-65 Hz) and phase and amplitude drifts. In addition, the algorithm features a straightforward parameter adjustment since the parameters are independent of the input SNR, input signal power and the sampling rate. A hardware prototype was fabricated in a 65 nm CMOS process and tested. Software implementation of the algorithm has been made available for open access at https://github.com/mrezak/removePLI. The proposed algorithm features a highly robust operation, fast adaptation to interference variations, significant SNR improvement, low computational complexity and memory requirement and straightforward parameter adjustment. These features render the algorithm suitable for wearable and implantable sensor applications, where reliable and real-time cancellation of the interference is desired.
A fast, robust algorithm for power line interference cancellation in neural recording
NASA Astrophysics Data System (ADS)
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-04-01
Objective. Power line interference may severely corrupt neural recordings at 50/60 Hz and harmonic frequencies. The interference is usually non-stationary and can vary in frequency, amplitude and phase. To retrieve the gamma-band oscillations at the contaminated frequencies, it is desired to remove the interference without compromising the actual neural signals at the interference frequency bands. In this paper, we present a robust and computationally efficient algorithm for removing power line interference from neural recordings. Approach. The algorithm includes four steps. First, an adaptive notch filter is used to estimate the fundamental frequency of the interference. Subsequently, based on the estimated frequency, harmonics are generated by using discrete-time oscillators, and then the amplitude and phase of each harmonic are estimated by using a modified recursive least squares algorithm. Finally, the estimated interference is subtracted from the recorded data. Main results. The algorithm does not require any reference signal, and can track the frequency, phase and amplitude of each harmonic. When benchmarked with other popular approaches, our algorithm performs better in terms of noise immunity, convergence speed and output signal-to-noise ratio (SNR). While minimally affecting the signal bands of interest, the algorithm consistently yields fast convergence (<100 ms) and substantial interference rejection (output SNR >30 dB) in different conditions of interference strengths (input SNR from -30 to 30 dB), power line frequencies (45-65 Hz) and phase and amplitude drifts. In addition, the algorithm features a straightforward parameter adjustment since the parameters are independent of the input SNR, input signal power and the sampling rate. A hardware prototype was fabricated in a 65 nm CMOS process and tested. Software implementation of the algorithm has been made available for open access at https://github.com/mrezak/removePLI. Significance. The proposed algorithm features a highly robust operation, fast adaptation to interference variations, significant SNR improvement, low computational complexity and memory requirement and straightforward parameter adjustment. These features render the algorithm suitable for wearable and implantable sensor applications, where reliable and real-time cancellation of the interference is desired.
O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John BO
2008-01-01
Background We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024–1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581–590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Results Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6°C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, ε of 0.21) and an RMSE of 45.1°C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3°C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5°C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. Conclusion With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors. PMID:18959785
A post-processing algorithm for time domain pitch trackers
NASA Astrophysics Data System (ADS)
Specker, P.
1983-01-01
This paper describes a powerful post-processing algorithm for time-domain pitch trackers. On two successive passes, the post-processing algorithm eliminates errors produced during a first pass by a time-domain pitch tracker. During the second pass, incorrect pitch values are detected as outliers by computing the distribution of values over a sliding 80 msec window. During the third pass (based on artificial intelligence techniques), remaining pitch pulses are used as anchor points to reconstruct the pitch train from the original waveform. The algorithm produced a decrease in the error rate from 21% obtained with the original time domain pitch tracker to 2% for isolated words and sentences produced in an office environment by 3 male and 3 female talkers. In a noisy computer room errors decreased from 52% to 2.9% for the same stimuli produced by 2 male talkers. The algorithm is efficient, accurate, and resistant to noise. The fundamental frequency micro-structure is tracked sufficiently well to be used in extracting phonetic features in a feature-based recognition system.
Historical feature pattern extraction based network attack situation sensing algorithm.
Zeng, Yong; Liu, Dacheng; Lei, Zhou
2014-01-01
The situation sequence contains a series of complicated and multivariate random trends, which are very sudden, uncertain, and difficult to recognize and describe its principle by traditional algorithms. To solve the above questions, estimating parameters of super long situation sequence is essential, but very difficult, so this paper proposes a situation prediction method based on historical feature pattern extraction (HFPE). First, HFPE algorithm seeks similar indications from the history situation sequence recorded and weighs the link intensity between occurred indication and subsequent effect. Then it calculates the probability that a certain effect reappears according to the current indication and makes a prediction after weighting. Meanwhile, HFPE method gives an evolution algorithm to derive the prediction deviation from the views of pattern and accuracy. This algorithm can continuously promote the adaptability of HFPE through gradual fine-tuning. The method preserves the rules in sequence at its best, does not need data preprocessing, and can track and adapt to the variation of situation sequence continuously.
Historical Feature Pattern Extraction Based Network Attack Situation Sensing Algorithm
Zeng, Yong; Liu, Dacheng; Lei, Zhou
2014-01-01
The situation sequence contains a series of complicated and multivariate random trends, which are very sudden, uncertain, and difficult to recognize and describe its principle by traditional algorithms. To solve the above questions, estimating parameters of super long situation sequence is essential, but very difficult, so this paper proposes a situation prediction method based on historical feature pattern extraction (HFPE). First, HFPE algorithm seeks similar indications from the history situation sequence recorded and weighs the link intensity between occurred indication and subsequent effect. Then it calculates the probability that a certain effect reappears according to the current indication and makes a prediction after weighting. Meanwhile, HFPE method gives an evolution algorithm to derive the prediction deviation from the views of pattern and accuracy. This algorithm can continuously promote the adaptability of HFPE through gradual fine-tuning. The method preserves the rules in sequence at its best, does not need data preprocessing, and can track and adapt to the variation of situation sequence continuously. PMID:24892054
NASA Astrophysics Data System (ADS)
Lee, Feifei; Kotani, Koji; Chen, Qiu; Ohmi, Tadahiro
2010-02-01
In this paper, a fast search algorithm for MPEG-4 video clips from video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram is utilized as the feature vector of VOP (video object plane), which had been reliably applied to human face recognition previously. Instead of fully decompressed video sequence, partially decoded data, namely DC sequence of the video object are extracted from the video sequence. Combined with active search, a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated by total 15 hours of video contained of TV programs such as drama, talk, news, etc. to search for given 200 MPEG-4 video clips which each length is 15 seconds. Experimental results show the proposed algorithm can detect the similar video clip in merely 80ms, and Equal Error Rate (ERR) of 2 % in drama and news categories are achieved, which are more accurately and robust than conventional fast video search algorithm.
A new clustering algorithm applicable to multispectral and polarimetric SAR images
NASA Technical Reports Server (NTRS)
Wong, Yiu-Fai; Posner, Edward C.
1993-01-01
We describe an application of a scale-space clustering algorithm to the classification of a multispectral and polarimetric SAR image of an agricultural site. After the initial polarimetric and radiometric calibration and noise cancellation, we extracted a 12-dimensional feature vector for each pixel from the scattering matrix. The clustering algorithm was able to partition a set of unlabeled feature vectors from 13 selected sites, each site corresponding to a distinct crop, into 13 clusters without any supervision. The cluster parameters were then used to classify the whole image. The classification map is much less noisy and more accurate than those obtained by hierarchical rules. Starting with every point as a cluster, the algorithm works by melting the system to produce a tree of clusters in the scale space. It can cluster data in any multidimensional space and is insensitive to variability in cluster densities, sizes and ellipsoidal shapes. This algorithm, more powerful than existing ones, may be useful for remote sensing for land use.
A Hybrid Swarm Intelligence Algorithm for Intrusion Detection Using Significant Features.
Amudha, P; Karthik, S; Sivakumari, S
2015-01-01
Intrusion detection has become a main part of network security due to the huge number of attacks which affects the computers. This is due to the extensive growth of internet connectivity and accessibility to information systems worldwide. To deal with this problem, in this paper a hybrid algorithm is proposed to integrate Modified Artificial Bee Colony (MABC) with Enhanced Particle Swarm Optimization (EPSO) to predict the intrusion detection problem. The algorithms are combined together to find out better optimization results and the classification accuracies are obtained by 10-fold cross-validation method. The purpose of this paper is to select the most relevant features that can represent the pattern of the network traffic and test its effect on the success of the proposed hybrid classification algorithm. To investigate the performance of the proposed method, intrusion detection KDDCup'99 benchmark dataset from the UCI Machine Learning repository is used. The performance of the proposed method is compared with the other machine learning algorithms and found to be significantly different.
A Hybrid Swarm Intelligence Algorithm for Intrusion Detection Using Significant Features
Amudha, P.; Karthik, S.; Sivakumari, S.
2015-01-01
Intrusion detection has become a main part of network security due to the huge number of attacks which affects the computers. This is due to the extensive growth of internet connectivity and accessibility to information systems worldwide. To deal with this problem, in this paper a hybrid algorithm is proposed to integrate Modified Artificial Bee Colony (MABC) with Enhanced Particle Swarm Optimization (EPSO) to predict the intrusion detection problem. The algorithms are combined together to find out better optimization results and the classification accuracies are obtained by 10-fold cross-validation method. The purpose of this paper is to select the most relevant features that can represent the pattern of the network traffic and test its effect on the success of the proposed hybrid classification algorithm. To investigate the performance of the proposed method, intrusion detection KDDCup'99 benchmark dataset from the UCI Machine Learning repository is used. The performance of the proposed method is compared with the other machine learning algorithms and found to be significantly different. PMID:26221625
1987-08-01
techniques for routing and testing the rout- ability of designs. The design model is ill- suited for the developement of routing algorithms, but the...circular ordering of ca- bles at a feature endpoint. The arrows de - pict the circular ordering of cables at feature ’ 3 cables endpoints p and q. There can...Figure le -1, whose only proper realizations have size fQ(n 2 ). From a practical standpoint, however, the sketch algorithms do not seem as good. Most
Classifying Imbalanced Data Streams via Dynamic Feature Group Weighting with Importance Sampling.
Wu, Ke; Edwards, Andrea; Fan, Wei; Gao, Jing; Zhang, Kun
2014-04-01
Data stream classification and imbalanced data learning are two important areas of data mining research. Each has been well studied to date with many interesting algorithms developed. However, only a few approaches reported in literature address the intersection of these two fields due to their complex interplay. In this work, we proposed an importance sampling driven, dynamic feature group weighting framework (DFGW-IS) for classifying data streams of imbalanced distribution. Two components are tightly incorporated into the proposed approach to address the intrinsic characteristics of concept-drifting, imbalanced streaming data. Specifically, the ever-evolving concepts are tackled by a weighted ensemble trained on a set of feature groups with each sub-classifier (i.e. a single classifier or an ensemble) weighed by its discriminative power and stable level. The un-even class distribution, on the other hand, is typically battled by the sub-classifier built in a specific feature group with the underlying distribution rebalanced by the importance sampling technique. We derived the theoretical upper bound for the generalization error of the proposed algorithm. We also studied the empirical performance of our method on a set of benchmark synthetic and real world data, and significant improvement has been achieved over the competing algorithms in terms of standard evaluation metrics and parallel running time. Algorithm implementations and datasets are available upon request.
Ozcift, Akin; Gulten, Arif
2011-12-01
Improving accuracies of machine learning algorithms is vital in designing high performance computer-aided diagnosis (CADx) systems. Researches have shown that a base classifier performance might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performances using Parkinson's, diabetes and heart diseases from literature. While making experiments, first the feature dimension of three datasets is reduced using correlation based feature selection (CFS) algorithm. Second, classification performances of 30 machine learning algorithms are calculated for three datasets. Third, 30 classifier ensembles are constructed based on RF algorithm to assess performances of respective classifiers with the same disease data. All the experiments are carried out with leave-one-out validation strategy and the performances of the 60 algorithms are evaluated using three metrics; classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers succeeded 72.15%, 77.52% and 84.43% average accuracies for diabetes, heart and Parkinson's datasets, respectively. As for RF classifier ensembles, they produced average accuracies of 74.47%, 80.49% and 87.13% for respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve accuracy of miscellaneous machine learning algorithms to design advanced CADx systems. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Synthetic aperture radar target detection, feature extraction, and image formation techniques
NASA Technical Reports Server (NTRS)
Li, Jian
1994-01-01
This report presents new algorithms for target detection, feature extraction, and image formation with the synthetic aperture radar (SAR) technology. For target detection, we consider target detection with SAR and coherent subtraction. We also study how the image false alarm rates are related to the target template false alarm rates when target templates are used for target detection. For feature extraction from SAR images, we present a computationally efficient eigenstructure-based 2D-MODE algorithm for two-dimensional frequency estimation. For SAR image formation, we present a robust parametric data model for estimating high resolution range signatures of radar targets and for forming high resolution SAR images.
NASA Astrophysics Data System (ADS)
Cruz-Roa, Angel; Xu, Jun; Madabhushi, Anant
2015-01-01
Nuclear architecture or the spatial arrangement of individual cancer nuclei on histopathology images has been shown to be associated with different grades and differential risk for a number of solid tumors such as breast, prostate, and oropharyngeal. Graph-based representations of individual nuclei (nuclei representing the graph nodes) allows for mining of quantitative metrics to describe tumor morphology. These graph features can be broadly categorized into global and local depending on the type of graph construction method. While a number of local graph (e.g. Cell Cluster Graphs) and global graph (e.g. Voronoi, Delaunay Triangulation, Minimum Spanning Tree) features have been shown to associated with cancer grade, risk, and outcome for different cancer types, the sensitivity of the preceding segmentation algorithms in identifying individual nuclei can have a significant bearing on the discriminability of the resultant features. This therefore begs the question as to which features while being discriminative of cancer grade and aggressiveness are also the most resilient to the segmentation errors. These properties are particularly desirable in the context of digital pathology images, where the method of slide preparation, staining, and type of nuclear segmentation algorithm employed can all dramatically affect the quality of the nuclear graphs and corresponding features. In this paper we evaluated the trade off between discriminability and stability of both global and local graph-based features in conjunction with a few different segmentation algorithms and in the context of two different histopathology image datasets of breast cancer from whole-slide images (WSI) and tissue microarrays (TMA). Specifically in this paper we investigate a few different performance measures including stability, discriminability and stability vs discriminability trade off, all of which are based on p-values from the Kruskal-Wallis one-way analysis of variance for local and global graph features. Apart from identifying the set of local and global features that satisfied the trade off between stability and discriminability, our most interesting finding was that a simple segmentation method was sufficient to identify the most discriminant features for invasive tumour detection in TMAs, whereas for tumour grading in WSI, the graph based features were more sensitive to the accuracy of the segmentation algorithm employed.
Note: A pure-sampling quantum Monte Carlo algorithm with independent Metropolis.
Vrbik, Jan; Ospadov, Egor; Rothstein, Stuart M
2016-07-14
Recently, Ospadov and Rothstein published a pure-sampling quantum Monte Carlo algorithm (PSQMC) that features an auxiliary Path Z that connects the midpoints of the current and proposed Paths X and Y, respectively. When sufficiently long, Path Z provides statistical independence of Paths X and Y. Under those conditions, the Metropolis decision used in PSQMC is done without any approximation, i.e., not requiring microscopic reversibility and without having to introduce any G(x → x'; τ) factors into its decision function. This is a unique feature that contrasts with all competing reptation algorithms in the literature. An example illustrates that dependence of Paths X and Y has adverse consequences for pure sampling.
Iris recognition using image moments and k-means algorithm.
Khan, Yaser Daanial; Khan, Sher Afzal; Ahmad, Farooq; Islam, Saeed
2014-01-01
This paper presents a biometric technique for identification of a person using the iris image. The iris is first segmented from the acquired image of an eye using an edge detection algorithm. The disk shaped area of the iris is transformed into a rectangular form. Described moments are extracted from the grayscale image which yields a feature vector containing scale, rotation, and translation invariant moments. Images are clustered using the k-means algorithm and centroids for each cluster are computed. An arbitrary image is assumed to belong to the cluster whose centroid is the nearest to the feature vector in terms of Euclidean distance computed. The described model exhibits an accuracy of 98.5%.
Iris Recognition Using Image Moments and k-Means Algorithm
Khan, Yaser Daanial; Khan, Sher Afzal; Ahmad, Farooq; Islam, Saeed
2014-01-01
This paper presents a biometric technique for identification of a person using the iris image. The iris is first segmented from the acquired image of an eye using an edge detection algorithm. The disk shaped area of the iris is transformed into a rectangular form. Described moments are extracted from the grayscale image which yields a feature vector containing scale, rotation, and translation invariant moments. Images are clustered using the k-means algorithm and centroids for each cluster are computed. An arbitrary image is assumed to belong to the cluster whose centroid is the nearest to the feature vector in terms of Euclidean distance computed. The described model exhibits an accuracy of 98.5%. PMID:24977221
Artificial intelligence tools for pattern recognition
NASA Astrophysics Data System (ADS)
Acevedo, Elena; Acevedo, Antonio; Felipe, Federico; Avilés, Pedro
2017-06-01
In this work, we present a system for pattern recognition that combines the power of genetic algorithms for solving problems and the efficiency of the morphological associative memories. We use a set of 48 tire prints divided into 8 brands of tires. The images have dimensions of 200 x 200 pixels. We applied Hough transform to obtain lines as main features. The number of lines obtained is 449. The genetic algorithm reduces the number of features to ten suitable lines that give thus the 100% of recognition. Morphological associative memories were used as evaluation function. The selection algorithms were Tournament and Roulette wheel. For reproduction, we applied one-point, two-point and uniform crossover.
Approximate maximum likelihood decoding of block codes
NASA Technical Reports Server (NTRS)
Greenberger, H. J.
1979-01-01
Approximate maximum likelihood decoding algorithms, based upon selecting a small set of candidate code words with the aid of the estimated probability of error of each received symbol, can give performance close to optimum with a reasonable amount of computation. By combining the best features of various algorithms and taking care to perform each step as efficiently as possible, a decoding scheme was developed which can decode codes which have better performance than those presently in use and yet not require an unreasonable amount of computation. The discussion of the details and tradeoffs of presently known efficient optimum and near optimum decoding algorithms leads, naturally, to the one which embodies the best features of all of them.
Note: A pure-sampling quantum Monte Carlo algorithm with independent Metropolis
NASA Astrophysics Data System (ADS)
Vrbik, Jan; Ospadov, Egor; Rothstein, Stuart M.
2016-07-01
Recently, Ospadov and Rothstein published a pure-sampling quantum Monte Carlo algorithm (PSQMC) that features an auxiliary Path Z that connects the midpoints of the current and proposed Paths X and Y, respectively. When sufficiently long, Path Z provides statistical independence of Paths X and Y. Under those conditions, the Metropolis decision used in PSQMC is done without any approximation, i.e., not requiring microscopic reversibility and without having to introduce any G(x → x'; τ) factors into its decision function. This is a unique feature that contrasts with all competing reptation algorithms in the literature. An example illustrates that dependence of Paths X and Y has adverse consequences for pure sampling.
Enhancement of ELDA Tracker Based on CNN Features and Adaptive Model Update.
Gao, Changxin; Shi, Huizhang; Yu, Jin-Gang; Sang, Nong
2016-04-15
Appearance representation and the observation model are the most important components in designing a robust visual tracking algorithm for video-based sensors. Additionally, the exemplar-based linear discriminant analysis (ELDA) model has shown good performance in object tracking. Based on that, we improve the ELDA tracking algorithm by deep convolutional neural network (CNN) features and adaptive model update. Deep CNN features have been successfully used in various computer vision tasks. Extracting CNN features on all of the candidate windows is time consuming. To address this problem, a two-step CNN feature extraction method is proposed by separately computing convolutional layers and fully-connected layers. Due to the strong discriminative ability of CNN features and the exemplar-based model, we update both object and background models to improve their adaptivity and to deal with the tradeoff between discriminative ability and adaptivity. An object updating method is proposed to select the "good" models (detectors), which are quite discriminative and uncorrelated to other selected models. Meanwhile, we build the background model as a Gaussian mixture model (GMM) to adapt to complex scenes, which is initialized offline and updated online. The proposed tracker is evaluated on a benchmark dataset of 50 video sequences with various challenges. It achieves the best overall performance among the compared state-of-the-art trackers, which demonstrates the effectiveness and robustness of our tracking algorithm.
Enhancement of ELDA Tracker Based on CNN Features and Adaptive Model Update
Gao, Changxin; Shi, Huizhang; Yu, Jin-Gang; Sang, Nong
2016-01-01
Appearance representation and the observation model are the most important components in designing a robust visual tracking algorithm for video-based sensors. Additionally, the exemplar-based linear discriminant analysis (ELDA) model has shown good performance in object tracking. Based on that, we improve the ELDA tracking algorithm by deep convolutional neural network (CNN) features and adaptive model update. Deep CNN features have been successfully used in various computer vision tasks. Extracting CNN features on all of the candidate windows is time consuming. To address this problem, a two-step CNN feature extraction method is proposed by separately computing convolutional layers and fully-connected layers. Due to the strong discriminative ability of CNN features and the exemplar-based model, we update both object and background models to improve their adaptivity and to deal with the tradeoff between discriminative ability and adaptivity. An object updating method is proposed to select the “good” models (detectors), which are quite discriminative and uncorrelated to other selected models. Meanwhile, we build the background model as a Gaussian mixture model (GMM) to adapt to complex scenes, which is initialized offline and updated online. The proposed tracker is evaluated on a benchmark dataset of 50 video sequences with various challenges. It achieves the best overall performance among the compared state-of-the-art trackers, which demonstrates the effectiveness and robustness of our tracking algorithm. PMID:27092505
Face recognition via Gabor and convolutional neural network
NASA Astrophysics Data System (ADS)
Lu, Tongwei; Wu, Menglu; Lu, Tao
2018-04-01
In recent years, the powerful feature learning and classification ability of convolutional neural network have attracted widely attention. Compared with the deep learning, the traditional machine learning algorithm has a good explanatory which deep learning does not have. Thus, In this paper, we propose a method to extract the feature of the traditional algorithm as the input of convolution neural network. In order to reduce the complexity of the network, the kernel function of Gabor wavelet is used to extract the feature from different position, frequency and direction of target image. It is sensitive to edge of image which can provide good direction and scale selection. The extraction of the image from eight directions on a scale are as the input of network that we proposed. The network have the advantage of weight sharing and local connection and texture feature of the input image can reduce the influence of facial expression, gesture and illumination. At the same time, we introduced a layer which combined the results of the pooling and convolution can extract deeper features. The training network used the open source caffe framework which is beneficial to feature extraction. The experiment results of the proposed method proved that the network structure effectively overcame the barrier of illumination and had a good robustness as well as more accurate and rapid than the traditional algorithm.
NASA Astrophysics Data System (ADS)
Zhang, Ka; Sheng, Yehua; Gong, Zhijun; Ye, Chun; Li, Yongqiang; Liang, Cheng
2007-06-01
As an important sub-system in intelligent transportation system (ITS), the detection and recognition of traffic signs from mobile images is becoming one of the hot spots in the international research field of ITS. Considering the problem of traffic sign automatic detection in motion images, a new self-adaptive algorithm for traffic sign detection based on color and shape features is proposed in this paper. Firstly, global statistical color features of different images are computed based on statistics theory. Secondly, some self-adaptive thresholds and special segmentation rules for image segmentation are designed according to these global color features. Then, for red, yellow and blue traffic signs, the color image is segmented to three binary images by these thresholds and rules. Thirdly, if the number of white pixels in the segmented binary image exceeds the filtering threshold, the binary image should be further filtered. Fourthly, the method of gray-value projection is used to confirm top, bottom, left and right boundaries for candidate regions of traffic signs in the segmented binary image. Lastly, if the shape feature of candidate region satisfies the need of real traffic sign, this candidate region is confirmed as the detected traffic sign region. The new algorithm is applied to actual motion images of natural scenes taken by a CCD camera of the mobile photogrammetry system in Nanjing at different time. The experimental results show that the algorithm is not only simple, robust and more adaptive to natural scene images, but also reliable and high-speed on real traffic sign detection.
Selecting Power-Efficient Signal Features for a Low-Power Fall Detector.
Wang, Changhong; Redmond, Stephen J; Lu, Wei; Stevens, Michael C; Lord, Stephen R; Lovell, Nigel H
2017-11-01
Falls are a serious threat to the health of older people. A wearable fall detector can automatically detect the occurrence of a fall and alert a caregiver or an emergency response service so they may deliver immediate assistance, improving the chances of recovering from fall-related injuries. One constraint of such a wearable technology is its limited battery life. Thus, minimization of power consumption is an important design concern, all the while maintaining satisfactory accuracy of the fall detection algorithms implemented on the wearable device. This paper proposes an approach for selecting power-efficient signal features such that the minimum desirable fall detection accuracy is assured. Using data collected in simulated falls, simulated activities of daily living, and real free-living trials, all using young volunteers, the proposed approach selects four features from a set of ten commonly used features, providing a power saving of 75.3%, while limiting the error rate of a binary classification decision tree fall detection algorithm to 7.1%.Falls are a serious threat to the health of older people. A wearable fall detector can automatically detect the occurrence of a fall and alert a caregiver or an emergency response service so they may deliver immediate assistance, improving the chances of recovering from fall-related injuries. One constraint of such a wearable technology is its limited battery life. Thus, minimization of power consumption is an important design concern, all the while maintaining satisfactory accuracy of the fall detection algorithms implemented on the wearable device. This paper proposes an approach for selecting power-efficient signal features such that the minimum desirable fall detection accuracy is assured. Using data collected in simulated falls, simulated activities of daily living, and real free-living trials, all using young volunteers, the proposed approach selects four features from a set of ten commonly used features, providing a power saving of 75.3%, while limiting the error rate of a binary classification decision tree fall detection algorithm to 7.1%.
Hsieh, Chi-Hsuan; Chiu, Yu-Fang; Shen, Yi-Hsiang; Chu, Ta-Shun; Huang, Yuan-Hao
2016-02-01
This paper presents an ultra-wideband (UWB) impulse-radio radar signal processing platform used to analyze human respiratory features. Conventional radar systems used in human detection only analyze human respiration rates or the response of a target. However, additional respiratory signal information is available that has not been explored using radar detection. The authors previously proposed a modified raised cosine waveform (MRCW) respiration model and an iterative correlation search algorithm that could acquire additional respiratory features such as the inspiration and expiration speeds, respiration intensity, and respiration holding ratio. To realize real-time respiratory feature extraction by using the proposed UWB signal processing platform, this paper proposes a new four-segment linear waveform (FSLW) respiration model. This model offers a superior fit to the measured respiration signal compared with the MRCW model and decreases the computational complexity of feature extraction. In addition, an early-terminated iterative correlation search algorithm is presented, substantially decreasing the computational complexity and yielding negligible performance degradation. These extracted features can be considered the compressed signals used to decrease the amount of data storage required for use in long-term medical monitoring systems and can also be used in clinical diagnosis. The proposed respiratory feature extraction algorithm was designed and implemented using the proposed UWB radar signal processing platform including a radar front-end chip and an FPGA chip. The proposed radar system can detect human respiration rates at 0.1 to 1 Hz and facilitates the real-time analysis of the respiratory features of each respiration period.
Intelligent Fault Diagnosis of HVCB with Feature Space Optimization-Based Random Forest
Ma, Suliang; Wu, Jianwen; Wang, Yuhao; Jia, Bowen; Jiang, Yuan
2018-01-01
Mechanical faults of high-voltage circuit breakers (HVCBs) always happen over long-term operation, so extracting the fault features and identifying the fault type have become a key issue for ensuring the security and reliability of power supply. Based on wavelet packet decomposition technology and random forest algorithm, an effective identification system was developed in this paper. First, compared with the incomplete description of Shannon entropy, the wavelet packet time-frequency energy rate (WTFER) was adopted as the input vector for the classifier model in the feature selection procedure. Then, a random forest classifier was used to diagnose the HVCB fault, assess the importance of the feature variable and optimize the feature space. Finally, the approach was verified based on actual HVCB vibration signals by considering six typical fault classes. The comparative experiment results show that the classification accuracy of the proposed method with the origin feature space reached 93.33% and reached up to 95.56% with optimized input feature vector of classifier. This indicates that feature optimization procedure is successful, and the proposed diagnosis algorithm has higher efficiency and robustness than traditional methods. PMID:29659548
Dimensionality Reduction Through Classifier Ensembles
NASA Technical Reports Server (NTRS)
Oza, Nikunj C.; Tumer, Kagan; Norwig, Peter (Technical Monitor)
1999-01-01
In data mining, one often needs to analyze datasets with a very large number of attributes. Performing machine learning directly on such data sets is often impractical because of extensive run times, excessive complexity of the fitted model (often leading to overfitting), and the well-known "curse of dimensionality." In practice, to avoid such problems, feature selection and/or extraction are often used to reduce data dimensionality prior to the learning step. However, existing feature selection/extraction algorithms either evaluate features by their effectiveness across the entire data set or simply disregard class information altogether (e.g., principal component analysis). Furthermore, feature extraction algorithms such as principal components analysis create new features that are often meaningless to human users. In this article, we present input decimation, a method that provides "feature subsets" that are selected for their ability to discriminate among the classes. These features are subsequently used in ensembles of classifiers, yielding results superior to single classifiers, ensembles that use the full set of features, and ensembles based on principal component analysis on both real and synthetic datasets.
Contextual Multi-armed Bandits under Feature Uncertainty
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yun, Seyoung; Nam, Jun Hyun; Mo, Sangwoo
We study contextual multi-armed bandit problems under linear realizability on rewards and uncertainty (or noise) on features. For the case of identical noise on features across actions, we propose an algorithm, coined NLinRel, having O(T⁷/₈(log(dT)+K√d)) regret bound for T rounds, K actions, and d-dimensional feature vectors. Next, for the case of non-identical noise, we observe that popular linear hypotheses including NLinRel are impossible to achieve such sub-linear regret. Instead, under assumption of Gaussian feature vectors, we prove that a greedy algorithm has O(T²/₃√log d)regret bound with respect to the optimal linear hypothesis. Utilizing our theoretical understanding on the Gaussian case,more » we also design a practical variant of NLinRel, coined Universal-NLinRel, for arbitrary feature distributions. It first runs NLinRel for finding the ‘true’ coefficient vector using feature uncertainties and then adjust it to minimize its regret using the statistical feature information. We justify the performance of Universal-NLinRel on both synthetic and real-world datasets.« less
Conservative multizonal interface algorithm for the 3-D Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Klopfer, G. H.; Molvik, G. A.
1991-01-01
A conservative zonal interface algorithm using features of both structured and unstructured mesh CFD technology is presented. The flow solver within each of the zones is based on structured mesh CFD technology. The interface algorithm was implemented into two three-dimensional Navier-Stokes finite volume codes and was found to yield good results.
Pérez-Castillo, Yunierkis; Lazar, Cosmin; Taminau, Jonatan; Froeyen, Mathy; Cabrera-Pérez, Miguel Ángel; Nowé, Ann
2012-09-24
Computer-aided drug design has become an important component of the drug discovery process. Despite the advances in this field, there is not a unique modeling approach that can be successfully applied to solve the whole range of problems faced during QSAR modeling. Feature selection and ensemble modeling are active areas of research in ligand-based drug design. Here we introduce the GA(M)E-QSAR algorithm that combines the search and optimization capabilities of Genetic Algorithms with the simplicity of the Adaboost ensemble-based classification algorithm to solve binary classification problems. We also explore the usefulness of Meta-Ensembles trained with Adaboost and Voting schemes to further improve the accuracy, generalization, and robustness of the optimal Adaboost Single Ensemble derived from the Genetic Algorithm optimization. We evaluated the performance of our algorithm using five data sets from the literature and found that it is capable of yielding similar or better classification results to what has been reported for these data sets with a higher enrichment of active compounds relative to the whole actives subset when only the most active chemicals are considered. More important, we compared our methodology with state of the art feature selection and classification approaches and found that it can provide highly accurate, robust, and generalizable models. In the case of the Adaboost Ensembles derived from the Genetic Algorithm search, the final models are quite simple since they consist of a weighted sum of the output of single feature classifiers. Furthermore, the Adaboost scores can be used as ranking criterion to prioritize chemicals for synthesis and biological evaluation after virtual screening experiments.
Multi-Scale Peak and Trough Detection Optimised for Periodic and Quasi-Periodic Neuroscience Data.
Bishop, Steven M; Ercole, Ari
2018-01-01
The reliable detection of peaks and troughs in physiological signals is essential to many investigative techniques in medicine and computational biology. Analysis of the intracranial pressure (ICP) waveform is a particular challenge due to multi-scale features, a changing morphology over time and signal-to-noise limitations. Here we present an efficient peak and trough detection algorithm that extends the scalogram approach of Scholkmann et al., and results in greatly improved algorithm runtime performance. Our improved algorithm (modified Scholkmann) was developed and analysed in MATLAB R2015b. Synthesised waveforms (periodic, quasi-periodic and chirp sinusoids) were degraded with white Gaussian noise to achieve signal-to-noise ratios down to 5 dB and were used to compare the performance of the original Scholkmann and modified Scholkmann algorithms. The modified Scholkmann algorithm has false-positive (0%) and false-negative (0%) detection rates identical to the original Scholkmann when applied to our test suite. Actual compute time for a 200-run Monte Carlo simulation over a multicomponent noisy test signal was 40.96 ± 0.020 s (mean ± 95%CI) for the original Scholkmann and 1.81 ± 0.003 s (mean ± 95%CI) for the modified Scholkmann, demonstrating the expected improvement in runtime complexity from [Formula: see text] to [Formula: see text]. The accurate interpretation of waveform data to identify peaks and troughs is crucial in signal parameterisation, feature extraction and waveform identification tasks. Modification of a standard scalogram technique has produced a robust algorithm with linear computational complexity that is particularly suited to the challenges presented by large, noisy physiological datasets. The algorithm is optimised through a single parameter and can identify sub-waveform features with minimal additional overhead, and is easily adapted to run in real time on commodity hardware.
Cui, Lingli; Wu, Na; Wang, Wenjing; Kang, Chenhui
2014-01-01
This paper presents a new method for a composite dictionary matching pursuit algorithm, which is applied to vibration sensor signal feature extraction and fault diagnosis of a gearbox. Three advantages are highlighted in the new method. First, the composite dictionary in the algorithm has been changed from multi-atom matching to single-atom matching. Compared to non-composite dictionary single-atom matching, the original composite dictionary multi-atom matching pursuit (CD-MaMP) algorithm can achieve noise reduction in the reconstruction stage, but it cannot dramatically reduce the computational cost and improve the efficiency in the decomposition stage. Therefore, the optimized composite dictionary single-atom matching algorithm (CD-SaMP) is proposed. Second, the termination condition of iteration based on the attenuation coefficient is put forward to improve the sparsity and efficiency of the algorithm, which adjusts the parameters of the termination condition constantly in the process of decomposition to avoid noise. Third, composite dictionaries are enriched with the modulation dictionary, which is one of the important structural characteristics of gear fault signals. Meanwhile, the termination condition of iteration settings, sub-feature dictionary selections and operation efficiency between CD-MaMP and CD-SaMP are discussed, aiming at gear simulation vibration signals with noise. The simulation sensor-based vibration signal results show that the termination condition of iteration based on the attenuation coefficient enhances decomposition sparsity greatly and achieves a good effect of noise reduction. Furthermore, the modulation dictionary achieves a better matching effect compared to the Fourier dictionary, and CD-SaMP has a great advantage of sparsity and efficiency compared with the CD-MaMP. The sensor-based vibration signals measured from practical engineering gearbox analyses have further shown that the CD-SaMP decomposition and reconstruction algorithm is feasible and effective. PMID:25207870
Cui, Lingli; Wu, Na; Wang, Wenjing; Kang, Chenhui
2014-09-09
This paper presents a new method for a composite dictionary matching pursuit algorithm, which is applied to vibration sensor signal feature extraction and fault diagnosis of a gearbox. Three advantages are highlighted in the new method. First, the composite dictionary in the algorithm has been changed from multi-atom matching to single-atom matching. Compared to non-composite dictionary single-atom matching, the original composite dictionary multi-atom matching pursuit (CD-MaMP) algorithm can achieve noise reduction in the reconstruction stage, but it cannot dramatically reduce the computational cost and improve the efficiency in the decomposition stage. Therefore, the optimized composite dictionary single-atom matching algorithm (CD-SaMP) is proposed. Second, the termination condition of iteration based on the attenuation coefficient is put forward to improve the sparsity and efficiency of the algorithm, which adjusts the parameters of the termination condition constantly in the process of decomposition to avoid noise. Third, composite dictionaries are enriched with the modulation dictionary, which is one of the important structural characteristics of gear fault signals. Meanwhile, the termination condition of iteration settings, sub-feature dictionary selections and operation efficiency between CD-MaMP and CD-SaMP are discussed, aiming at gear simulation vibration signals with noise. The simulation sensor-based vibration signal results show that the termination condition of iteration based on the attenuation coefficient enhances decomposition sparsity greatly and achieves a good effect of noise reduction. Furthermore, the modulation dictionary achieves a better matching effect compared to the Fourier dictionary, and CD-SaMP has a great advantage of sparsity and efficiency compared with the CD-MaMP. The sensor-based vibration signals measured from practical engineering gearbox analyses have further shown that the CD-SaMP decomposition and reconstruction algorithm is feasible and effective.
NASA Astrophysics Data System (ADS)
Han, Sheng; Xi, Shi-qiong; Geng, Wei-dong
2017-11-01
In order to solve the problem of low recognition rate of traditional feature extraction operators under low-resolution images, a novel algorithm of expression recognition is proposed, named central oblique average center-symmetric local binary pattern (CS-LBP) with adaptive threshold (ATCS-LBP). Firstly, the features of face images can be extracted by the proposed operator after pretreatment. Secondly, the obtained feature image is divided into blocks. Thirdly, the histogram of each block is computed independently and all histograms can be connected serially to create a final feature vector. Finally, expression classification is achieved by using support vector machine (SVM) classifier. Experimental results on Japanese female facial expression (JAFFE) database show that the proposed algorithm can achieve a recognition rate of 81.9% when the resolution is as low as 16×16, which is much better than that of the traditional feature extraction operators.
Li, Ke; Liu, Yi; Wang, Quanxin; Wu, Yalei; Song, Shimin; Sun, Yi; Liu, Tengchong; Wang, Jun; Li, Yang; Du, Shaoyi
2015-01-01
This paper proposes a novel multi-label classification method for resolving the spacecraft electrical characteristics problems which involve many unlabeled test data processing, high-dimensional features, long computing time and identification of slow rate. Firstly, both the fuzzy c-means (FCM) offline clustering and the principal component feature extraction algorithms are applied for the feature selection process. Secondly, the approximate weighted proximal support vector machine (WPSVM) online classification algorithms is used to reduce the feature dimension and further improve the rate of recognition for electrical characteristics spacecraft. Finally, the data capture contribution method by using thresholds is proposed to guarantee the validity and consistency of the data selection. The experimental results indicate that the method proposed can obtain better data features of the spacecraft electrical characteristics, improve the accuracy of identification and shorten the computing time effectively. PMID:26544549
Random forest feature selection approach for image segmentation
NASA Astrophysics Data System (ADS)
Lefkovits, László; Lefkovits, Szidónia; Emerich, Simina; Vaida, Mircea Florin
2017-03-01
In the field of image segmentation, discriminative models have shown promising performance. Generally, every such model begins with the extraction of numerous features from annotated images. Most authors create their discriminative model by using many features without using any selection criteria. A more reliable model can be built by using a framework that selects the important variables, from the point of view of the classification, and eliminates the unimportant once. In this article we present a framework for feature selection and data dimensionality reduction. The methodology is built around the random forest (RF) algorithm and its variable importance evaluation. In order to deal with datasets so large as to be practically unmanageable, we propose an algorithm based on RF that reduces the dimension of the database by eliminating irrelevant features. Furthermore, this framework is applied to optimize our discriminative model for brain tumor segmentation.
Combined Feature Based and Shape Based Visual Tracker for Robot Navigation
NASA Technical Reports Server (NTRS)
Deans, J.; Kunz, C.; Sargent, R.; Park, E.; Pedersen, L.
2005-01-01
We have developed a combined feature based and shape based visual tracking system designed to enable a planetary rover to visually track and servo to specific points chosen by a user with centimeter precision. The feature based tracker uses invariant feature detection and matching across a stereo pair, as well as matching pairs before and after robot movement in order to compute an incremental 6-DOF motion at each tracker update. This tracking method is subject to drift over time, which can be compensated by the shape based method. The shape based tracking method consists of 3D model registration, which recovers 6-DOF motion given sufficient shape and proper initialization. By integrating complementary algorithms, the combined tracker leverages the efficiency and robustness of feature based methods with the precision and accuracy of model registration. In this paper, we present the algorithms and their integration into a combined visual tracking system.
A Dimensionally Aligned Signal Projection for Classification of Unintended Radiated Emissions
Vann, Jason Michael; Karnowski, Thomas P.; Kerekes, Ryan; ...
2017-04-24
Characterization of unintended radiated emissions (URE) from electronic devices plays an important role in many research areas from electromagnetic interference to nonintrusive load monitoring to information system security. URE can provide insights for applications ranging from load disaggregation and energy efficiency to condition-based maintenance of equipment-based upon detected fault conditions. URE characterization often requires subject matter expertise to tailor transforms and feature extractors for the specific electrical devices of interest. We present a novel approach, named dimensionally aligned signal projection (DASP), for projecting aligned signal characteristics that are inherent to the physical implementation of many commercial electronic devices. These projectionsmore » minimize the need for an intimate understanding of the underlying physical circuitry and significantly reduce the number of features required for signal classification. We present three possible DASP algorithms that leverage frequency harmonics, modulation alignments, and frequency peak spacings, along with a two-dimensional image manipulation method for statistical feature extraction. To demonstrate the ability of DASP to generate relevant features from URE, we measured the conducted URE from 14 residential electronic devices using a 2 MS/s collection system. Furthermore, a linear discriminant analysis classifier was trained using DASP generated features and was blind tested resulting in a greater than 90% classification accuracy for each of the DASP algorithms and an accuracy of 99.1% when DASP features are used in combination. Furthermore, we show that a rank reduced feature set of the combined DASP algorithms provides a 98.9% classification accuracy with only three features and outperforms a set of spectral features in terms of general classification as well as applicability across a broad number of devices.« less
A Dimensionally Aligned Signal Projection for Classification of Unintended Radiated Emissions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vann, Jason Michael; Karnowski, Thomas P.; Kerekes, Ryan
Characterization of unintended radiated emissions (URE) from electronic devices plays an important role in many research areas from electromagnetic interference to nonintrusive load monitoring to information system security. URE can provide insights for applications ranging from load disaggregation and energy efficiency to condition-based maintenance of equipment-based upon detected fault conditions. URE characterization often requires subject matter expertise to tailor transforms and feature extractors for the specific electrical devices of interest. We present a novel approach, named dimensionally aligned signal projection (DASP), for projecting aligned signal characteristics that are inherent to the physical implementation of many commercial electronic devices. These projectionsmore » minimize the need for an intimate understanding of the underlying physical circuitry and significantly reduce the number of features required for signal classification. We present three possible DASP algorithms that leverage frequency harmonics, modulation alignments, and frequency peak spacings, along with a two-dimensional image manipulation method for statistical feature extraction. To demonstrate the ability of DASP to generate relevant features from URE, we measured the conducted URE from 14 residential electronic devices using a 2 MS/s collection system. Furthermore, a linear discriminant analysis classifier was trained using DASP generated features and was blind tested resulting in a greater than 90% classification accuracy for each of the DASP algorithms and an accuracy of 99.1% when DASP features are used in combination. Furthermore, we show that a rank reduced feature set of the combined DASP algorithms provides a 98.9% classification accuracy with only three features and outperforms a set of spectral features in terms of general classification as well as applicability across a broad number of devices.« less
MultiMiTar: a novel multi objective optimization based miRNA-target prediction method.
Mitra, Ramkrishna; Bandyopadhyay, Sanghamitra
2011-01-01
Machine learning based miRNA-target prediction algorithms often fail to obtain a balanced prediction accuracy in terms of both sensitivity and specificity due to lack of the gold standard of negative examples, miRNA-targeting site context specific relevant features and efficient feature selection process. Moreover, all the sequence, structure and machine learning based algorithms are unable to distribute the true positive predictions preferentially at the top of the ranked list; hence the algorithms become unreliable to the biologists. In addition, these algorithms fail to obtain considerable combination of precision and recall for the target transcripts that are translationally repressed at protein level. In the proposed article, we introduce an efficient miRNA-target prediction system MultiMiTar, a Support Vector Machine (SVM) based classifier integrated with a multiobjective metaheuristic based feature selection technique. The robust performance of the proposed method is mainly the result of using high quality negative examples and selection of biologically relevant miRNA-targeting site context specific features. The features are selected by using a novel feature selection technique AMOSA-SVM, that integrates the multi objective optimization technique Archived Multi-Objective Simulated Annealing (AMOSA) and SVM. MultiMiTar is found to achieve much higher Matthew's correlation coefficient (MCC) of 0.583 and average class-wise accuracy (ACA) of 0.8 compared to the others target prediction methods for a completely independent test data set. The obtained MCC and ACA values of these algorithms range from -0.269 to 0.155 and 0.321 to 0.582, respectively. Moreover, it shows a more balanced result in terms of precision and sensitivity (recall) for the translationally repressed data set as compared to all the other existing methods. An important aspect is that the true positive predictions are distributed preferentially at the top of the ranked list that makes MultiMiTar reliable for the biologists. MultiMiTar is now available as an online tool at www.isical.ac.in/~bioinfo_miu/multimitar.htm. MultiMiTar software can be downloaded from www.isical.ac.in/~bioinfo_miu/multimitar-download.htm.
Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data.
Becker, Natalia; Toedt, Grischa; Lichter, Peter; Benner, Axel
2011-05-09
Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net.We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone.Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error.Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters.The penalized SVM classification algorithms as well as fixed grid and interval search for finding appropriate tuning parameters were implemented in our freely available R package 'penalizedSVM'.We conclude that the Elastic SCAD SVM is a flexible and robust tool for classification and feature selection tasks for high-dimensional data such as microarray data sets.
Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data
2011-01-01
Background Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net. We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone. Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Results Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error. Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. Conclusions The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters. The penalized SVM classification algorithms as well as fixed grid and interval search for finding appropriate tuning parameters were implemented in our freely available R package 'penalizedSVM'. We conclude that the Elastic SCAD SVM is a flexible and robust tool for classification and feature selection tasks for high-dimensional data such as microarray data sets. PMID:21554689
Hassan, Ahnaf Rashik; Bhuiyan, Mohammed Imamul Hassan
2017-03-01
Automatic sleep staging is essential for alleviating the burden of the physicians of analyzing a large volume of data by visual inspection. It is also a precondition for making an automated sleep monitoring system feasible. Further, computerized sleep scoring will expedite large-scale data analysis in sleep research. Nevertheless, most of the existing works on sleep staging are either multichannel or multiple physiological signal based which are uncomfortable for the user and hinder the feasibility of an in-home sleep monitoring device. So, a successful and reliable computer-assisted sleep staging scheme is yet to emerge. In this work, we propose a single channel EEG based algorithm for computerized sleep scoring. In the proposed algorithm, we decompose EEG signal segments using Ensemble Empirical Mode Decomposition (EEMD) and extract various statistical moment based features. The effectiveness of EEMD and statistical features are investigated. Statistical analysis is performed for feature selection. A newly proposed classification technique, namely - Random under sampling boosting (RUSBoost) is introduced for sleep stage classification. This is the first implementation of EEMD in conjunction with RUSBoost to the best of the authors' knowledge. The proposed feature extraction scheme's performance is investigated for various choices of classification models. The algorithmic performance of our scheme is evaluated against contemporary works in the literature. The performance of the proposed method is comparable or better than that of the state-of-the-art ones. The proposed algorithm gives 88.07%, 83.49%, 92.66%, 94.23%, and 98.15% for 6-state to 2-state classification of sleep stages on Sleep-EDF database. Our experimental outcomes reveal that RUSBoost outperforms other classification models for the feature extraction framework presented in this work. Besides, the algorithm proposed in this work demonstrates high detection accuracy for the sleep states S1 and REM. Statistical moment based features in the EEMD domain distinguish the sleep states successfully and efficaciously. The automated sleep scoring scheme propounded herein can eradicate the onus of the clinicians, contribute to the device implementation of a sleep monitoring system, and benefit sleep research. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Discrimination of malignant lymphomas and leukemia using Radon transform based-higher order spectra
NASA Astrophysics Data System (ADS)
Luo, Yi; Celenk, Mehmet; Bejai, Prashanth
2006-03-01
A new algorithm that can be used to automatically recognize and classify malignant lymphomas and leukemia is proposed in this paper. The algorithm utilizes the morphological watersheds to obtain boundaries of cells from cell images and isolate them from the surrounding background. The areas of cells are extracted from cell images after background subtraction. The Radon transform and higher-order spectra (HOS) analysis are utilized as an image processing tool to generate class feature vectors of different type cells and to extract testing cells' feature vectors. The testing cells' feature vectors are then compared with the known class feature vectors for a possible match by computing the Euclidean distances. The cell in question is classified as belonging to one of the existing cell classes in the least Euclidean distance sense.
NASA Astrophysics Data System (ADS)
Rahmadani, S.; Dongoran, A.; Zarlis, M.; Zakarias
2018-03-01
This paper discusses the problem of feature selection using genetic algorithms on a dataset for classification problems. The classification model used is the decicion tree (DT), and Naive Bayes. In this paper we will discuss how the Naive Bayes and Decision Tree models to overcome the classification problem in the dataset, where the dataset feature is selectively selected using GA. Then both models compared their performance, whether there is an increase in accuracy or not. From the results obtained shows an increase in accuracy if the feature selection using GA. The proposed model is referred to as GADT (GA-Decision Tree) and GANB (GA-Naive Bayes). The data sets tested in this paper are taken from the UCI Machine Learning repository.