NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization could achieve a classification accuracy of 88.16 % in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performs better than SVM and ANN, it is more capable of handling multidimensional feature variables. The RF method combined with object-based analysis approach could highlight the classification accuracy further. The multiresolution segmentation approach on the basis of ESP scale parameter optimization was used for obtaining six scales to execute image segmentation, when the segmentation scale was 49, the classification accuracy reached the highest value of 89.58 %. The classification accuracy of object-based RF classification was 1.42 % higher than that of pixel-based classification (88.16 %), and the classification accuracy was further improved. Therefore, the RF classification method combined with object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and theoretical reference for remotely sensed monitoring land reclamation.
Di-codon Usage for Gene Classification
NASA Astrophysics Data System (ADS)
Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of x2 feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. X2 mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors are offer accuracy improvements only on colorful datasets.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we achieve a few results about the prediction accuracies of ensemble methods for binary classification that are missed or misinterpreted in previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper and lower bounds accuracies by random individual classifiers and better algorithms need to be developed. PMID:21853162
NASA Astrophysics Data System (ADS)
Fujita, Yusuke; Mitani, Yoshihiro; Hamamoto, Yoshihiko; Segawa, Makoto; Terai, Shuji; Sakaida, Isao
2017-03-01
Ultrasound imaging is a popular and non-invasive tool used in the diagnoses of liver disease. Cirrhosis is a chronic liver disease and it can advance to liver cancer. Early detection and appropriate treatment are crucial to prevent liver cancer. However, ultrasound image analysis is very challenging, because of the low signal-to-noise ratio of ultrasound images. To achieve the higher classification performance, selection of training regions of interest (ROIs) is very important that effect to classification accuracy. The purpose of our study is cirrhosis detection with high accuracy using liver ultrasound images. In our previous works, training ROI selection by MILBoost and multiple-ROI classification based on the product rule had been proposed, to achieve high classification performance. In this article, we propose self-training method to select training ROIs effectively. Evaluation experiments were performed to evaluate effect of self-training, using manually selected ROIs and also automatically selected ROIs. Experimental results show that self-training for manually selected ROIs achieved higher classification performance than other approaches, including our conventional methods. The manually ROI definition and sample selection are important to improve classification accuracy in cirrhosis detection using ultrasound images.
Compensatory neurofuzzy model for discrete data classification in biomedical
NASA Astrophysics Data System (ADS)
Ceylan, Rahime
2015-03-01
Biomedical data is separated to two main sections: signals and discrete data. So, studies in this area are about biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models which are relevant to classification of ECG, EMG or EEG signals. In same way, in literature, many models exist for classification of discrete data taken as value of samples which can be results of blood analysis or biopsy in medical process. Each algorithm could not achieve high accuracy rate on classification of signal and discrete data. In this study, compensatory neurofuzzy network model is presented for classification of discrete data in biomedical pattern recognition area. The compensatory neurofuzzy network has a hybrid and binary classifier. In this system, the parameters of fuzzy systems are updated by backpropagation algorithm. The realized classifier model is conducted to two benchmark datasets (Wisconsin Breast Cancer dataset and Pima Indian Diabetes dataset). Experimental studies show that compensatory neurofuzzy network model achieved 96.11% accuracy rate in classification of breast cancer dataset and 69.08% accuracy rate was obtained in experiments made on diabetes dataset with only 10 iterations.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.
Evaluation of space SAR as a land-cover classification
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Williams, T. H. L.
1985-01-01
The multidimensional approach to the mapping of land cover, crops, and forests is reported. Dimensionality is achieved by using data from sensors such as LANDSAT to augment Seasat and Shuttle Image Radar (SIR) data, using different image features such as tone and texture, and acquiring multidate data. Seasat, Shuttle Imaging Radar (SIR-A), and LANDSAT data are used both individually and in combination to map land cover in Oklahoma. The results indicates that radar is the best single sensor (72% accuracy) and produces the best sensor combination (97.5% accuracy) for discriminating among five land cover categories. Multidate Seasat data and a single data of LANDSAT coverage are then used in a crop classification study of western Kansas. The highest accuracy for a single channel is achieved using a Seasat scene, which produces a classification accuracy of 67%. Classification accuracy increases to approximately 75% when either a multidate Seasat combination or LANDSAT data in a multisensor combination is used. The tonal and textural elements of SIR-A data are then used both alone and in combination to classify forests into five categories.
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-01-01
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification. PMID:28025525
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-12-22
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification.
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much efforts have been devoted towards building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule-base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively. PMID:28326009
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on Sleep-EDF dataset with various classification settings. The overall accuracy for Awake/Sleep and 4-class classification setting are 98.32% and 94.49% respectively. Furthermore, the superior accuracy is achieved by performing classification on a low-dimensional feature space derived from time and frequency domains and without the need for artifact removal as a preprocessing step.
Application of Sensor Fusion to Improve Uav Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects including the ones that are based on using UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality which results in increasing the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on-board in the UAV missions and performing image fusion can help achieving higher quality images and accordingly higher accuracy classification results.
a Gsa-Svm Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paperhybridizesgravitational search algorithm (GSA) with support vector machine (SVM) and made a novel GSA-SVM hybrid system to improve the classification accuracy in binary problems. GSA is an optimization heuristic toolused to optimize the value of SVM kernel parameter (in this paper, radial basis function (RBF) is chosen as the kernel function). The experimental results show that this newapproach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Belgiu, Mariana; Dr Guţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Improving crop classification through attention to the timing of airborne radar acquisitions
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Protz, R.
1984-01-01
Radar remote sensors may provide valuable input to crop classification procedures because of (1) their independence of weather conditions and solar illumination, and (2) their ability to respond to differences in crop type. Manual classification of multidate synthetic aperture radar (SAR) imagery resulted in an overall accuracy of 83 percent for corn, forest, grain, and 'other' cover types. Forests and corn fields were identified with accuracies approaching or exceeding 90 percent. Grain fields and 'other' fields were often confused with each other, resulting in classification accuracies of 51 and 66 percent, respectively. The 83 percent correct classification represents a 10 percent improvement when compared to similar SAR data for the same area collected at alternate time periods in 1978. These results demonstrate that improvements in crop classification accuracy can be achieved with SAR data by synchronizing data collection times with crop growth stages in order to maximize differences in the geometric and dielectric properties of the cover types of interest.
Detection of eardrum abnormalities using ensemble deep learning approaches
NASA Astrophysics Data System (ADS)
Senaras, Caglar; Moberly, Aaron C.; Teknos, Theodoros; Essig, Garth; Elmaraghy, Charles; Taj-Schaal, Nazhat; Yua, Lianbo; Gurcan, Metin N.
2018-02-01
In this study, we proposed an approach to report the condition of the eardrum as "normal" or "abnormal" by ensembling two different deep learning architectures. In the first network (Network 1), we applied transfer learning to the Inception V3 network by using 409 labeled samples. As a second network (Network 2), we designed a convolutional neural network to take advantage of auto-encoders by using additional 673 unlabeled eardrum samples. The individual classification accuracies of the Network 1 and Network 2 were calculated as 84.4%(+/- 12.1%) and 82.6% (+/- 11.3%), respectively. Only 32% of the errors of the two networks were the same, making it possible to combine two approaches to achieve better classification accuracy. The proposed ensemble method allows us to achieve robust classification because it has high accuracy (84.4%) with the lowest standard deviation (+/- 10.3%).
NASA Astrophysics Data System (ADS)
Ahmed, H. M.; Al-azawi, R. J.; Abdulhameed, A. A.
2018-05-01
Huge efforts have been put in the developing of diagnostic methods to skin cancer disease. In this paper, two different approaches have been addressed for detection the skin cancer in dermoscopy images. The first approach uses a global method that uses global features for classifying skin lesions, whereas the second approach uses a local method that uses local features for classifying skin lesions. The aim of this paper is selecting the best approach for skin lesion classification. The dataset has been used in this paper consist of 200 dermoscopy images from Pedro Hispano Hospital (PH2). The achieved results are; sensitivity about 96%, specificity about 100%, precision about 100%, and accuracy about 97% for globalization approach while, sensitivity about 100%, specificity about 100%, precision about 100%, and accuracy about 100% for Localization Approach, these results showed that the localization approach achieved acceptable accuracy and better than globalization approach for skin cancer lesions classification.
NASA Technical Reports Server (NTRS)
Mehta, N. C.
1984-01-01
The utility of radar scatterometers for discrimination and characterization of natural vegetation was investigated. Backscatter measurements were acquired with airborne multi-frequency, multi-polarization, multi-angle radar scatterometers over a test site in a southern temperate forest. Separability between ground cover classes was studied using a two-class separability measure. Very good separability is achieved between most classes. Longer wavelength is useful in separating trees from non-tree classes, while shorter wavelength and cross polarization are helpful for discrimination among tree classes. Using the maximum likelihood classifier, 50% overall classification accuracy is achieved using a single, short-wavelength scatterometer channel. Addition of multiple incidence angles and another radar band improves classification accuracy by 20% and 50%, respectively, over the single channel accuracy. Incorporation of a third radar band seems redundant for vegetation classification. Vertical transmit polarization is critically important for all classes.
ERIC Educational Resources Information Center
Ball, Carrie R.; O'Connor, Edward
2016-01-01
This study examined the predictive validity and classification accuracy of two commonly used universal screening measures relative to a statewide achievement test. Results indicated that second-grade performance on oral reading fluency and the Measures of Academic Progress (MAP), together with special education status, explained 68% of the…
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively.
Classification of ECG beats using deep belief network and active learning.
G, Sayantan; T, Kien P; V, Kadambari K
2018-04-12
A new semi-supervised approach based on deep learning and active learning for classification of electrocardiogram signals (ECG) is proposed. The objective of the proposed work is to model a scientific method for classification of cardiac irregularities using electrocardiogram beats. The model follows the Association for the Advancement of medical instrumentation (AAMI) standards and consists of three phases. In phase I, feature representation of ECG is learnt using Gaussian-Bernoulli deep belief network followed by a linear support vector machine (SVM) training in the consecutive phase. It yields three deep models which are based on AAMI-defined classes, namely N, V, S, and F. In the last phase, a query generator is introduced to interact with the expert to label few beats to improve accuracy and sensitivity. The proposed approach depicts significant improvement in accuracy with minimal queries posed to the expert and fast online training as tested on the MIT-BIH Arrhythmia Database and the MIT-BIH Supra-ventricular Arrhythmia Database (SVDB). With 100 queries labeled by the expert in phase III, the method achieves an accuracy of 99.5% in "S" versus all classifications (SVEB) and 99.4% accuracy in "V " versus all classifications (VEB) on MIT-BIH Arrhythmia Database. In a similar manner, it is attributed that an accuracy of 97.5% for SVEB and 98.6% for VEB on SVDB database is achieved respectively. Graphical Abstract Reply- Deep belief network augmented by active learning for efficient prediction of arrhythmia.
Exploration of Force Myography and surface Electromyography in hand gesture classification.
Jiang, Xianta; Merhi, Lukas-Karim; Xiao, Zhen Gang; Menon, Carlo
2017-03-01
Whereas pressure sensors increasingly have received attention as a non-invasive interface for hand gesture recognition, their performance has not been comprehensively evaluated. This work examined the performance of hand gesture classification using Force Myography (FMG) and surface Electromyography (sEMG) technologies by performing 3 sets of 48 hand gestures using a prototyped FMG band and an array of commercial sEMG sensors worn both on the wrist and forearm simultaneously. The results show that the FMG band achieved classification accuracies as good as the high quality, commercially available, sEMG system on both wrist and forearm positions; specifically, by only using 8 Force Sensitive Resisters (FSRs), the FMG band achieved accuracies of 91.2% and 83.5% in classifying the 48 hand gestures in cross-validation and cross-trial evaluations, which were higher than those of sEMG (84.6% and 79.1%). By using all 16 FSRs on the band, our device achieved high accuracies of 96.7% and 89.4% in cross-validation and cross-trial evaluations. Copyright © 2017 IPEM. Published by Elsevier Ltd. All rights reserved.
Mohammed, Ameer; Zamani, Majid; Bayford, Richard; Demosthenous, Andreas
2017-12-01
In Parkinson's disease (PD), on-demand deep brain stimulation is required so that stimulation is regulated to reduce side effects resulting from continuous stimulation and PD exacerbation due to untimely stimulation. Also, the progressive nature of PD necessitates the use of dynamic detection schemes that can track the nonlinearities in PD. This paper proposes the use of dynamic feature extraction and dynamic pattern classification to achieve dynamic PD detection taking into account the demand for high accuracy, low computation, and real-time detection. The dynamic feature extraction and dynamic pattern classification are selected by evaluating a subset of feature extraction, dimensionality reduction, and classification algorithms that have been used in brain-machine interfaces. A novel dimensionality reduction technique, the maximum ratio method (MRM) is proposed, which provides the most efficient performance. In terms of accuracy and complexity for hardware implementation, a combination having discrete wavelet transform for feature extraction, MRM for dimensionality reduction, and dynamic k-nearest neighbor for classification was chosen as the most efficient. It achieves a classification accuracy of 99.29%, an F1-score of 97.90%, and a choice probability of 99.86%.
Wen, Tingxi; Zhang, Zhongnan
2017-01-01
Abstract In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789
Wen, Tingxi; Zhang, Zhongnan
2017-05-01
In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
Orhan, Umut; Erdogmus, Deniz; Roark, Brian; Purwar, Shalini; Hild, Kenneth E.; Oken, Barry; Nezamfar, Hooman; Fried-Oken, Melanie
2013-01-01
Event related potentials (ERP) corresponding to a stimulus in electroencephalography (EEG) can be used to detect the intent of a person for brain computer interfaces (BCI). This paradigm is widely utilized to build letter-by-letter text input systems using BCI. Nevertheless using a BCI-typewriter depending only on EEG responses will not be sufficiently accurate for single-trial operation in general, and existing systems utilize many-trial schemes to achieve accuracy at the cost of speed. Hence incorporation of a language model based prior or additional evidence is vital to improve accuracy and speed. In this paper, we study the effects of Bayesian fusion of an n-gram language model with a regularized discriminant analysis ERP detector for EEG-based BCIs. The letter classification accuracies are rigorously evaluated for varying language model orders as well as number of ERP-inducing trials. The results demonstrate that the language models contribute significantly to letter classification accuracy. Specifically, we find that a BCI-speller supported by a 4-gram language model may achieve the same performance using 3-trial ERP classification for the initial letters of the words and using single trial ERP classification for the subsequent ones. Overall, fusion of evidence from EEG and language models yields a significant opportunity to increase the word rate of a BCI based typing system. PMID:22255652
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Human Epithelial-2 (HEp-2) cell images staining patterns classification have been widely used to identify autoimmune diseases by the anti-Nuclear antibodies (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because manual test is time consuming, subjective and labor intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are developing. However, methods proposed recently are mostly manual features extraction with low accuracy. Besides, the scale of available benchmark datasets is small, which does not exactly suitable for using deep learning methods. This issue will influence the accuracy of cell classification directly even after data augmentation. To address these issues, this paper presents a high accuracy automatic HEp-2 cell classification method with small datasets, by utilizing very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results over two benchmark datasets demonstrate that the proposed method achieves superior performance in terms of accuracy compared with existing methods.
Computer-aided diagnosis system: a Bayesian hybrid classification method.
Calle-Alonso, F; Pérez, C J; Arias-Nicolás, J P; Martín, J
2013-10-01
A novel method to classify multi-class biomedical objects is presented. The method is based on a hybrid approach which combines pairwise comparison, Bayesian regression and the k-nearest neighbor technique. It can be applied in a fully automatic way or in a relevance feedback framework. In the latter case, the information obtained from both an expert and the automatic classification is iteratively used to improve the results until a certain accuracy level is achieved, then, the learning process is finished and new classifications can be automatically performed. The method has been applied in two biomedical contexts by following the same cross-validation schemes as in the original studies. The first one refers to cancer diagnosis, leading to an accuracy of 77.35% versus 66.37%, originally obtained. The second one considers the diagnosis of pathologies of the vertebral column. The original method achieves accuracies ranging from 76.5% to 96.7%, and from 82.3% to 97.1% in two different cross-validation schemes. Even with no supervision, the proposed method reaches 96.71% and 97.32% in these two cases. By using a supervised framework the achieved accuracy is 97.74%. Furthermore, all abnormal cases were correctly classified. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang
2016-08-01
Motion classification system based on surface Electromyography (sEMG) pattern recognition has achieved good results in experimental condition. But it is still a challenge for clinical implement and practical application. Many factors contribute to the difficulty of clinical use of the EMG based dexterous control. The most obvious and important is the noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of signal and biological signals such as Electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction of classification accuracy caused by electrode shift. The average classification accuracy of our method were above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and discovered the strong correlation with correlation coefficient of >0.9 between shift position data and normal position data.
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
Study of wavelet packet energy entropy for emotion classification in speech and glottal signals
NASA Astrophysics Data System (ADS)
He, Ling; Lech, Margaret; Zhang, Jing; Ren, Xiaomei; Deng, Lihua
2013-07-01
The automatic speech emotion recognition has important applications in human-machine communication. Majority of current research in this area is focused on finding optimal feature parameters. In recent studies, several glottal features were examined as potential cues for emotion differentiation. In this study, a new type of feature parameter is proposed, which calculates energy entropy on values within selected Wavelet Packet frequency bands. The modeling and classification tasks are conducted using the classical GMM algorithm. The experiments use two data sets: the Speech Under Simulated Emotion (SUSE) data set annotated with three different emotions (angry, neutral and soft) and Berlin Emotional Speech (BES) database annotated with seven different emotions (angry, bored, disgust, fear, happy, sad and neutral). The average classification accuracy achieved for the SUSE data (74%-76%) is significantly higher than the accuracy achieved for the BES data (51%-54%). In both cases, the accuracy was significantly higher than the respective random guessing levels (33% for SUSE and 14.3% for BES).
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Qin, Yu; Zhan, Yu; Wang, Changming; Zhang, Jiacai; Yao, Li; Guo, Xiaojuan; Wu, Xia; Hu, Bin
2016-08-01
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72 %, which is substantially better than that achieved with any single ERP component feature (55.07 % for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47, 4.06, and 16.90 % higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
Applications of remote sensing, volume 1
NASA Technical Reports Server (NTRS)
Landgrebe, D. A. (Principal Investigator)
1977-01-01
The author has identified the following significant results. ECHO successfully exploits the redundancy of states characteristics of sampled imagery of ground scenes to achieve better classification accuracy, reduce the number of classifications required, and reduce the variability of classification results. The information required to produce ECHO classifications are cell size, cell homogeneity, cell-to-field annexation parameters, input data, and a class conditional marginal density statistics deck.
Test of spectral/spatial classifier
NASA Technical Reports Server (NTRS)
Landgrebe, D. A. (Principal Investigator); Kast, J. L.; Davis, B. J.
1977-01-01
The author has identified the following significant results. The supervised ECHO processor (which utilizes class statistics for object identification) successfully exploits the redundancy of states characteristic of sampled imagery of ground scenes to achieve better classification accuracy, reduce the number of classifications required, and reduce the variability of classification results. The nonsupervised ECHO processor (which identifies objects without the benefit of class statistics) successfully reduces the number of classifications required and the variability of the classification results.
Detection of artificially ripened mango using spectrometric analysis
NASA Astrophysics Data System (ADS)
Mithun, B. S.; Mondal, Milton; Vishwakarma, Harsh; Shinde, Sujit; Kimbahune, Sanjay
2017-05-01
Hyperspectral sensing has been proven to be useful to determine the quality of food in general. It has also been used to distinguish naturally and artificially ripened mangoes by analyzing the spectral signature. However the focus has been on improving the accuracy of classification after performing dimensionality reduction, optimum feature selection and using suitable learning algorithm on the complete visible and NIR spectrum range data, namely 350nm to 1050nm. In this paper we focus on, (i) the use of low wavelength resolution and low cost multispectral sensor to reliably identify artificially ripened mango by selectively using the spectral information so that classification accuracy is not hampered at the cost of low resolution spectral data and (ii) use of visible spectrum i.e. 390nm to 700 nm data to accurately discriminate artificially ripened mangoes. Our results show that on a low resolution spectral data, the use of logistic regression produces an accuracy of 98.83% and outperforms other methods like classification tree, random forest significantly. And this is achieved by analyzing only 36 spectral reflectance data points instead of the complete 216 data points available in visual and NIR range. Another interesting experimental observation is that we are able to achieve more than 98% classification accuracy by selecting only 15 irradiance values in the visible spectrum. Even the number of data needs to be collected using hyper-spectral or multi-spectral sensor can be reduced by a factor of 24 for classification with high degree of confidence
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features
Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
NASA Astrophysics Data System (ADS)
Hänsch, Ronny; Hellwich, Olaf
2018-04-01
Random Forests have continuously proven to be one of the most accurate, robust, as well as efficient methods for the supervised classification of images in general and polarimetric synthetic aperture radar data in particular. While the majority of previous work focus on improving classification accuracy, we aim for accelerating the training of the classifier as well as its usage during prediction while maintaining its accuracy. Unlike other approaches we mainly consider algorithmic changes to stay as much as possible independent of platform and programming language. The final model achieves an approximately 60 times faster training and a 500 times faster prediction, while the accuracy is only marginally decreased by roughly 1 %.
NASA Astrophysics Data System (ADS)
Susanti, Yuliana; Zukhronah, Etik; Pratiwi, Hasih; Respatiwulan; Sri Sulistijowati, H.
2017-11-01
To achieve food resilience in Indonesia, food diversification by exploring potentials of local food is required. Corn is one of alternating staple food of Javanese society. For that reason, corn production needs to be improved by considering the influencing factors. CHAID and CRT are methods of data mining which can be used to classify the influencing variables. The present study seeks to dig up information on the potentials of local food availability of corn in regencies and cities in Java Island. CHAID analysis yields four classifications with accuracy of 78.8%, while CRT analysis yields seven classifications with accuracy of 79.6%.
Effective classification of the prevalence of Schistosoma mansoni.
Mitchell, Shira A; Pagano, Marcello
2012-12-01
To present an effective classification method based on the prevalence of Schistosoma mansoni in the community. We created decision rules (defined by cut-offs for number of positive slides), which account for imperfect sensitivity, both with a simple adjustment of fixed sensitivity and with a more complex adjustment of changing sensitivity with prevalence. To reduce screening costs while maintaining accuracy, we propose a pooled classification method. To estimate sensitivity, we use the De Vlas model for worm and egg distributions. We compare the proposed method with the standard method to investigate differences in efficiency, measured by number of slides read, and accuracy, measured by probability of correct classification. Modelling varying sensitivity lowers the lower cut-off more significantly than the upper cut-off, correctly classifying regions as moderate rather than lower, thus receiving life-saving treatment. The classification method goes directly to classification on the basis of positive pools, avoiding having to know sensitivity to estimate prevalence. For model parameter values describing worm and egg distributions among children, the pooled method with 25 slides achieves an expected 89.9% probability of correct classification, whereas the standard method with 50 slides achieves 88.7%. Among children, it is more efficient and more accurate to use the pooled method for classification of S. mansoni prevalence than the current standard method. © 2012 Blackwell Publishing Ltd.
Classification Based on Pruning and Double Covered Rule Sets for the Internet of Things Applications
Zhou, Zhongmei; Wang, Weiping
2014-01-01
The Internet of things (IOT) is a hot issue in recent years. It accumulates large amounts of data by IOT users, which is a great challenge to mining useful knowledge from IOT. Classification is an effective strategy which can predict the need of users in IOT. However, many traditional rule-based classifiers cannot guarantee that all instances can be covered by at least two classification rules. Thus, these algorithms cannot achieve high accuracy in some datasets. In this paper, we propose a new rule-based classification, CDCR-P (Classification based on the Pruning and Double Covered Rule sets). CDCR-P can induce two different rule sets A and B. Every instance in training set can be covered by at least one rule not only in rule set A, but also in rule set B. In order to improve the quality of rule set B, we take measure to prune the length of rules in rule set B. Our experimental results indicate that, CDCR-P not only is feasible, but also it can achieve high accuracy. PMID:24511304
Li, Shasha; Zhou, Zhongmei; Wang, Weiping
2014-01-01
The Internet of things (IOT) is a hot issue in recent years. It accumulates large amounts of data by IOT users, which is a great challenge to mining useful knowledge from IOT. Classification is an effective strategy which can predict the need of users in IOT. However, many traditional rule-based classifiers cannot guarantee that all instances can be covered by at least two classification rules. Thus, these algorithms cannot achieve high accuracy in some datasets. In this paper, we propose a new rule-based classification, CDCR-P (Classification based on the Pruning and Double Covered Rule sets). CDCR-P can induce two different rule sets A and B. Every instance in training set can be covered by at least one rule not only in rule set A, but also in rule set B. In order to improve the quality of rule set B, we take measure to prune the length of rules in rule set B. Our experimental results indicate that, CDCR-P not only is feasible, but also it can achieve high accuracy.
Lee, Jong Hoon; Jang, Hong Seok; Kim, Jun-Gi; Lee, Myung Ah; Kim, Dae Yong; Kim, Tae Hyun; Oh, Jae Hwan; Park, Sung Chan; Kim, Sun Young; Baek, Ji Yeon; Park, Hee Chul; Kim, Hee Cheol; Nam, Taek-Keun; Chie, Eui Kyu; Jung, Ji-Han; Oh, Seong Taek
2014-10-01
The reported overall accuracy of MRI in predicting the pathologic stage of nonirradiated rectal cancer is high. However, the role of MRI in restaging rectal tumors after neoadjuvant CRT is contentious. Thus, we evaluate the accuracy of restaging magnetic resonance imaging (MRI) for rectal cancer patients who receive preoperative chemoradiotherapy (CRT). We analyzed 150 patients with locally advanced rectal cancer (T3-4N0-2) who had received preoperative CRT. Pre-CRT MRI was performed for local tumor and nodal staging. All patients underwent restaging MRI followed by total mesorectal excision after the end of radiotherapy. The primary endpoint of the present study was to estimate the accuracy of post-CRT MRI as compared with pathologic staging. Pathologic T classification matched the post-CRT MRI findings in 97 (64.7%) of 150 patients. 36 (24.0%) of 150 patients were overstaged in T classification, and the concordance degree was moderate (k=0.33, p<0.01). Pathologic N classification matched the post-CRI MRI findings in 85 (56.6%) of 150 patients. 54 (36.0%) of 150 patients were overstaged in N classification. 26 patients achieved downstaging (ycT0-2N0) on restaging MRI after CRT. 23 (88.5%) of 26 patients who had been downstaged on MRI after CRT were confirmed on the pathological staging, and the concordance degree was good (k=0.72, p<0.01). Restaging MRI has low accuracy for the prediction of the pathologic T and N classifications in rectal cancer patients who received preoperative CRT. The diagnostic accuracy of restaging MRI is relatively high in rectal cancer patients who achieved clinical downstaging after CRT. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Convolutional neural network with transfer learning for rice type classification
NASA Astrophysics Data System (ADS)
Patel, Vaibhav Amit; Joshi, Manjunath V.
2018-04-01
Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine which makes it faster with greater accuracy. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, while using transfer learning in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2- class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data boosts classification accuracy significantly.
Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens Hyper spectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such Hyper spectral data provides a challenge to the current technique for analyzing data. Conventional classification methods may not be useful without dimension reduction pre-processing. So dimension reduction has become a significant part of Hyper spectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using Wavelet Decomposition could be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture the polynomial trends while Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets are compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction has more separate classes and yields better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, it is found that the performance of classification of Daubechies wavelets is better as compared to Haar wavelet while Daubechies takes more time compare to Haar wavelet. The experimental results demonstrate the classification system consistently provides over 84% classification accuracy.
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco
2016-10-01
The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.
Park, Jinhee; Javier, Rios Jesus; Moon, Taesup; Kim, Youngwook
2016-11-24
Accurate classification of human aquatic activities using radar has a variety of potential applications such as rescue operations and border patrols. Nevertheless, the classification of activities on water using radar has not been extensively studied, unlike the case on dry ground, due to its unique challenge. Namely, not only is the radar cross section of a human on water small, but the micro-Doppler signatures are much noisier due to water drops and waves. In this paper, we first investigate whether discriminative signatures could be obtained for activities on water through a simulation study. Then, we show how we can effectively achieve high classification accuracy by applying deep convolutional neural networks (DCNN) directly to the spectrogram of real measurement data. From the five-fold cross-validation on our dataset, which consists of five aquatic activities, we report that the conventional feature-based scheme only achieves an accuracy of 45.1%. In contrast, the DCNN trained using only the collected data attains 66.7%, and the transfer learned DCNN, which takes a DCNN pre-trained on a RGB image dataset and fine-tunes the parameters using the collected data, achieves a much higher 80.3%, which is a significant performance boost.
Classification of urban features using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Ganesh Babu, Bharath
Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) - were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, were compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was in the average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not outperform significantly the conventional feature ranking methods. Among conventional methods, some of them slightly performed better than others, although the choice of a suitable technique is dependent on the computational complexity and accuracy requirements of the user.
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but a time-consuming and tedious work was performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages performed by radiologists is depended on their experience only, so the computer-aided technology is very important to aid with the diagnosis accuracy. In this study, to improve the performance of tumor detection, we investigated comparative approach of different segmentation techniques and selected the best one by comparing their segmentation score. Further, to improve the classification accuracy, the genetic algorithm is employed for the automatic classification of tumor stage. The decision of classification stage is supported by extracting relevant features and area calculation. The experimental results of proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93 demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average of 93.79% dice similarity index coefficient, which indicates better overlap between the automated extracted tumor regions with manually extracted tumor region by radiologists.
Multispectral LiDAR Data for Land Cover Classification of Urban Areas
Morsy, Salem; Shaker, Ahmed; El-Rabbany, Ahmed
2017-01-01
Airborne Light Detection And Ranging (LiDAR) systems usually operate at a monochromatic wavelength measuring the range and the strength of the reflected energy (intensity) from objects. Recently, multispectral LiDAR sensors, which acquire data at different wavelengths, have emerged. This allows for recording of a diversity of spectral reflectance from objects. In this context, we aim to investigate the use of multispectral LiDAR data in land cover classification using two different techniques. The first is image-based classification, where intensity and height images are created from LiDAR points and then a maximum likelihood classifier is applied. The second is point-based classification, where ground filtering and Normalized Difference Vegetation Indices (NDVIs) computation are conducted. A dataset of an urban area located in Oshawa, Ontario, Canada, is classified into four classes: buildings, trees, roads and grass. An overall accuracy of up to 89.9% and 92.7% is achieved from image classification and 3D point classification, respectively. A radiometric correction model is also applied to the intensity data in order to remove the attenuation due to the system distortion and terrain height variation. The classification process is then repeated, and the results demonstrate that there are no significant improvements achieved in the overall accuracy. PMID:28445432
Multispectral LiDAR Data for Land Cover Classification of Urban Areas.
Morsy, Salem; Shaker, Ahmed; El-Rabbany, Ahmed
2017-04-26
Airborne Light Detection And Ranging (LiDAR) systems usually operate at a monochromatic wavelength measuring the range and the strength of the reflected energy (intensity) from objects. Recently, multispectral LiDAR sensors, which acquire data at different wavelengths, have emerged. This allows for recording of a diversity of spectral reflectance from objects. In this context, we aim to investigate the use of multispectral LiDAR data in land cover classification using two different techniques. The first is image-based classification, where intensity and height images are created from LiDAR points and then a maximum likelihood classifier is applied. The second is point-based classification, where ground filtering and Normalized Difference Vegetation Indices (NDVIs) computation are conducted. A dataset of an urban area located in Oshawa, Ontario, Canada, is classified into four classes: buildings, trees, roads and grass. An overall accuracy of up to 89.9% and 92.7% is achieved from image classification and 3D point classification, respectively. A radiometric correction model is also applied to the intensity data in order to remove the attenuation due to the system distortion and terrain height variation. The classification process is then repeated, and the results demonstrate that there are no significant improvements achieved in the overall accuracy.
Automatic interpretation of ERTS data for forest management
NASA Technical Reports Server (NTRS)
Kirvida, L.; Johnson, G. R.
1973-01-01
Automatic stratification of forested land from ERTS-1 data provides a valuable tool for resource management. The results are useful for wood product yield estimates, recreation and wild life management, forest inventory and forest condition monitoring. Automatic procedures based on both multi-spectral and spatial features are evaluated. With five classes, training and testing on the same samples, classification accuracy of 74% was achieved using the MSS multispectral features. When adding texture computed from 8 x 8 arrays, classification accuracy of 99% was obtained.
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted into 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilk's Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All commonly occurring six gregarious tree species, viz., white oak, brown oak, chir pine, blue pine, cedar and fir in western Himalaya could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). It was noticed that classification accuracy achieved with Hyperion bands was significantly higher than Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). Study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
Training set extension for SVM ensemble in P300-speller with familiar face paradigm.
Li, Qi; Shi, Kaiyang; Gao, Ning; Li, Jian; Bai, Ou
2018-03-27
P300-spellers are brain-computer interface (BCI)-based character input systems. Support vector machine (SVM) ensembles are trained with large-scale training sets and used as classifiers in these systems. However, the required large-scale training data necessitate a prolonged collection time for each subject, which results in data collected toward the end of the period being contaminated by the subject's fatigue. This study aimed to develop a method for acquiring more training data based on a collected small training set. A new method was developed in which two corresponding training datasets in two sequences are superposed and averaged to extend the training set. The proposed method was tested offline on a P300-speller with the familiar face paradigm. The SVM ensemble with extended training set achieved 85% classification accuracy for the averaged results of four sequences, and 100% for 11 sequences in the P300-speller. In contrast, the conventional SVM ensemble with non-extended training set achieved only 65% accuracy for four sequences, and 92% for 11 sequences. The SVM ensemble with extended training set achieves higher classification accuracies than the conventional SVM ensemble, which verifies that the proposed method effectively improves the classification performance of BCI P300-spellers, thus enhancing their practicality.
Aided diagnosis methods of breast cancer based on machine learning
NASA Astrophysics Data System (ADS)
Zhao, Yue; Wang, Nian; Cui, Xiaoyu
2017-08-01
In the field of medicine, quickly and accurately determining whether the patient is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor, Linear Discriminant Analysis, Logistic Regression were applied to predict the classification of thyroid,Her-2,PR,ER,Ki67,metastasis and lymph nodes in breast cancer, in order to recognize the benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while the classification effect of KNN and Logistic Regression were better than that of LDA, the best accuracy reached 96.30%.
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.
Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A
2013-08-01
In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
Variability of wetland reflectance and its effect on automatic categorization of satellite imagery
NASA Technical Reports Server (NTRS)
Klemas, V. (Principal Investigator); Bartlett, D.
1977-01-01
The author has identified the following significant results. Land cover categorization of data from the same overpass in four test wetland areas was carried out using a four category classification system. The tests indicate that training data based on in situ reflectance measurements and atmospheric correction of LANDSAT data can produce comparable accuracy of categorization to that achieved using more than four wetlands cover categories (salt marsh cordgrass, salt hay, unvegetated, and water tidal flat) produced overall classification accuracies of 85% by conventional and relative radiance training and 81% by use of in situ measurements. Overall mapping accuracies were 76% and 72% respectively.
Zourmand, Alireza; Ting, Hua-Nong; Mirhassani, Seyed Mostafa
2013-03-01
Speech is one of the prevalent communication mediums for humans. Identifying the gender of a child speaker based on his/her speech is crucial in telecommunication and speech therapy. This article investigates the use of fundamental and formant frequencies from sustained vowel phonation to distinguish the gender of Malay children aged between 7 and 12 years. The Euclidean minimum distance and multilayer perceptron were used to classify the gender of 360 Malay children based on different combinations of fundamental and formant frequencies (F0, F1, F2, and F3). The Euclidean minimum distance with normalized frequency data achieved a classification accuracy of 79.44%, which was higher than that of the nonnormalized frequency data. Age-dependent modeling was used to improve the accuracy of gender classification. The Euclidean distance method obtained 84.17% based on the optimal classification accuracy for all age groups. The accuracy was further increased to 99.81% using multilayer perceptron based on mel-frequency cepstral coefficients. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Hybrid feature selection algorithm using symmetrical uncertainty and a harmony search algorithm
NASA Astrophysics Data System (ADS)
Salameh Shreem, Salam; Abdullah, Salwani; Nazri, Mohd Zakree Ahmad
2016-04-01
Microarray technology can be used as an efficient diagnostic system to recognise diseases such as tumours or to discriminate between different types of cancers in normal tissues. This technology has received increasing attention from the bioinformatics community because of its potential in designing powerful decision-making tools for cancer diagnosis. However, the presence of thousands or tens of thousands of genes affects the predictive accuracy of this technology from the perspective of classification. Thus, a key issue in microarray data is identifying or selecting the smallest possible set of genes from the input data that can achieve good predictive accuracy for classification. In this work, we propose a two-stage selection algorithm for gene selection problems in microarray data-sets called the symmetrical uncertainty filter and harmony search algorithm wrapper (SU-HSA). Experimental results show that the SU-HSA is better than HSA in isolation for all data-sets in terms of the accuracy and achieves a lower number of genes on 6 out of 10 instances. Furthermore, the comparison with state-of-the-art methods shows that our proposed approach is able to obtain 5 (out of 10) new best results in terms of the number of selected genes and competitive results in terms of the classification accuracy.
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis.
Myburgh, Hermanus C; van Zijl, Willemien H; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-03-01
Otitis media is one of the most common childhood diseases worldwide, but because of lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious, and life-threatening complications. There is, thus a need for an automated computer based image-analyzing system that could assist in making accurate otitis media diagnoses anywhere. A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low cost custom-made video-otoscope. The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~64% to 80%) using traditional otoscopes, and therefore holds promise for the future in making automated diagnosis of otitis media in medically underserved populations.
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis
Myburgh, Hermanus C.; van Zijl, Willemien H.; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-01-01
Background Otitis media is one of the most common childhood diseases worldwide, but because of lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious, and life-threatening complications. There is, thus a need for an automated computer based image-analyzing system that could assist in making accurate otitis media diagnoses anywhere. Methods A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. Findings An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low cost custom-made video-otoscope. Interpretation The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~ 64% to 80%) using traditional otoscopes, and therefore holds promise for the future in making automated diagnosis of otitis media in medically underserved populations. PMID:27077122
NASA Technical Reports Server (NTRS)
Fagan, Matthew E.; Defries, Ruth S.; Sesnie, Steven E.; Arroyo-Mora, J. Pablo; Soto, Carlomagno; Singh, Aditya; Townsend, Philip A.; Chazdon, Robin L.
2015-01-01
An efficient means to map tree plantations is needed to detect tropical land use change and evaluate reforestation projects. To analyze recent tree plantation expansion in northeastern Costa Rica, we examined the potential of combining moderate-resolution hyperspectral imagery (2005 HyMap mosaic) with multitemporal, multispectral data (Landsat) to accurately classify (1) general forest types and (2) tree plantations by species composition. Following a linear discriminant analysis to reduce data dimensionality, we compared four Random Forest classification models: hyperspectral data (HD) alone; HD plus interannual spectral metrics; HD plus a multitemporal forest regrowth classification; and all three models combined. The fourth, combined model achieved overall accuracy of 88.5%. Adding multitemporal data significantly improved classification accuracy (p less than 0.0001) of all forest types, although the effect on tree plantation accuracy was modest. The hyperspectral data alone classified six species of tree plantations with 75% to 93% producer's accuracy; adding multitemporal spectral data increased accuracy only for two species with dense canopies. Non-native tree species had higher classification accuracy overall and made up the majority of tree plantations in this landscape. Our results indicate that combining occasionally acquired hyperspectral data with widely available multitemporal satellite imagery enhances mapping and monitoring of reforestation in tropical landscapes.
NASA Astrophysics Data System (ADS)
Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.
2017-09-01
Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Rollinvariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, target orientation diversity effect makes PolSAR images understanding and interpretation difficult. Only using the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the followed classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combing both the selected new hidden polarimetric features in rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy with the proposed classification scheme is 94.91 %, while that with the conventional classification scheme is 93.70 %. Moreover, for multi-temporal UAVSAR data, the averaged overall classification accuracy with the proposed classification scheme is up to 97.08 %, which is much higher than the 87.79 % from the conventional classification scheme. Furthermore, for multitemporal PolSAR data, the proposed classification scheme can achieve better robustness. The comparison studies also clearly demonstrate that mining and utilization of hidden polarimetric features and information in the rotation domain can gain the added benefits for PolSAR land cover classification and provide a new vision for PolSAR image interpretation and application.
Real-Time Food Authentication Using a Miniature Mass Spectrometer.
Gerbig, Stefanie; Neese, Stephan; Penner, Alexander; Spengler, Bernhard; Schulz, Sabine
2017-10-17
Food adulteration is a threat to public health and the economy. In order to determine food adulteration efficiently, rapid and easy-to-use on-site analytical methods are needed. In this study, a miniaturized mass spectrometer in combination with three ambient ionization methods was used for food authentication. The chemical fingerprints of three milk types, five fish species, and two coffee types were measured using electrospray ionization, desorption electrospray ionization, and low temperature plasma ionization. Minimum sample preparation was needed for the analysis of liquid and solid food samples. Mass spectrometric data was processed using the laboratory-built software MS food classifier, which allows for the definition of specific food profiles from reference data sets using multivariate statistical methods and the subsequent classification of unknown data. Applicability of the obtained mass spectrometric fingerprints for food authentication was evaluated using different data processing methods, leave-10%-out cross-validation, and real-time classification of new data. Classification accuracy of 100% was achieved for the differentiation of milk types and fish species, and a classification accuracy of 96.4% was achieved for coffee types in cross-validation experiments. Measurement of two milk mixtures yielded correct classification of >94%. For real-time classification, the accuracies were comparable. Functionality of the software program and its performance is described. Processing time for a reference data set and a newly acquired spectrum was found to be 12 s and 2 s, respectively. These proof-of-principle experiments show that the combination of a miniaturized mass spectrometer, ambient ionization, and statistical analysis is suitable for on-site real-time food authentication.
Detection of distorted frames in retinal video-sequences via machine learning
NASA Astrophysics Data System (ADS)
Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.
2017-07-01
This paper describes detection of distorted frames in retinal sequences based on set of global features extracted from each frame. The feature vector is consequently used in classification step, in which three types of classifiers are tested. The best classification accuracy 96% has been achieved with support vector machine approach.
McRoy, Susan; Jones, Sean; Kurmally, Adam
2016-09-01
This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. © The Author(s) 2015.
Hao, Pengyu; Wang, Li; Niu, Zheng
2015-01-01
A range of single classifiers have been proposed to classify crop types using time series vegetation indices, and hybrid classifiers are used to improve discriminatory power. Traditional fusion rules use the product of multi-single classifiers, but that strategy cannot integrate the classification output of machine learning classifiers. In this research, the performance of two hybrid strategies, multiple voting (M-voting) and probabilistic fusion (P-fusion), for crop classification using NDVI time series were tested with different training sample sizes at both pixel and object levels, and two representative counties in north Xinjiang were selected as study area. The single classifiers employed in this research included Random Forest (RF), Support Vector Machine (SVM), and See 5 (C 5.0). The results indicated that classification performance improved (increased the mean overall accuracy by 5%~10%, and reduced standard deviation of overall accuracy by around 1%) substantially with the training sample number, and when the training sample size was small (50 or 100 training samples), hybrid classifiers substantially outperformed single classifiers with higher mean overall accuracy (1%~2%). However, when abundant training samples (4,000) were employed, single classifiers could achieve good classification accuracy, and all classifiers obtained similar performances. Additionally, although object-based classification did not improve accuracy, it resulted in greater visual appeal, especially in study areas with a heterogeneous cropping pattern. PMID:26360597
Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp
2015-01-01
Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
Pilania, G.; Gubernatis, J. E.; Lookman, T.
2015-12-03
The role of dynamical (or Born effective) charges in classification of octet AB-type binary compounds between four-fold (zincblende/wurtzite crystal structures) and six-fold (rocksalt crystal structure) coordinated systems is discussed. We show that the difference in the dynamical charges of the fourfold and sixfold coordinated structures, in combination with Harrison’s polarity, serves as an excellent feature to classify the coordination of 82 sp–bonded binary octet compounds. We use a support vector machine classifier to estimate the average classification accuracy and the associated variance in our model where a decision boundary is learned in a supervised manner. Lastly, we compare the out-of-samplemore » classification accuracy achieved by our feature pair with those reported previously.« less
Skimming Digits: Neuromorphic Classification of Spike-Encoded Images
Cohen, Gregory K.; Orchard, Garrick; Leng, Sio-Hoi; Tapson, Jonathan; Benosman, Ryad B.; van Schaik, André
2016-01-01
The growing demands placed upon the field of computer vision have renewed the focus on alternative visual scene representations and processing paradigms. Silicon retinea provide an alternative means of imaging the visual environment, and produce frame-free spatio-temporal data. This paper presents an investigation into event-based digit classification using N-MNIST, a neuromorphic dataset created with a silicon retina, and the Synaptic Kernel Inverse Method (SKIM), a learning method based on principles of dendritic computation. As this work represents the first large-scale and multi-class classification task performed using the SKIM network, it explores different training patterns and output determination methods necessary to extend the original SKIM method to support multi-class problems. Making use of SKIM networks applied to real-world datasets, implementing the largest hidden layer sizes and simultaneously training the largest number of output neurons, the classification system achieved a best-case accuracy of 92.87% for a network containing 10,000 hidden layer neurons. These results represent the highest accuracies achieved against the dataset to date and serve to validate the application of the SKIM method to event-based visual classification tasks. Additionally, the study found that using a square pulse as the supervisory training signal produced the highest accuracy for most output determination methods, but the results also demonstrate that an exponential pattern is better suited to hardware implementations as it makes use of the simplest output determination method based on the maximum value. PMID:27199646
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P.; Parsey, Ramin V.; Laine, Andrew F.
2013-01-01
Multimodality classification of Alzheimer’s disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%). PMID:24576927
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P; Parsey, Ramin V; Laine, Andrew F
2012-01-01
Multimodality classification of Alzheimer's disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%).
Blob-level active-passive data fusion for Benthic classification
NASA Astrophysics Data System (ADS)
Park, Joong Yong; Kalluri, Hemanth; Mathur, Abhinav; Ramnath, Vinod; Kim, Minsu; Aitken, Jennifer; Tuell, Grady
2012-06-01
We extend the data fusion pixel level to the more semantically meaningful blob level, using the mean-shift algorithm to form labeled blobs having high similarity in the feature domain, and connectivity in the spatial domain. We have also developed Bhattacharyya Distance (BD) and rule-based classifiers, and have implemented these higher-level data fusion algorithms into the CZMIL Data Processing System. Applying these new algorithms to recent SHOALS and CASI data at Plymouth Harbor, Massachusetts, we achieved improved benthic classification accuracies over those produced with either single sensor, or pixel-level fusion strategies. These results appear to validate the hypothesis that classification accuracy may be generally improved by adopting higher spatial and semantic levels of fusion.
Automatic photointerpretation for plant species and stress identification (ERTS-A1)
NASA Technical Reports Server (NTRS)
Swanlund, G. D. (Principal Investigator); Kirvida, L.; Johnson, G. R.
1973-01-01
The author has identified the following significant results. Automatic stratification of forested land from ERTS-1 data provides a valuable tool for resource management. The results are useful for wood product yield estimates, recreation and wildlife management, forest inventory, and forest condition monitoring. Automatic procedures based on both multispectral and spatial features are evaluated. With five classes, training and testing on the same samples, classification accuracy of 74 percent was achieved using the MSS multispectral features. When adding texture computed from 8 x 8 arrays, classification accuracy of 90 percent was obtained.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessedmore » using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with a SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large number of images, as is the case in a multicenter trial with different scanners.« less
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
Hussain, Shaista; Basu, Arindam
2016-01-01
The development of power-efficient neuromorphic devices presents the challenge of designing spike pattern classification algorithms which can be implemented on low-precision hardware and can also achieve state-of-the-art performance. In our pursuit of meeting this challenge, we present a pattern classification model which uses a sparse connection matrix and exploits the mechanism of nonlinear dendritic processing to achieve high classification accuracy. A rate-based structural learning rule for multiclass classification is proposed which modifies a connectivity matrix of binary synaptic connections by choosing the best “k” out of “d” inputs to make connections on every dendritic branch (k < < d). Because learning only modifies connectivity, the model is well suited for implementation in neuromorphic systems using address-event representation (AER). We develop an ensemble method which combines several dendritic classifiers to achieve enhanced generalization over individual classifiers. We have two major findings: (1) Our results demonstrate that an ensemble created with classifiers comprising moderate number of dendrites performs better than both ensembles of perceptrons and of complex dendritic trees. (2) In order to determine the moderate number of dendrites required for a specific classification problem, a two-step solution is proposed. First, an adaptive approach is proposed which scales the relative size of the dendritic trees of neurons for each class. It works by progressively adding dendrites with fixed number of synapses to the network, thereby allocating synaptic resources as per the complexity of the given problem. As a second step, theoretical capacity calculations are used to convert each neuronal dendritic tree to its optimal topology where dendrites of each class are assigned different number of synapses. The performance of the model is evaluated on classification of handwritten digits from the benchmark MNIST dataset and compared with other spike classifiers. We show that our system can achieve classification accuracy within 1 − 2% of other reported spike-based classifiers while using much less synaptic resources (only 7%) compared to that used by other methods. Further, an ensemble classifier created with adaptively learned sizes can attain accuracy of 96.4% which is at par with the best reported performance of spike-based classifiers. Moreover, the proposed method achieves this by using about 20% of the synapses used by other spike algorithms. We also present results of applying our algorithm to classify the MNIST-DVS dataset collected from a real spike-based image sensor and show results comparable to the best reported ones (88.1% accuracy). For VLSI implementations, we show that the reduced synaptic memory can save upto 4X area compared to conventional crossbar topologies. Finally, we also present a biologically realistic spike-based version for calculating the correlations required by the structural learning rule and demonstrate the correspondence between the rate-based and spike-based methods of learning. PMID:27065782
NASA Astrophysics Data System (ADS)
Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore
2017-10-01
This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor, and Random Forest (RF) classifiers. The results demonstrate that the accuracy of SVM and KNN classifiers are the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for minimizing model complexity and interpretation.
NASA Astrophysics Data System (ADS)
Dou, P.
2017-12-01
Guangzhou has experienced a rapid urbanization period called "small change in three years and big change in five years" since the reform of China, resulting in significant land use/cover changes(LUC). To overcome the disadvantages of single classifier for remote sensing image classification accuracy, a multiple classifier system (MCS) is proposed to improve the quality of remote sensing image classification. The new method combines advantages of different learning algorithms, and achieves higher accuracy (88.12%) than any single classifier did. With the proposed MCS, land use/cover (LUC) on Landsat images from 1987 to 2015 was obtained, and the LUCs were used on three watersheds (Shijing river, Chebei stream, and Shahe stream) to estimate the impact of urbanization on water flood. The results show that with the high accuracy LUC, the uncertainty in flood simulations are reduced effectively (for Shijing river, Chebei stream, and Shahe stream, the uncertainty reduced 15.5%, 17.3% and 19.8% respectively).
SVM classifier on chip for melanoma detection.
Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak
2017-07-01
Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a medical low-cost handheld device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented onto a recent FPGA platform using the latest design methodology to be embedded into the proposed device for realizing online efficient melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 from equivalent software implementation on an embedded processor, with 34% of resources utilization and 2 watts for power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resources utilization and power consumption, while achieving high classification accuracy.
Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.
Subasi, Abdulhamit
2013-06-01
Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on the basis of EMG signal to classify into normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into the frequency sub-bands using discrete wavelet transform (DWT) and a set of statistical features were extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results obviously validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.
High-accuracy user identification using EEG biometrics.
Koike-Akino, Toshiaki; Mahajan, Ruhi; Marks, Tim K; Ye Wang; Watanabe, Shinji; Tuzel, Oncel; Orlik, Philip
2016-08-01
We analyze brain waves acquired through a consumer-grade EEG device to investigate its capabilities for user identification and authentication. First, we show the statistical significance of the P300 component in event-related potential (ERP) data from 14-channel EEGs across 25 subjects. We then apply a variety of machine learning techniques, comparing the user identification performance of various different combinations of a dimensionality reduction technique followed by a classification algorithm. Experimental results show that an identification accuracy of 72% can be achieved using only a single 800 ms ERP epoch. In addition, we demonstrate that the user identification accuracy can be significantly improved to more than 96.7% by joint classification of multiple epochs.
Leveraging Long-term Seismic Catalogs for Automated Real-time Event Classification
NASA Astrophysics Data System (ADS)
Linville, L.; Draelos, T.; Pankow, K. L.; Young, C. J.; Alvarez, S.
2017-12-01
We investigate the use of labeled event types available through reviewed seismic catalogs to produce automated event labels on new incoming data from the crustal region spanned by the cataloged events. Using events cataloged by the University of Utah Seismograph Stations between October, 2012 and June, 2017, we calculate the spectrogram for a time window that spans the duration of each event as seen on individual stations, resulting in 110k event spectrograms (50% local earthquakes examples, 50% quarry blasts examples). Using 80% of the randomized example events ( 90k), a classifier is trained to distinguish between local earthquakes and quarry blasts. We explore variations of deep learning classifiers, incorporating elements of convolutional and recurrent neural networks. Using a single-layer Long Short Term Memory recurrent neural network, we achieve 92% accuracy on the classification task on the remaining 20K test examples. Leveraging the decisions from a group of stations that detected the same event by using the median of all classifications in the group increases the model accuracy to 96%. Additional data with equivalent processing from 500 more recently cataloged events (July, 2017), achieves the same accuracy as our test data on both single-station examples and multi-station medians, suggesting that the model can maintain accurate and stable classification rates on real-time automated events local to the University of Utah Seismograph Stations, with potentially minimal levels of re-training through time.
Hotz, Christine S; Templeton, Steven J; Christopher, Mary M
2005-03-01
A rule-based expert system using CLIPS programming language was created to classify body cavity effusions as transudates, modified transudates, exudates, chylous, and hemorrhagic effusions. The diagnostic accuracy of the rule-based system was compared with that produced by 2 machine-learning methods: Rosetta, a rough sets algorithm and RIPPER, a rule-induction method. Results of 508 body cavity fluid analyses (canine, feline, equine) obtained from the University of California-Davis Veterinary Medical Teaching Hospital computerized patient database were used to test CLIPS and to test and train RIPPER and Rosetta. The CLIPS system, using 17 rules, achieved an accuracy of 93.5% compared with pathologist consensus diagnoses. Rosetta accurately classified 91% of effusions by using 5,479 rules. RIPPER achieved the greatest accuracy (95.5%) using only 10 rules. When the original rules of the CLIPS application were replaced with those of RIPPER, the accuracy rates were identical. These results suggest that both rule-based expert systems and machine-learning methods hold promise for the preliminary classification of body fluids in the clinical laboratory.
Rule-driven defect detection in CT images of hardwood logs
Erol Sarigul; A. Lynn Abbott; Daniel L. Schmoldt
2000-01-01
This paper deals with automated detection and identification of internal defects in hardwood logs using computed tomography (CT) images. We have developed a system that employs artificial neural networks to perform tentative classification of logs on a pixel-by-pixel basis. This approach achieves a high level of classification accuracy for several hardwood species (...
Bashir, Saba; Qamar, Usman; Khan, Farhan Hassan
2016-02-01
Accuracy plays a vital role in the medical field as it concerns with the life of an individual. Extensive research has been conducted on disease classification and prediction using machine learning techniques. However, there is no agreement on which classifier produces the best results. A specific classifier may be better than others for a specific dataset, but another classifier could perform better for some other dataset. Ensemble of classifiers has been proved to be an effective way to improve classification accuracy. In this research we present an ensemble framework with multi-layer classification using enhanced bagging and optimized weighting. The proposed model called "HM-BagMoov" overcomes the limitations of conventional performance bottlenecks by utilizing an ensemble of seven heterogeneous classifiers. The framework is evaluated on five different heart disease datasets, four breast cancer datasets, two diabetes datasets, two liver disease datasets and one hepatitis dataset obtained from public repositories. The analysis of the results show that ensemble framework achieved the highest accuracy, sensitivity and F-Measure when compared with individual classifiers for all the diseases. In addition to this, the ensemble framework also achieved the highest accuracy when compared with the state of the art techniques. An application named "IntelliHealth" is also developed based on proposed model that may be used by hospitals/doctors for diagnostic advice. Copyright © 2015 Elsevier Inc. All rights reserved.
Subject-Adaptive Real-Time Sleep Stage Classification Based on Conditional Random Field
Luo, Gang; Min, Wanli
2007-01-01
Sleep staging is the pattern recognition task of classifying sleep recordings into sleep stages. This task is one of the most important steps in sleep analysis. It is crucial for the diagnosis and treatment of various sleep disorders, and also relates closely to brain-machine interfaces. We report an automatic, online sleep stager using electroencephalogram (EEG) signal based on a recently-developed statistical pattern recognition method, conditional random field, and novel potential functions that have explicit physical meanings. Using sleep recordings from human subjects, we show that the average classification accuracy of our sleep stager almost approaches the theoretical limit and is about 8% higher than that of existing systems. Moreover, for a new subject snew with limited training data Dnew, we perform subject adaptation to improve classification accuracy. Our idea is to use the knowledge learned from old subjects to obtain from Dnew a regulated estimate of CRF’s parameters. Using sleep recordings from human subjects, we show that even without any Dnew, our sleep stager can achieve an average classification accuracy of 70% on snew. This accuracy increases with the size of Dnew and eventually becomes close to the theoretical limit. PMID:18693884
Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong
2018-01-01
The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.
NASA Astrophysics Data System (ADS)
Roychowdhury, K.
2016-06-01
Landcover is the easiest detectable indicator of human interventions on land. Urban and peri-urban areas present a complex combination of landcover, which makes classification challenging. This paper assesses the different methods of classifying landcover using dual polarimetric Sentinel-1 data collected during monsoon (July) and winter (December) months of 2015. Four broad landcover classes such as built up areas, water bodies and wetlands, vegetation and open spaces of Kolkata and its surrounding regions were identified. Polarimetric analyses were conducted on Single Look Complex (SLC) data of the region while ground range detected (GRD) data were used for spectral and spatial classification. Unsupervised classification by means of K-Means clustering used backscatter values and was able to identify homogenous landcovers over the study area. The results produced an overall accuracy of less than 50% for both the seasons. Higher classification accuracy (around 70%) was achieved by adding texture variables as inputs along with the backscatter values. However, the accuracy of classification increased significantly with polarimetric analyses. The overall accuracy was around 80% in Wishart H-A-Alpha unsupervised classification. The method was useful in identifying urban areas due to their double-bounce scattering and vegetated areas, which have more random scattering. Normalized Difference Built-up index (NDBI) and Normalized Difference Vegetation Index (NDVI) obtained from Landsat 8 data over the study area were used to verify vegetation and urban classes. The study compares the accuracies of different methods of classifying landcover using medium resolution SAR data in a complex urban area and suggests that polarimetric analyses present the most accurate results for urban and suburban areas.
Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G
2014-09-30
This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
NASA Astrophysics Data System (ADS)
Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.
2018-04-01
The purpose of carrying out national geographic conditions monitoring is to obtain information of surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprise and public. Land cover data contains detailed geographic conditions information, thus has been listed as one of the important achievements in the national geographic conditions monitoring project. At present, the main issue of the production of the land cover data is about how to improve the classification accuracy. For the land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection is mainly based on human-computer interaction or manual inspection in the project, which are time consuming and laborious. By harnessing the automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out the classification accuracy inspection test of land cover data in the project, and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update error, and effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of the land cover data.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network’s initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data. PMID:27304987
Multiclass cancer diagnosis using tumor gene expression signatures
Ramaswamy, S.; Tamayo, P.; Rifkin, R.; ...
2001-12-11
The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances, this task is difficult or impossible because of atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multiclass classifier based on a supportmore » vector machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared with their well differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multiclass molecular cancer classification and suggest a strategy for future clinical implementation of molecular cancer diagnostics.« less
An incremental knowledge assimilation system (IKAS) for mine detection
NASA Astrophysics Data System (ADS)
Porway, Jake; Raju, Chaitanya; Varadarajan, Karthik Mahesh; Nguyen, Hieu; Yadegar, Joseph
2010-04-01
In this paper we present an adaptive incremental learning system for underwater mine detection and classification that utilizes statistical models of seabed texture and an adaptive nearest-neighbor classifier to identify varied underwater targets in many different environments. The first stage of processing uses our Background Adaptive ANomaly detector (BAAN), which identifies statistically likely target regions using Gabor filter responses over the image. Using this information, BAAN classifies the background type and updates its detection using background-specific parameters. To perform classification, a Fully Adaptive Nearest Neighbor (FAAN) determines the best label for each detection. FAAN uses an extremely fast version of Nearest Neighbor to find the most likely label for the target. The classifier perpetually assimilates new and relevant information into its existing knowledge database in an incremental fashion, allowing improved classification accuracy and capturing concept drift in the target classes. Experiments show that the system achieves >90% classification accuracy on underwater mine detection tasks performed on synthesized datasets provided by the Office of Naval Research. We have also demonstrated that the system can incrementally improve its detection accuracy by constantly learning from new samples.
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E.; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J.
2016-01-01
Purpose Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. Methods We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. Results In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%–74%) and 87% (95% CI, 86%–88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. Conclusions This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. Translational Relevance The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis. PMID:27668130
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images.
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J
2016-09-01
Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%-74%) and 87% (95% CI, 86%-88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91-0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis.
Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R
2007-01-01
Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. PMID:17605789
NASA Astrophysics Data System (ADS)
Liu, Wanjun; Liang, Xuejian; Qu, Haicheng
2017-11-01
Hyperspectral image (HSI) classification is one of the most popular topics in remote sensing community. Traditional and deep learning-based classification methods were proposed constantly in recent years. In order to improve the classification accuracy and robustness, a dimensionality-varied convolutional neural network (DVCNN) was proposed in this paper. DVCNN was a novel deep architecture based on convolutional neural network (CNN). The input of DVCNN was a set of 3D patches selected from HSI which contained spectral-spatial joint information. In the following feature extraction process, each patch was transformed into some different 1D vectors by 3D convolution kernels, which were able to extract features from spectral-spatial data. The rest of DVCNN was about the same as general CNN and processed 2D matrix which was constituted by by all 1D data. So that the DVCNN could not only extract more accurate and rich features than CNN, but also fused spectral-spatial information to improve classification accuracy. Moreover, the robustness of network on water-absorption bands was enhanced in the process of spectral-spatial fusion by 3D convolution, and the calculation was simplified by dimensionality varied convolution. Experiments were performed on both Indian Pines and Pavia University scene datasets, and the results showed that the classification accuracy of DVCNN improved by 32.87% on Indian Pines and 19.63% on Pavia University scene than spectral-only CNN. The maximum accuracy improvement of DVCNN achievement was 13.72% compared with other state-of-the-art HSI classification methods, and the robustness of DVCNN on water-absorption bands noise was demonstrated.
Kim, Eun Young; Lee, Min Young; Kim, Se Hyun; Ha, Kyooseob; Kim, Kwang Pyo; Ahn, Yong Min
2017-06-02
Major depressive disorder (MDD) is a systemic and multifactorial disorder that involves abnormalities in multiple biochemical pathways and the autonomic nervous system. This study applied a machine-learning method to classify MDD and control groups by incorporating data from serum proteomic analysis and heart rate variability (HRV) analysis for the identification of novel peripheral biomarkers. The study subjects consisted of 25 drug-free female MDD patients and 25 age- and sex-matched healthy controls. First, quantitative serum proteome profiles were analyzed by liquid chromatography-tandem mass spectrometry using pooled serum samples from 10 patients and 10 controls. Next, candidate proteins were quantified with multiple reaction monitoring (MRM) in 50 subjects. We also analyzed 22 linear and nonlinear HRV parameters in 50 subjects. Finally, we identified a combined biomarker panel consisting of proteins and HRV indexes using a support vector machine with recursive feature elimination. A separation between MDD and control groups was achieved using five parameters (apolipoprotein B, group-specific component, ceruloplasmin, RMSSD, and SampEn) at 80.1% classification accuracy. A combination of HRV and proteomic data achieved better classification accuracy. A high classification accuracy can be achieved by combining multimodal information from heart rate dynamics and serum proteomics in MDD. Our approach can be helpful for accurate clinical diagnosis of MDD. Further studies using larger, independent cohorts are needed to verify the role of these candidate biomarkers for MDD diagnosis. Copyright © 2017 Elsevier Inc. All rights reserved.
Taamneh, Madhar; Taamneh, Salah; Alkheder, Sharaf
2017-09-01
Artificial neural networks (ANNs) have been widely used in predicting the severity of road traffic crashes. All available information about previously occurred accidents is typically used for building a single prediction model (i.e., classifier). Too little attention has been paid to the differences between these accidents, leading, in most cases, to build less accurate predictors. Hierarchical clustering is a well-known clustering method that seeks to group data by creating a hierarchy of clusters. Using hierarchical clustering and ANNs, a clustering-based classification approach for predicting the injury severity of road traffic accidents was proposed. About 6000 road accidents occurred over a six-year period from 2008 to 2013 in Abu Dhabi were used throughout this study. In order to reduce the amount of variation in data, hierarchical clustering was applied on the data set to organize it into six different forms, each with different number of clusters (i.e., clusters from 1 to 6). Two ANN models were subsequently built for each cluster of accidents in each generated form. The first model was built and validated using all accidents (training set), whereas only 66% of the accidents were used to build the second model, and the remaining 34% were used to test it (percentage split). Finally, the weighted average accuracy was computed for each type of models in each from of data. The results show that when testing the models using the training set, clustering prior to classification achieves (11%-16%) more accuracy than without using clustering, while the percentage split achieves (2%-5%) more accuracy. The results also suggest that partitioning the accidents into six clusters achieves the best accuracy if both types of models are taken into account.
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.
Martin, Bryan D.; Wolfson, Julian; Adomavicius, Gediminas; Fan, Yingling
2017-01-01
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) while being computationally simple enough to run on a typical smartphone. Further, we use data that required no behavioral changes from the smartphone users to collect. Our best classification model uses the random forest algorithm to achieve 96.8% accuracy. PMID:28885550
Martin, Bryan D; Addona, Vittorio; Wolfson, Julian; Adomavicius, Gediminas; Fan, Yingling
2017-09-08
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) while being computationally simple enough to run on a typical smartphone. Further, we use data that required no behavioral changes from the smartphone users to collect. Our best classification model uses the random forest algorithm to achieve 96.8% accuracy.
Deep Convolutional Neural Networks for Classifying Body Constitution Based on Face Image.
Huan, Er-Yang; Wen, Gui-Hua; Zhang, Shi-Jun; Li, Dan-Yang; Hu, Yang; Chang, Tian-Yuan; Wang, Qing; Huang, Bing-Lin
2017-01-01
Body constitution classification is the basis and core content of traditional Chinese medicine constitution research. It is to extract the relevant laws from the complex constitution phenomenon and finally build the constitution classification system. Traditional identification methods have the disadvantages of inefficiency and low accuracy, for instance, questionnaires. This paper proposed a body constitution recognition algorithm based on deep convolutional neural network, which can classify individual constitution types according to face images. The proposed model first uses the convolutional neural network to extract the features of face image and then combines the extracted features with the color features. Finally, the fusion features are input to the Softmax classifier to get the classification result. Different comparison experiments show that the algorithm proposed in this paper can achieve the accuracy of 65.29% about the constitution classification. And its performance was accepted by Chinese medicine practitioners.
Zhang, Chu; Liu, Fei; He, Yong
2018-02-01
Hyperspectral imaging was used to identify and to visualize the coffee bean varieties. Spectral preprocessing of pixel-wise spectra was conducted by different methods, including moving average smoothing (MA), wavelet transform (WT) and empirical mode decomposition (EMD). Meanwhile, spatial preprocessing of the gray-scale image at each wavelength was conducted by median filter (MF). Support vector machine (SVM) models using full sample average spectra and pixel-wise spectra, and the selected optimal wavelengths by second derivative spectra all achieved classification accuracy over 80%. Primarily, the SVM models using pixel-wise spectra were used to predict the sample average spectra, and these models obtained over 80% of the classification accuracy. Secondly, the SVM models using sample average spectra were used to predict pixel-wise spectra, but achieved with lower than 50% of classification accuracy. The results indicated that WT and EMD were suitable for pixel-wise spectra preprocessing. The use of pixel-wise spectra could extend the calibration set, and resulted in the good prediction results for pixel-wise spectra and sample average spectra. The overall results indicated the effectiveness of using spectral preprocessing and the adoption of pixel-wise spectra. The results provided an alternative way of data processing for applications of hyperspectral imaging in food industry.
ERIC Educational Resources Information Center
Marston, Doug; And Others
Two studies were conducted to examine the efficacy of direct measurement, standardized achievement tests, and aptitude-achievement discrepancy scores in distinguishing learning disabled (LD) and nonlearning disabled (NLD) students in grades 3 to 6. For both reading (Study I) and written expression (Study II), students' scores on direct and…
An Extreme Learning Machine-Based Neuromorphic Tactile Sensing System for Texture Recognition.
Rasouli, Mahdi; Chen, Yi; Basu, Arindam; Kukreja, Sunil L; Thakor, Nitish V
2018-04-01
Despite significant advances in computational algorithms and development of tactile sensors, artificial tactile sensing is strikingly less efficient and capable than the human tactile perception. Inspired by efficiency of biological systems, we aim to develop a neuromorphic system for tactile pattern recognition. We particularly target texture recognition as it is one of the most necessary and challenging tasks for artificial sensory systems. Our system consists of a piezoresistive fabric material as the sensor to emulate skin, an interface that produces spike patterns to mimic neural signals from mechanoreceptors, and an extreme learning machine (ELM) chip to analyze spiking activity. Benefiting from intrinsic advantages of biologically inspired event-driven systems and massively parallel and energy-efficient processing capabilities of the ELM chip, the proposed architecture offers a fast and energy-efficient alternative for processing tactile information. Moreover, it provides the opportunity for the development of low-cost tactile modules for large-area applications by integration of sensors and processing circuits. We demonstrate the recognition capability of our system in a texture discrimination task, where it achieves a classification accuracy of 92% for categorization of ten graded textures. Our results confirm that there exists a tradeoff between response time and classification accuracy (and information transfer rate). A faster decision can be achieved at early time steps or by using a shorter time window. This, however, results in deterioration of the classification accuracy and information transfer rate. We further observe that there exists a tradeoff between the classification accuracy and the input spike rate (and thus energy consumption). Our work substantiates the importance of development of efficient sparse codes for encoding sensory data to improve the energy efficiency. These results have a significance for a wide range of wearable, robotic, prosthetic, and industrial applications.
EEG Sleep Stages Classification Based on Time Domain Features and Structural Graph Similarity.
Diykh, Mohammed; Li, Yan; Wen, Peng
2016-11-01
The electroencephalogram (EEG) signals are commonly used in diagnosing and treating sleep disorders. Many existing methods for sleep stages classification mainly depend on the analysis of EEG signals in time or frequency domain to obtain a high classification accuracy. In this paper, the statistical features in time domain, the structural graph similarity and the K-means (SGSKM) are combined to identify six sleep stages using single channel EEG signals. Firstly, each EEG segment is partitioned into sub-segments. The size of a sub-segment is determined empirically. Secondly, statistical features are extracted, sorted into different sets of features and forwarded to the SGSKM to classify EEG sleep stages. We have also investigated the relationships between sleep stages and the time domain features of the EEG data used in this paper. The experimental results show that the proposed method yields better classification results than other four existing methods and the support vector machine (SVM) classifier. A 95.93% average classification accuracy is achieved by using the proposed method.
Tissue classification using depth-dependent ultrasound time series analysis: in-vitro animal study
NASA Astrophysics Data System (ADS)
Imani, Farhad; Daoud, Mohammad; Moradi, Mehdi; Abolmaesumi, Purang; Mousavi, Parvin
2011-03-01
Time series analysis of ultrasound radio-frequency (RF) signals has been shown to be an effective tissue classification method. Previous studies of this method for tissue differentiation at high and clinical-frequencies have been reported. In this paper, analysis of RF time series is extended to improve tissue classification at the clinical frequencies by including novel features extracted from the time series spectrum. The primary feature examined is the Mean Central Frequency (MCF) computed for regions of interest (ROIs) in the tissue extending along the axial axis of the transducer. In addition, the intercept and slope of a line fitted to the MCF-values of the RF time series as a function of depth have been included. To evaluate the accuracy of the new features, an in vitro animal study is performed using three tissue types: bovine muscle, bovine liver, and chicken breast, where perfect two-way classification is achieved. The results show statistically significant improvements over the classification accuracies with previously reported features.
NASA Astrophysics Data System (ADS)
Tonbul, H.; Kavzoglu, T.
2016-12-01
In recent years, object based image analysis (OBIA) has spread out and become a widely accepted technique for the analysis of remotely sensed data. OBIA deals with grouping pixels into homogenous objects based on spectral, spatial and textural features of contiguous pixels in an image. The first stage of OBIA, named as image segmentation, is the most prominent part of object recognition. In this study, multiresolution segmentation, which is a region-based approach, was employed to construct image objects. In the application of multi-resolution, three parameters, namely shape, compactness and scale must be set by the analyst. Segmentation quality remarkably influences the fidelity of the thematic maps and accordingly the classification accuracy. Therefore, it is of great importance to search and set optimal values for the segmentation parameters. In the literature, main focus has been on the definition of scale parameter, assuming that the effect of shape and compactness parameters is limited in terms of achieved classification accuracy. The aim of this study is to deeply analyze the influence of shape/compactness parameters by varying their values while using the optimal scale parameter determined by the use of Estimation of Scale Parameter (ESP-2) approach. A pansharpened Qickbird-2 image covering Trabzon, Turkey was employed to investigate the objectives of the study. For this purpose, six different combinations of shape/compactness were utilized to make deductions on the behavior of shape and compactness parameters and optimal setting for all parameters as a whole. Objects were assigned to classes using nearest neighbor classifier in all segmentation observations and equal number of pixels was randomly selected to calculate accuracy metrics. The highest overall accuracy (92.3%) was achieved by setting the shape/compactness criteria to 0.3/0.3. The results of this study indicate that shape/compactness parameters can have significant effect on classification accuracy with 4% change in overall accuracy. Also, statistical significance of differences in accuracy was tested using the McNemar's test and found that the difference between poor and optimal setting of shape/compactness parameters was statistically significant, suggesting a search for optimal parameterization instead of default setting.
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. Conclusions/Significance The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice.
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combination of both solutions to improve the prediction accuracy was never explored before. We developed two algorithms, HH-fold and SVM-fold for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features are extracted from three complementary sequence profiles. These two algorithms are then combined, resulting to the ensemble approach TA-fold. We performed a comprehensive assessment for the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset that consists of proteins from 27 folds. This represents improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information. http://yanglab.nankai.edu.cn/TA-fold/. yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2010-01-01
Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…
NASA Astrophysics Data System (ADS)
Cavigelli, Lukas; Bernath, Dominic; Magno, Michele; Benini, Luca
2016-10-01
Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and this setup excludes preemptive actions to respond to imminent threats. An effective way to overcome these limitations is to build a smart camera that analyzes the data on-site, close to the sensor, and transmits alerts when relevant video sequences are detected. Deep neural networks (DNNs) have come to outperform humans in visual classifications tasks and are also performing exceptionally well on other computer vision tasks. The concept of DNNs and Convolutional Networks (ConvNets) can easily be extended to make use of higher-dimensional input data such as multispectral data. We explore this opportunity in terms of achievable accuracy and required computational effort. To analyze the precision of DNNs for scene labeling in an urban surveillance scenario we have created a dataset with 8 classes obtained in a field experiment. We combine an RGB camera with a 25-channel VIS-NIR snapshot sensor to assess the potential of multispectral image data for target classification. We evaluate several new DNNs, showing that the spectral information fused together with the RGB frames can be used to improve the accuracy of the system or to achieve similar accuracy with a 3x smaller computation effort. We achieve a very high per-pixel accuracy of 99.1%. Even for scarcely occurring, but particularly interesting classes, such as cars, 75% of the pixels are labeled correctly with errors occurring only around the border of the objects. This high accuracy was obtained with a training set of only 30 labeled images, paving the way for fast adaptation to various application scenarios.
NASA Astrophysics Data System (ADS)
Salvaris, Mathew; Sepulveda, Francisco
2010-10-01
Brain-computer interfaces (BCIs) rely on various electroencephalography methodologies that allow the user to convey their desired control to the machine. Common approaches include the use of event-related potentials (ERPs) such as the P300 and modulation of the beta and mu rhythms. All of these methods have their benefits and drawbacks. In this paper, three different selective attention tasks were tested in conjunction with a P300-based protocol (i.e. the standard counting of target stimuli as well as the conduction of real and imaginary movements in sync with the target stimuli). The three tasks were performed by a total of 10 participants, with the majority (7 out of 10) of the participants having never before participated in imaginary movement BCI experiments. Channels and methods used were optimized for the P300 ERP and no sensory-motor rhythms were explicitly used. The classifier used was a simple Fisher's linear discriminant. Results were encouraging, showing that on average the imaginary movement achieved a P300 versus No-P300 classification accuracy of 84.53%. In comparison, mental counting, the standard selective attention task used in previous studies, achieved 78.9% and real movement 90.3%. Furthermore, multiple trial classification results were recorded and compared, with real movement reaching 99.5% accuracy after four trials (12.8 s), imaginary movement reaching 99.5% accuracy after five trials (16 s) and counting reaching 98.2% accuracy after ten trials (32 s).
Salvaris, Mathew; Sepulveda, Francisco
2010-10-01
Brain-computer interfaces (BCIs) rely on various electroencephalography methodologies that allow the user to convey their desired control to the machine. Common approaches include the use of event-related potentials (ERPs) such as the P300 and modulation of the beta and mu rhythms. All of these methods have their benefits and drawbacks. In this paper, three different selective attention tasks were tested in conjunction with a P300-based protocol (i.e. the standard counting of target stimuli as well as the conduction of real and imaginary movements in sync with the target stimuli). The three tasks were performed by a total of 10 participants, with the majority (7 out of 10) of the participants having never before participated in imaginary movement BCI experiments. Channels and methods used were optimized for the P300 ERP and no sensory-motor rhythms were explicitly used. The classifier used was a simple Fisher's linear discriminant. Results were encouraging, showing that on average the imaginary movement achieved a P300 versus No-P300 classification accuracy of 84.53%. In comparison, mental counting, the standard selective attention task used in previous studies, achieved 78.9% and real movement 90.3%. Furthermore, multiple trial classification results were recorded and compared, with real movement reaching 99.5% accuracy after four trials (12.8 s), imaginary movement reaching 99.5% accuracy after five trials (16 s) and counting reaching 98.2% accuracy after ten trials (32 s).
Postcraniometric sex and ancestry estimation in South Africa: a validation study.
Liebenberg, Leandi; Krüger, Gabriele C; L'Abbé, Ericka N; Stull, Kyra E
2018-05-24
With the acceptance of the Daubert criteria as the standards for best practice in forensic anthropological research, more emphasis is being placed on the validation of published methods. Methods, both traditional and novel, need to be validated, adjusted, and refined for optimal performance within forensic anthropological analyses. Recently, a custom postcranial database of modern South Africans was created for use in Fordisc 3.1. Classification accuracies of up to 85% for ancestry estimation and 98% for sex estimation were achieved using a multivariate approach. To measure the external validity and report more realistic performance statistics, an independent sample was tested. The postcrania from 180 black, white, and colored South Africans were measured and classified using the custom postcranial database. A decrease in accuracy was observed for both ancestry estimation (79%) and sex estimation (95%) of the validation sample. When incorporating both sex and ancestry simultaneously, the method achieved 70% accuracy, and 79% accuracy when sex-specific ancestry analyses were run. Classification matrices revealed that postcrania were more likely to misclassify as a result of ancestry rather than sex. While both sex and ancestry influence the size of an individual, sex differences are more marked in the postcranial skeleton and are therefore easier to identify. The external validity of the postcranial database was verified and therefore shown to be a useful tool for forensic casework in South Africa. While the classification rates were slightly lower than the original method, this is expected when a method is generalized.
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improve physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one subject out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
Tartar, A; Akan, A; Kilic, N
2014-01-01
Computer-aided detection systems can help radiologists to detect pulmonary nodules at an early stage. In this paper, a novel Computer-Aided Diagnosis system (CAD) is proposed for the classification of pulmonary nodules as malignant and benign. The proposed CAD system using ensemble learning classifiers, provides an important support to radiologists at the diagnosis process of the disease, achieves high classification performance. The proposed approach with bagging classifier results in 94.7 %, 90.0 % and 77.8 % classification sensitivities for benign, malignant and undetermined classes (89.5 % accuracy), respectively.
Classification of skin cancer images using local binary pattern and SVM classifier
NASA Astrophysics Data System (ADS)
Adjed, Faouzi; Faye, Ibrahima; Ababsa, Fakhreddine; Gardezi, Syed Jamal; Dass, Sarat Chandra
2016-11-01
In this paper, a classification method for melanoma and non-melanoma skin cancer images has been presented using the local binary patterns (LBP). The LBP computes the local texture information from the skin cancer images, which is later used to compute some statistical features that have capability to discriminate the melanoma and non-melanoma skin tissues. Support vector machine (SVM) is applied on the feature matrix for classification into two skin image classes (malignant and benign). The method achieves good classification accuracy of 76.1% with sensitivity of 75.6% and specificity of 76.7%.
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis. PMID:25076868
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F; Joules, Richard; Catani, Marco; Williams, Steve C R; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no "magic bullet" for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis.
Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik
2018-05-01
Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with echo time / repetition time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER + tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance to account for heterogeneity. Our findings suggest that SVM, using features from a combination of diffusion models, improves prediction accuracy for differentiation of benign versus malignant breast tumors, and may further assist in subtyping of breast cancer. 3 Technical Efficacy: Stage 3 J. Magn. Reson. Imaging 2018;47:1205-1216. © 2017 International Society for Magnetic Resonance in Medicine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Masci, Frank J.; Grillmair, Carl J.; Cutri, Roc M.
2014-07-01
We describe a methodology to classify periodic variable stars identified using photometric time-series measurements constructed from the Wide-field Infrared Survey Explorer (WISE) full-mission single-exposure Source Databases. This will assist in the future construction of a WISE Variable Source Database that assigns variables to specific science classes as constrained by the WISE observing cadence with statistically meaningful classification probabilities. We have analyzed the WISE light curves of 8273 variable stars identified in previous optical variability surveys (MACHO, GCVS, and ASAS) and show that Fourier decomposition techniques can be extended into the mid-IR to assist with their classification. Combined with other periodicmore » light-curve features, this sample is then used to train a machine-learned classifier based on the random forest (RF) method. Consistent with previous classification studies of variable stars in general, the RF machine-learned classifier is superior to other methods in terms of accuracy, robustness against outliers, and relative immunity to features that carry little or redundant class information. For the three most common classes identified by WISE: Algols, RR Lyrae, and W Ursae Majoris type variables, we obtain classification efficiencies of 80.7%, 82.7%, and 84.5% respectively using cross-validation analyses, with 95% confidence intervals of approximately ±2%. These accuracies are achieved at purity (or reliability) levels of 88.5%, 96.2%, and 87.8% respectively, similar to that achieved in previous automated classification studies of periodic variable stars.« less
NASA Astrophysics Data System (ADS)
Han, Xiaopeng; Huang, Xin; Li, Jiayi; Li, Yansheng; Yang, Michael Ying; Gong, Jianya
2018-04-01
In recent years, the availability of high-resolution imagery has enabled more detailed observation of the Earth. However, it is imperative to simultaneously achieve accurate interpretation and preserve the spatial details for the classification of such high-resolution data. To this aim, we propose the edge-preservation multi-classifier relearning framework (EMRF). This multi-classifier framework is made up of support vector machine (SVM), random forest (RF), and sparse multinomial logistic regression via variable splitting and augmented Lagrangian (LORSAL) classifiers, considering their complementary characteristics. To better characterize complex scenes of remote sensing images, relearning based on landscape metrics is proposed, which iteratively quantizes both the landscape composition and spatial configuration by the use of the initial classification results. In addition, a novel tri-training strategy is proposed to solve the over-smoothing effect of relearning by means of automatic selection of training samples with low classification certainties, which always distribute in or near the edge areas. Finally, EMRF flexibly combines the strengths of relearning and tri-training via the classification certainties calculated by the probabilistic output of the respective classifiers. It should be noted that, in order to achieve an unbiased evaluation, we assessed the classification accuracy of the proposed framework using both edge and non-edge test samples. The experimental results obtained with four multispectral high-resolution images confirm the efficacy of the proposed framework, in terms of both edge and non-edge accuracy.
Boskamp, Tobias; Lachmund, Delf; Oetjen, Janina; Cordero Hernandez, Yovany; Trede, Dennis; Maass, Peter; Casadonte, Rita; Kriegsmann, Jörg; Warth, Arne; Dienemann, Hendrik; Weichert, Wilko; Kriegsmann, Mark
2017-07-01
Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) shows a high potential for applications in histopathological diagnosis, and in particular for supporting tumor typing and subtyping. The development of such applications requires the extraction of spectral fingerprints that are relevant for the given tissue and the identification of biomarkers associated with these spectral patterns. We propose a novel data analysis method based on the extraction of characteristic spectral patterns (CSPs) that allow automated generation of classification models for spectral data. Formalin-fixed paraffin embedded (FFPE) tissue samples from N=445 patients assembled on 12 tissue microarrays were analyzed. The method was applied to discriminate primary lung and pancreatic cancer, as well as adenocarcinoma and squamous cell carcinoma of the lung. A classification accuracy of 100% and 82.8%, resp., could be achieved on core level, assessed by cross-validation. The method outperformed the more conventional classification method based on the extraction of individual m/z values in the first application, while achieving a comparable accuracy in the second. LC-MS/MS peptide identification demonstrated that the spectral features present in selected CSPs correspond to peptides relevant for the respective classification. This article is part of a Special Issue entitled: MALDI Imaging, edited by Dr. Corinna Henkel and Prof. Peter Hoffmann. Copyright © 2016 Elsevier B.V. All rights reserved.
Estimation of different data compositions for early-season crop type classification.
Hao, Pengyu; Wu, Mingquan; Niu, Zheng; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are an important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two studies regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer's accuracies (PAs) and user's accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study.
Estimation of different data compositions for early-season crop type classification
Wu, Mingquan; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are an important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two studies regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer’s accuracies (PAs) and user’s accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study. PMID:29868265
An efficient scheme for automatic web pages categorization using the support vector machine
NASA Astrophysics Data System (ADS)
Bhalla, Vinod Kumar; Kumar, Neeraj
2016-07-01
In the past few years, with an evolution of the Internet and related technologies, the number of the Internet users grows exponentially. These users demand access to relevant web pages from the Internet within fraction of seconds. To achieve this goal, there is a requirement of an efficient categorization of web page contents. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic. Using these techniques, higher level of accuracy cannot be achieved. To achieve these goals, this paper proposes an automatic web pages categorization into the domain category. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, first extraction and evaluation of features are done followed by filtering the feature set for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on the collection of domain-specific keyword list developed by considering various domain pages. Moreover, the keyword list is reduced on the basis of ids of keywords in keyword list. Also, stemming of keywords and tag text is done to achieve a higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis using support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.
Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio
2016-01-01
Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
2013-11-15
features and designed a classifier that achieves up to 95% classification accuracy on classifying the occupancy with indoor footstep data. MDL-based...information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and...maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other
Genetic algorithm for the optimization of features and neural networks in ECG signals classification
NASA Astrophysics Data System (ADS)
Li, Hongqiang; Yuan, Danyang; Ma, Xiangdong; Cui, Dianyin; Cao, Lu
2017-01-01
Feature extraction and classification of electrocardiogram (ECG) signals are necessary for the automatic diagnosis of cardiac diseases. In this study, a novel method based on genetic algorithm-back propagation neural network (GA-BPNN) for classifying ECG signals with feature extraction using wavelet packet decomposition (WPD) is proposed. WPD combined with the statistical method is utilized to extract the effective features of ECG signals. The statistical features of the wavelet packet coefficients are calculated as the feature sets. GA is employed to decrease the dimensions of the feature sets and to optimize the weights and biases of the back propagation neural network (BPNN). Thereafter, the optimized BPNN classifier is applied to classify six types of ECG signals. In addition, an experimental platform is constructed for ECG signal acquisition to supply the ECG data for verifying the effectiveness of the proposed method. The GA-BPNN method with the MIT-BIH arrhythmia database achieved a dimension reduction of nearly 50% and produced good classification results with an accuracy of 97.78%. The experimental results based on the established acquisition platform indicated that the GA-BPNN method achieved a high classification accuracy of 99.33% and could be efficiently applied in the automatic identification of cardiac arrhythmias.
NASA Astrophysics Data System (ADS)
Teffahi, Hanane; Yao, Hongxun; Belabid, Nasreddine; Chaib, Souleyman
2018-02-01
The satellite images with very high spatial resolution have been recently widely used in image classification topic as it has become challenging task in remote sensing field. Due to a number of limitations such as the redundancy of features and the high dimensionality of the data, different classification methods have been proposed for remote sensing images classification particularly the methods using feature extraction techniques. This paper propose a simple efficient method exploiting the capability of extended multi-attribute profiles (EMAP) with sparse autoencoder (SAE) for remote sensing image classification. The proposed method is used to classify various remote sensing datasets including hyperspectral and multispectral images by extracting spatial and spectral features based on the combination of EMAP and SAE by linking them to kernel support vector machine (SVM) for classification. Experiments on new hyperspectral image "Huston data" and multispectral image "Washington DC data" shows that this new scheme can achieve better performance of feature learning than the primitive features, traditional classifiers and ordinary autoencoder and has huge potential to achieve higher accuracy for classification in short running time.
D Land Cover Classification Based on Multispectral LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
Multispectral Lidar System can emit simultaneous laser pulses at the different wavelengths. The reflected multispectral energy is captured through a receiver of the sensor, and the return signal together with the position and orientation information of sensor is recorded. These recorded data are solved with GNSS/IMU data for further post-processing, forming high density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, Optech Titan system is capable of collecting point clouds data from all three channels at 532nm visible (Green), at 1064 nm near infrared (NIR) and at 1550nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach to only use multispectral Lidar point clouds datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories by using the customized features of classification indexes and a combination the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted via comparing random reference samples points from google imagery tiles with the classification results. The classification results show higher overall accuracy for most of the land cover types. Over 90% of overall accuracy is achieved via using multispectral Lidar point clouds for 3D land cover classification.
Mumtaz, Wajid; Ali, Syed Saad Azhar; Yasin, Mohd Azhar Mohd; Malik, Aamir Saeed
2018-02-01
Major depressive disorder (MDD), a debilitating mental illness, could cause functional disabilities and could become a social problem. An accurate and early diagnosis for depression could become challenging. This paper proposed a machine learning framework involving EEG-derived synchronization likelihood (SL) features as input data for automatic diagnosis of MDD. It was hypothesized that EEG-based SL features could discriminate MDD patients and healthy controls with an acceptable accuracy better than measures such as interhemispheric coherence and mutual information. In this work, classification models such as support vector machine (SVM), logistic regression (LR) and Naïve Bayesian (NB) were employed to model relationship between the EEG features and the study groups (MDD patient and healthy controls) and ultimately achieved discrimination of study participants. The results indicated that the classification rates were better than chance. More specifically, the study resulted into SVM classification accuracy = 98%, sensitivity = 99.9%, specificity = 95% and f-measure = 0.97; LR classification accuracy = 91.7%, sensitivity = 86.66%, specificity = 96.6% and f-measure = 0.90; NB classification accuracy = 93.6%, sensitivity = 100%, specificity = 87.9% and f-measure = 0.95. In conclusion, SL could be a promising method for diagnosing depression. The findings could be generalized to develop a robust CAD-based tool that may help for clinical purposes.
Ruiz Hidalgo, Irene; Rodriguez, Pablo; Rozema, Jos J; Ní Dhubhghaill, Sorcha; Zakaria, Nadia; Tassignon, Marie-José; Koppen, Carina
2016-06-01
To evaluate the performance of a support vector machine algorithm that automatically and objectively identifies corneal patterns based on a combination of 22 parameters obtained from Pentacam measurements and to compare this method with other known keratoconus (KC) classification methods. Pentacam data from 860 eyes were included in the study and divided into 5 groups: 454 KC, 67 forme fruste (FF), 28 astigmatic, 117 after refractive surgery (PR), and 194 normal eyes (N). Twenty-two parameters were used for classification using a support vector machine algorithm developed in Weka, a machine-learning computer software. The cross-validation accuracy for 3 different classification tasks (KC vs. N, FF vs. N and all 5 groups) was calculated and compared with other known classification methods. The accuracy achieved in the KC versus N discrimination task was 98.9%, with 99.1% sensitivity and 98.5% specificity for KC detection. The accuracy in the FF versus N task was 93.1%, with 79.1% sensitivity and 97.9% specificity for the FF discrimination. Finally, for the 5-groups classification, the accuracy was 88.8%, with a weighted average sensitivity of 89.0% and specificity of 95.2%. Despite using the strictest definition for FF KC, the present study obtained comparable or better results than the single-parameter methods and indices reported in the literature. In some cases, direct comparisons with the literature were not possible because of differences in the compositions and definitions of the study groups, especially the FF KC.
NASA Astrophysics Data System (ADS)
Boschetto, Davide; Di Claudio, Gianluca; Mirzaei, Hadis; Leong, Rupert; Grisan, Enrico
2016-03-01
Celiac disease (CD) is an immune-mediated enteropathy triggered by exposure to gluten and similar proteins, affecting genetically susceptible persons, increasing their risk of different complications. Small bowels mucosa damage due to CD involves various degrees of endoscopically relevant lesions, which are not easily recognized: their overall sensitivity and positive predictive values are poor even when zoom-endoscopy is used. Confocal Laser Endomicroscopy (CLE) allows skilled and trained experts to qualitative evaluate mucosa alteration such as a decrease in goblet cells density, presence of villous atrophy or crypt hypertrophy. We present a method for automatically classifying CLE images into three different classes: normal regions, villous atrophy and crypt hypertrophy. This classification is performed after a features selection process, in which four features are extracted from each image, through the application of homomorphic filtering and border identification through Canny and Sobel operators. Three different classifiers have been tested on a dataset of 67 different images labeled by experts in three classes (normal, VA and CH): linear approach, Naive-Bayes quadratic approach and a standard quadratic analysis, all validated with a ten-fold cross validation. Linear classification achieves 82.09% accuracy (class accuracies: 90.32% for normal villi, 82.35% for VA and 68.42% for CH, sensitivity: 0.68, specificity 1.00), Naive Bayes analysis returns 83.58% accuracy (90.32% for normal villi, 70.59% for VA and 84.21% for CH, sensitivity: 0.84 specificity: 0.92), while the quadratic analysis achieves a final accuracy of 94.03% (96.77% accuracy for normal villi, 94.12% for VA and 89.47% for CH, sensitivity: 0.89, specificity: 0.98).
Semi-supervised morphosyntactic classification of Old Icelandic.
Urban, Kryztof; Tangherlini, Timothy R; Vijūnas, Aurelijus; Broadwell, Peter M
2014-01-01
We present IceMorph, a semi-supervised morphosyntactic analyzer of Old Icelandic. In addition to machine-read corpora and dictionaries, it applies a small set of declension prototypes to map corpus words to dictionary entries. A web-based GUI allows expert users to modify and augment data through an online process. A machine learning module incorporates prototype data, edit-distance metrics, and expert feedback to continuously update part-of-speech and morphosyntactic classification. An advantage of the analyzer is its ability to achieve competitive classification accuracy with minimum training data.
Coarse-to-fine deep neural network for fast pedestrian detection
NASA Astrophysics Data System (ADS)
Li, Yaobin; Yang, Xinmei; Cao, Lijun
2017-11-01
Pedestrian detection belongs to a category of object detection is a key issue in the field of video surveillance and automatic driving. Although recent object detection methods, such as Fast/Faster RCNN, have achieved excellent performance, it is difficult to meet real-time requirements and limits the application in real scenarios. A coarse-to-fine deep neural network for fast pedestrian detection is proposed in this paper. Two-stage approach is presented to realize fine trade-off between accuracy and speed. In the coarse stage, we train a fast deep convolution neural network to generate most pedestrian candidates at the cost of a number of false positives. The detector can cover the majority of scales, sizes, and occlusions of pedestrians. After that, a classification network is introduced to refine the pedestrian candidates generated from the previous stage. Refining through classification network, most of false detections will be excluded easily and the final pedestrian predictions with bounding box and confidence score are produced. Competitive results have been achieved on INRIA dataset in terms of accuracy, especially the method can achieve real-time detection that is faster than the previous leading methods. The effectiveness of coarse-to-fine approach to detect pedestrians is verified, and the accuracy and stability are also improved.
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K -means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K -means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Ensemble Sparse Classification of Alzheimer’s Disease
Liu, Manhua; Zhang, Daoqiang; Shen, Dinggang
2012-01-01
The high-dimensional pattern classification methods, e.g., support vector machines (SVM), have been widely investigated for analysis of structural and functional brain images (such as magnetic resonance imaging (MRI)) to assist the diagnosis of Alzheimer’s disease (AD) including its prodromal stage, i.e., mild cognitive impairment (MCI). Most existing classification methods extract features from neuroimaging data and then construct a single classifier to perform classification. However, due to noise and small sample size of neuroimaging data, it is challenging to train only a global classifier that can be robust enough to achieve good classification performance. In this paper, instead of building a single global classifier, we propose a local patch-based subspace ensemble method which builds multiple individual classifiers based on different subsets of local patches and then combines them for more accurate and robust classification. Specifically, to capture the local spatial consistency, each brain image is partitioned into a number of local patches and a subset of patches is randomly selected from the patch pool to build a weak classifier. Here, the sparse representation-based classification (SRC) method, which has shown effective for classification of image data (e.g., face), is used to construct each weak classifier. Then, multiple weak classifiers are combined to make the final decision. We evaluate our method on 652 subjects (including 198 AD patients, 225 MCI and 229 normal controls) from Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using MR images. The experimental results show that our method achieves an accuracy of 90.8% and an area under the ROC curve (AUC) of 94.86% for AD classification and an accuracy of 87.85% and an AUC of 92.90% for MCI classification, respectively, demonstrating a very promising performance of our method compared with the state-of-the-art methods for AD/MCI classification using MR images. PMID:22270352
Bhaduri, Aritra; Banerjee, Amitava; Roy, Subhrajit; Kar, Sougata; Basu, Arindam
2018-03-01
We present a neuromorphic current mode implementation of a spiking neural classifier with lumped square law dendritic nonlinearity. It has been shown previously in software simulations that such a system with binary synapses can be trained with structural plasticity algorithms to achieve comparable classification accuracy with fewer synaptic resources than conventional algorithms. We show that even in real analog systems with manufacturing imperfections (CV of 23.5% and 14.4% for dendritic branch gains and leaks respectively), this network is able to produce comparable results with fewer synaptic resources. The chip fabricated in [Formula: see text]m complementary metal oxide semiconductor has eight dendrites per cell and uses two opposing cells per class to cancel common-mode inputs. The chip can operate down to a [Formula: see text] V and dissipates 19 nW of static power per neuronal cell and [Formula: see text] 125 pJ/spike. For two-class classification problems of high-dimensional rate encoded binary patterns, the hardware achieves comparable performance as software implementation of the same with only about a 0.5% reduction in accuracy. On two UCI data sets, the IC integrated circuit has classification accuracy comparable to standard machine learners like support vector machines and extreme learning machines while using two to five times binary synapses. We also show that the system can operate on mean rate encoded spike patterns, as well as short bursts of spikes. To the best of our knowledge, this is the first attempt in hardware to perform classification exploiting dendritic properties and binary synapses.
Computerized detection of leukocytes in microscopic leukorrhea images.
Zhang, Jing; Zhong, Ya; Wang, Xiangzhou; Ni, Guangming; Du, Xiaohui; Liu, Juanxiu; Liu, Lin; Liu, Yong
2017-09-01
Detection of leukocytes is critical for the routine leukorrhea exam, which is widely used in gynecological examinations. An elevated vaginal leukocyte count in women with bacterial vaginosis is a strong predictor of vaginal or cervical infections. In the routine leukorrhea exam, the counting of leukocytes is primarily performed by manual techniques. However, the viewing and counting of leukocytes from multiple high-power viewing fields on a glass slide under a microscope leads to subjectivity, low efficiency, and low accuracy. To date, many biological cells in stool, blood, and breast cancer have been studied to realize computerized detection; however, the detection of leukocytes in microscopic leukorrhea images has not been studied. Thus, there is an increasing need for computerized detection of leukocytes. There are two key processes in the computerized detection of leukocytes in digital image processing. One is segmentation; the other is intelligent classification. In this paper, we propose a combined ensemble to detect leukocytes in the microscopic leukorrhea image. After image segmentation and selecting likely leukocyte subimages, we obtain the leukocyte candidates. Then, for intelligent classification, we adopt two methods: feature extraction and classification by a support vector machine (SVM); applying a modified convolutional neural network (CNN) to the larger subimages. If different methods classify a candidate in the same category, the process is finished. If not, the outputs of the methods are provided to a classifier to further classify the candidate. After acquiring leukocyte candidates, we attempted three methods to perform classification. The first approach using features and SVM achieved 88% sensitivity, 97% specificity, and 92.5% accuracy. The second method using CNN achieved 95% sensitivity, 84% specificity, and 89.5% accuracy. Then, in the combination approach, we achieved 92% sensitivity, 95% specificity, and 93.5% accuracy. Finally, the images with marked and counted leukocytes were obtained. A novel computerized detection system was developed for automated detection of leukocytes in microscopic images. Different methods resulted in comparable overall qualities by enabling computerized detection of leukocytes. The proposed approach further improved the performance. This preliminary study proves the feasibility of computerized detection of leukocytes in clinical use. © 2017 American Association of Physicists in Medicine.
Zhang, Xu; Li, Yun; Chen, Xiang; Li, Guanglin; Rymer, William Zev; Zhou, Ping
2013-01-01
This study investigates the effect of involuntary motor activity of paretic-spastic muscles on classification of surface electromyography (EMG) signals. Two data collection sessions were designed for 8 stroke subjects to voluntarily perform 11 functional movements using their affected forearm and hand at a relatively slow and fast speed. For each stroke subject, the degree of involuntary motor activity present in voluntary surface EMG recordings was qualitatively described from such slow and fast experimental protocols. Myoelectric pattern recognition analysis was performed using different combinations of voluntary surface EMG data recorded from slow and fast sessions. Across all tested stroke subjects, our results revealed that when involuntary surface EMG was absent or present in both training and testing datasets, high accuracies (> 96%, > 98%, respectively, averaged over all the subjects) can be achieved in classification of different movements using surface EMG signals from paretic muscles. When involuntary surface EMG was solely involved in either training or testing datasets, the classification accuracies were dramatically reduced (< 89%, < 85%, respectively). However, if both training and testing datasets contained EMG signals with presence and absence of involuntary EMG interference, high accuracies were still achieved (> 97%). The findings of this study can be used to guide appropriate design and implementation of myoelectric pattern recognition based systems or devices toward promoting robot-aided therapy for stroke rehabilitation. PMID:23860192
Sørensen, Lauge; Nielsen, Mads
2018-05-15
The International Challenge for Automated Prediction of MCI from MRI data offered independent, standardized comparison of machine learning algorithms for multi-class classification of normal control (NC), mild cognitive impairment (MCI), converting MCI (cMCI), and Alzheimer's disease (AD) using brain imaging and general cognition. We proposed to use an ensemble of support vector machines (SVMs) that combined bagging without replacement and feature selection. SVM is the most commonly used algorithm in multivariate classification of dementia, and it was therefore valuable to evaluate the potential benefit of ensembling this type of classifier. The ensemble SVM, using either a linear or a radial basis function (RBF) kernel, achieved multi-class classification accuracies of 55.6% and 55.0% in the challenge test set (60 NC, 60 MCI, 60 cMCI, 60 AD), resulting in a third place in the challenge. Similar feature subset sizes were obtained for both kernels, and the most frequently selected MRI features were the volumes of the two hippocampal subregions left presubiculum and right subiculum. Post-challenge analysis revealed that enforcing a minimum number of selected features and increasing the number of ensemble classifiers improved classification accuracy up to 59.1%. The ensemble SVM outperformed single SVM classifications consistently in the challenge test set. Ensemble methods using bagging and feature selection can improve the performance of the commonly applied SVM classifier in dementia classification. This resulted in competitive classification accuracies in the International Challenge for Automated Prediction of MCI from MRI data. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Samsudin, Sarah Hanim; Shafri, Helmi Z. M.; Hamedianfar, Alireza
2016-04-01
Status observations of roofing material degradation are constantly evolving due to urban feature heterogeneities. Although advanced classification techniques have been introduced to improve within-class impervious surface classifications, these techniques involve complex processing and high computation times. This study integrates field spectroscopy and satellite multispectral remote sensing data to generate degradation status maps of concrete and metal roofing materials. Field spectroscopy data were used as bases for selecting suitable bands for spectral index development because of the limited number of multispectral bands. Mapping methods for roof degradation status were established for metal and concrete roofing materials by developing the normalized difference concrete condition index (NDCCI) and the normalized difference metal condition index (NDMCI). Results indicate that the accuracies achieved using the spectral indices are higher than those obtained using supervised pixel-based classification. The NDCCI generated an accuracy of 84.44%, whereas the support vector machine (SVM) approach yielded an accuracy of 73.06%. The NDMCI obtained an accuracy of 94.17% compared with 62.5% for the SVM approach. These findings support the suitability of the developed spectral index methods for determining roof degradation statuses from satellite observations in heterogeneous urban environments.
Dyrba, Martin; Barkhof, Frederik; Fellgiebel, Andreas; Filippi, Massimo; Hausner, Lucrezia; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J
2015-01-01
Alzheimer's disease (AD) patients show early changes in white matter (WM) structural integrity. We studied the use of diffusion tensor imaging (DTI) in assessing WM alterations in the predementia stage of mild cognitive impairment (MCI). We applied a Support Vector Machine (SVM) classifier to DTI and volumetric magnetic resonance imaging data from 35 amyloid-β42 negative MCI subjects (MCI-Aβ42-), 35 positive MCI subjects (MCI-Aβ42+), and 25 healthy controls (HC) retrieved from the European DTI Study on Dementia. The SVM was applied to DTI-derived fractional anisotropy, mean diffusivity (MD), and mode of anisotropy (MO) maps. For comparison, we studied classification based on gray matter (GM) and WM volume. We obtained accuracies of up to 68% for MO and 63% for GM volume when it came to distinguishing between MCI-Aβ42- and MCI-Aβ42+. When it came to separating MCI-Aβ42+ from HC we achieved an accuracy of up to 77% for MD and a significantly lower accuracy of 68% for GM volume. The accuracy of multimodal classification was not higher than the accuracy of the best single modality. Our results suggest that DTI data provide better prediction accuracy than GM volume in predementia AD. Copyright © 2015 by the American Society of Neuroimaging.
Schmitter, Daniel; Roche, Alexis; Maréchal, Bénédicte; Ribes, Delphine; Abdulkadir, Ahmed; Bach-Cuadra, Meritxell; Daducci, Alessandro; Granziera, Cristina; Klöppel, Stefan; Maeder, Philippe; Meuli, Reto; Krueger, Gunnar
2014-01-01
Voxel-based morphometry from conventional T1-weighted images has proved effective to quantify Alzheimer's disease (AD) related brain atrophy and to enable fairly accurate automated classification of AD patients, mild cognitive impaired patients (MCI) and elderly controls. Little is known, however, about the classification power of volume-based morphometry, where features of interest consist of a few brain structure volumes (e.g. hippocampi, lobes, ventricles) as opposed to hundreds of thousands of voxel-wise gray matter concentrations. In this work, we experimentally evaluate two distinct volume-based morphometry algorithms (FreeSurfer and an in-house algorithm called MorphoBox) for automatic disease classification on a standardized data set from the Alzheimer's Disease Neuroimaging Initiative. Results indicate that both algorithms achieve classification accuracy comparable to the conventional whole-brain voxel-based morphometry pipeline using SPM for AD vs elderly controls and MCI vs controls, and higher accuracy for classification of AD vs MCI and early vs late AD converters, thereby demonstrating the potential of volume-based morphometry to assist diagnosis of mild cognitive impairment and Alzheimer's disease. PMID:25429357
Branch classification: A new mechanism for improving branch predictor performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, P.Y.; Hao, E.; Patt, Y.
There is wide agreement that one of the most significant impediments to the performance of current and future pipelined superscalar processors is the presence of conditional branches in the instruction stream. Speculative execution is one solution to the branch problem, but speculative work is discarded if a branch is mispredicted. For it to be effective, speculative work is discarded if a branch is mispredicted. For it to be effective, speculative execution requires a very accurate branch predictor; 95% accuracy is not good enough. This paper proposes branch classification, a methodology for building more accurate branch predictors. Branch classification allows anmore » individual branch instruction to be associated with the branch predictor best suited to predict its direction. Using this approach, a hybrid branch predictor can be constructed such that each component branch predictor predicts those branches for which it is best suited. To demonstrate the usefulness of branch classification, an example classification scheme is given and a new hybrid predictor is built based on this scheme which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.« less
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System.
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-10-20
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-01-01
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias. PMID:27775596
An Automatic User-Adapted Physical Activity Classification Method Using Smartphones.
Li, Pengfei; Wang, Yu; Tian, Yu; Zhou, Tian-Shu; Li, Jing-Song
2017-03-01
In recent years, an increasing number of people have become concerned about their health. Most chronic diseases are related to lifestyle, and daily activity records can be used as an important indicator of health. Specifically, using advanced technology to automatically monitor actual activities can effectively prevent and manage chronic diseases. The data used in this paper were obtained from acceleration sensors and gyroscopes integrated in smartphones. We designed an efficient Adaboost-Stump running on a smartphone to classify five common activities: cycling, running, sitting, standing, and walking and achieved a satisfactory classification accuracy of 98%. We designed an online learning method, and the classification model requires continuous training with actual data. The parameters in the model then become increasingly fitted to the specific user, which allows the classification accuracy to reach 95% under different use environments. In addition, this paper also utilized the OpenCL framework to design the program in parallel. This process can enhance the computing efficiency approximately ninefold.
Rajagopal, Rekha; Ranganathan, Vidhyapriya
2018-06-05
Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
NASA Astrophysics Data System (ADS)
Sung, Changhyuck; Lim, Seokjae; Kim, Hyungjun; Kim, Taesu; Moon, Kibong; Song, Jeonghwan; Kim, Jae-Joon; Hwang, Hyunsang
2018-03-01
To improve the classification accuracy of an image data set (CIFAR-10) by using analog input voltage, synapse devices with excellent conductance linearity (CL) and multi-level cell (MLC) characteristics are required. We analyze the CL and MLC characteristics of TaOx-based filamentary resistive random access memory (RRAM) to implement the synapse device in neural network hardware. Our findings show that the number of oxygen vacancies in the filament constriction region of the RRAM directly controls the CL and MLC characteristics. By adopting a Ta electrode (instead of Ti) and the hot-forming step, we could form a dense conductive filament. As a result, a wide range of conductance levels with CL is achieved and significantly improved image classification accuracy is confirmed.
NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment.
Mezgec, Simon; Koroušić Seljak, Barbara
2017-06-27
Automatic food image recognition systems are alleviating the process of food-intake estimation and dietary assessment. However, due to the nature of food images, their recognition is a particularly challenging task, which is why traditional approaches in the field have achieved a low classification accuracy. Deep neural networks have outperformed such solutions, and we present a novel approach to the problem of food and drink image detection and recognition that uses a newly-defined deep convolutional neural network architecture, called NutriNet. This architecture was tuned on a recognition dataset containing 225,953 512 × 512 pixel images of 520 different food and drink items from a broad spectrum of food groups, on which we achieved a classification accuracy of 86 . 72 % , along with an accuracy of 94 . 47 % on a detection dataset containing 130 , 517 images. We also performed a real-world test on a dataset of self-acquired images, combined with images from Parkinson's disease patients, all taken using a smartphone camera, achieving a top-five accuracy of 55 % , which is an encouraging result for real-world images. Additionally, we tested NutriNet on the University of Milano-Bicocca 2016 (UNIMIB2016) food image dataset, on which we improved upon the provided baseline recognition result. An online training component was implemented to continually fine-tune the food and drink recognition model on new images. The model is being used in practice as part of a mobile app for the dietary assessment of Parkinson's disease patients.
Learning optimal embedded cascades.
Saberian, Mohammad Javad; Vasconcelos, Nuno
2012-10-01
The problem of automatic and optimal design of embedded object detector cascades is considered. Two main challenges are identified: optimization of the cascade configuration and optimization of individual cascade stages, so as to achieve the best tradeoff between classification accuracy and speed, under a detection rate constraint. Two novel boosting algorithms are proposed to address these problems. The first, RCBoost, formulates boosting as a constrained optimization problem which is solved with a barrier penalty method. The constraint is the target detection rate, which is met at all iterations of the boosting process. This enables the design of embedded cascades of known configuration without extensive cross validation or heuristics. The second, ECBoost, searches over cascade configurations to achieve the optimal tradeoff between classification risk and speed. The two algorithms are combined into an overall boosting procedure, RCECBoost, which optimizes both the cascade configuration and its stages under a detection rate constraint, in a fully automated manner. Extensive experiments in face, car, pedestrian, and panda detection show that the resulting detectors achieve an accuracy versus speed tradeoff superior to those of previous methods.
EEG-based classification of imaginary left and right foot movements using beta rebound.
Hashimoto, Yasunari; Ushiba, Junichi
2013-11-01
The purpose of this study was to investigate cortical lateralization of event-related (de)synchronization during left and right foot motor imagery tasks and to determine classification accuracy of the two imaginary movements in a brain-computer interface (BCI) paradigm. We recorded 31-channel scalp electroencephalograms (EEGs) from nine healthy subjects during brisk imagery tasks of left and right foot movements. EEG was analyzed with time-frequency maps and topographies, and the accuracy rate of classification between left and right foot movements was calculated. Beta rebound at the end of imagination (increase of EEG beta rhythm amplitude) was identified from the two EEGs derived from the right-shift and left-shift bipolar pairs at the vertex. This process enabled discrimination between right or left foot imagery at a high accuracy rate (maximum 81.6% in single trial analysis). These data suggest that foot motor imagery has potential to elicit left-right differences in EEG, while BCI using the unilateral foot imagery can achieve high classification accuracy, similar to ordinary BCI, based on hand motor imagery. By combining conventional discrimination techniques, the left-right discrimination of unilateral foot motor imagery provides a novel BCI system that could control a foot neuroprosthesis or a robotic foot. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Semantic labeling of digital photos by classification
NASA Astrophysics Data System (ADS)
Ciocca, Gianluigi; Cusano, Claudio; Schettini, Raimondo; Brambilla, Carla
2003-01-01
The paper addresses the problem of annotating photographs with broad semantic labels. To cope with the great variety of photos available on the WEB we have designed a hierarchical classification strategy which first classifies images as pornographic or not-pornographic. Not-pornographic images are then classified as indoor, outdoor, or close-up. On a database of over 9000 images, mostly downloaded from the web, our method achieves an average accuracy of close to 90%.
NASA Astrophysics Data System (ADS)
Mohammadimanesh, F.; Salehi, B.; Mahdianpari, M.; Homayouni, S.
2016-06-01
Polarimetric Synthetic Aperture Radar (PolSAR) imagery is a complex multi-dimensional dataset, which is an important source of information for various natural resources and environmental classification and monitoring applications. PolSAR imagery produces valuable information by observing scattering mechanisms from different natural and man-made objects. Land cover mapping using PolSAR data classification is one of the most important applications of SAR remote sensing earth observations, which have gained increasing attention in the recent years. However, one of the most challenging aspects of classification is selecting features with maximum discrimination capability. To address this challenge, a statistical approach based on the Fisher Linear Discriminant Analysis (FLDA) and the incorporation of physical interpretation of PolSAR data into classification is proposed in this paper. After pre-processing of PolSAR data, including the speckle reduction, the H/α classification is used in order to classify the basic scattering mechanisms. Then, a new method for feature weighting, based on the fusion of FLDA and physical interpretation, is implemented. This method proves to increase the classification accuracy as well as increasing between-class discrimination in the final Wishart classification. The proposed method was applied to a full polarimetric C-band RADARSAT-2 data set from Avalon area, Newfoundland and Labrador, Canada. This imagery has been acquired in June 2015, and covers various types of wetlands including bogs, fens, marshes and shallow water. The results were compared with the standard Wishart classification, and an improvement of about 20% was achieved in the overall accuracy. This method provides an opportunity for operational wetland classification in northern latitude with high accuracy using only SAR polarimetric data.
High resolution microendoscopy for classification of colorectal polyps.
Chang, S S; Shukla, R; Polydorides, A D; Vila, P M; Lee, M; Han, H; Kedia, P; Lewis, J; Gonzalez, S; Kim, M K; Harpaz, N; Godbold, J; Richards-Kortum, R; Anandasabapathy, S
2013-07-01
It can be difficult to distinguish adenomas from benign polyps during routine colonoscopy. High resolution microendoscopy (HRME) is a novel method for imaging colorectal mucosa with subcellular detail. HRME criteria for the classification of colorectal neoplasia have not been previously described. Study goals were to develop criteria to characterize HRME images of colorectal mucosa (normal, hyperplastic polyps, adenomas, cancer) and to determine the accuracy and interobserver variability for the discrimination of neoplastic from non-neoplastic polyps when these criteria were applied by novice and expert microendoscopists. Two expert pathologists created consensus HRME image criteria using images from 68 patients with polyps who had undergone colonoscopy plus HRME. Using these criteria, HRME expert and novice microendoscopists were shown a set of training images and then tested to determine accuracy and interobserver variability. Expert microendoscopists identified neoplasia with sensitivity, specificity, and accuracy of 67 % (95 % confidence interval [CI] 58 % - 75 %), 97 % (94 % - 100 %), and 87 %, respectively. Nonexperts achieved sensitivity, specificity, and accuracy of 73 % (66 % - 80 %), 91 % (80 % - 100 %), and 85 %, respectively. Overall, neoplasia were identified with sensitivity 70 % (65 % - 76 %), specificity 94 % (87 % - 100 %), and accuracy 85 %. Kappa values were: experts 0.86; nonexperts 0.72; and overall 0.78. Using the new criteria, observers achieved high specificity and substantial interobserver agreement for distinguishing benign polyps from neoplasia. Increased expertise in HRME imaging improves accuracy. This low-cost microendoscopic platform may be an alternative to confocal microendoscopy in lower-resource or community-based settings.
Brain communication in the locked-in state.
De Massari, Daniele; Ruf, Carolin A; Furdea, Adrian; Matuz, Tamara; van der Heiden, Linda; Halder, Sebastian; Silvoni, Stefano; Birbaumer, Niels
2013-06-01
Patients in the completely locked-in state have no means of communication and they represent the target population for brain-computer interface research in the last 15 years. Although different paradigms have been tested and different physiological signals used, to date no sufficiently documented completely locked-in state patient was able to control a brain-computer interface over an extended time period. We introduce Pavlovian semantic conditioning to enable basic communication in completely locked-in state. This novel paradigm is based on semantic conditioning for online classification of neuroelectric or any other physiological signals to discriminate between covert (cognitive) 'yes' and 'no' responses. The paradigm comprised the presentation of affirmative and negative statements used as conditioned stimuli, while the unconditioned stimulus consisted of electrical stimulation of the skin paired with affirmative statements. Three patients with advanced amyotrophic lateral sclerosis participated over an extended time period, one of which was in a completely locked-in state, the other two in the locked-in state. The patients' level of vigilance was assessed through auditory oddball procedures to study the correlation between vigilance level and the classifier's performance. The average online classification accuracies of slow cortical components of electroencephalographic signals were around chance level for all the patients. The use of a non-linear classifier in the offline classification procedure resulted in a substantial improvement of the accuracy in one locked-in state patient achieving 70% correct classification. A reliable level of performance in the completely locked-in state patient was not achieved uniformly throughout the 37 sessions despite intact cognitive processing capacity, but in some sessions communication accuracies up to 70% were achieved. Paradigm modifications are proposed. Rapid drop of vigilance was detected suggesting attentional variations or variations of circadian period as important factors in brain-computer interface communication with locked-in state and completely locked-in state.
Feature selection for elderly faller classification based on wearable sensors.
Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D
2017-05-30
Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean and standard deviation posterior acceleration; maximum, mean and standard deviation anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. Feature selection provided models with smaller feature sets and improved faller classification compared to faller classification without feature selection. CFS and FCBF provided the best feature subset (one posterior pelvis accelerometer feature) for faller classification. However, better sensitivity was achieved by the second best model based on a Relief-F feature subset with three pressure-sensing insole features and seven head accelerometer features. Feature selection should be considered as an important step in faller classification using wearable sensors.
A hybrid three-class brain-computer interface system utilizing SSSEPs and transient ERPs
NASA Astrophysics Data System (ADS)
Breitwieser, Christian; Pokorny, Christoph; Müller-Putz, Gernot R.
2016-12-01
Objective. This paper investigates the fusion of steady-state somatosensory evoked potentials (SSSEPs) and transient event-related potentials (tERPs), evoked through tactile simulation on the left and right-hand fingertips, in a three-class EEG based hybrid brain-computer interface. It was hypothesized, that fusing the input signals leads to higher classification rates than classifying tERP and SSSEP individually. Approach. Fourteen subjects participated in the studies, consisting of a screening paradigm to determine person dependent resonance-like frequencies and a subsequent online paradigm. The whole setup of the BCI system was based on open interfaces, following suggestions for a common implementation platform. During the online experiment, subjects were instructed to focus their attention on the stimulated fingertips as indicated by a visual cue. The recorded data were classified during runtime using a multi-class shrinkage LDA classifier and the outputs were fused together applying a posterior probability based fusion. Data were further analyzed offline, involving a combined classification of SSSEP and tERP features as a second fusion principle. The final results were tested for statistical significance applying a repeated measures ANOVA. Main results. A significant classification increase was achieved when fusing the results with a combined classification compared to performing an individual classification. Furthermore, the SSSEP classifier was significantly better in detecting a non-control state, whereas the tERP classifier was significantly better in detecting control states. Subjects who had a higher relative band power increase during the screening session also achieved significantly higher classification results than subjects with lower relative band power increase. Significance. It could be shown that utilizing SSSEP and tERP for hBCIs increases the classification accuracy and also that tERP and SSSEP are not classifying control- and non-control states with the same level of accuracy.
Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie
2014-11-01
User-generated medical messages on Internet contain extensive information related to adverse drug reactions (ADRs) and are known as valuable resources for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature set and different classification technique. Firstly, the messages from three communities, allergy community, schizophrenia community and pain management community, were collected, the 3000 messages were annotated. Secondly, the N-gram-based features set and medical domain-specific features set were generated. Thirdly, three classification techniques, SVM, C4.5 and Naïve Bayes, were used to perform classification tasks separately. Finally, we evaluated the performance of different method using different feature set and different classification technique by comparing the metrics including accuracy and F-measure. In terms of accuracy, the accuracy of SVM classifier was higher than 0.8, the accuracy of C4.5 classifier or Naïve Bayes classifier was lower than 0.8; meanwhile, the combination feature sets including n-gram-based feature set and domain-specific feature set consistently outperformed single feature set. In terms of F-measure, the highest F-measure is 0.895 which was achieved by using combination feature sets and a SVM classifier. In all, we can get the best classification performance by using combination feature sets and SVM classifier. By using combination feature sets and SVM classifier, we can get an effective method to identify messages related to ADRs automatically from online user reviews.
A Novel Energy-Efficient Approach for Human Activity Recognition.
Zheng, Lingxiang; Wu, Dihong; Ruan, Xiaoyang; Weng, Shaolin; Peng, Ao; Tang, Biyu; Lu, Hai; Shi, Haibin; Zheng, Huiru
2017-09-08
In this paper, we propose a novel energy-efficient approach for mobile activity recognition system (ARS) to detect human activities. The proposed energy-efficient ARS, using low sampling rates, can achieve high recognition accuracy and low energy consumption. A novel classifier that integrates hierarchical support vector machine and context-based classification (HSVMCC) is presented to achieve a high accuracy of activity recognition when the sampling rate is less than the activity frequency, i.e., the Nyquist sampling theorem is not satisfied. We tested the proposed energy-efficient approach with the data collected from 20 volunteers (14 males and six females) and the average recognition accuracy of around 96.0% was achieved. Results show that using a low sampling rate of 1Hz can save 17.3% and 59.6% of energy compared with the sampling rates of 5 Hz and 50 Hz. The proposed low sampling rate approach can greatly reduce the power consumption while maintaining high activity recognition accuracy. The composition of power consumption in online ARS is also investigated in this paper.
Texture analysis of common renal masses in multiple MR sequences for prediction of pathology
NASA Astrophysics Data System (ADS)
Hoang, Uyen N.; Malayeri, Ashkan A.; Lay, Nathan S.; Summers, Ronald M.; Yao, Jianhua
2017-03-01
This pilot study performs texture analysis on multiple magnetic resonance (MR) images of common renal masses for differentiation of renal cell carcinoma (RCC). Bounding boxes are drawn around each mass on one axial slice in T1 delayed sequence to use for feature extraction and classification. All sequences (T1 delayed, venous, arterial, pre-contrast phases, T2, and T2 fat saturated sequences) are co-registered and texture features are extracted from each sequence simultaneously. Random forest is used to construct models to classify lesions on 96 normal regions, 87 clear cell RCCs, 8 papillary RCCs, and 21 renal oncocytomas; ground truths are verified through pathology reports. The highest performance is seen in random forest model when data from all sequences are used in conjunction, achieving an overall classification accuracy of 83.7%. When using data from one single sequence, the overall accuracies achieved for T1 delayed, venous, arterial, and pre-contrast phase, T2, and T2 fat saturated were 79.1%, 70.5%, 56.2%, 61.0%, 60.0%, and 44.8%, respectively. This demonstrates promising results of utilizing intensity information from multiple MR sequences for accurate classification of renal masses.
Das, D K; Maiti, A K; Chakraborty, C
2015-03-01
In this paper, we propose a comprehensive image characterization cum classification framework for malaria-infected stage detection using microscopic images of thin blood smears. The methodology mainly includes microscopic imaging of Leishman stained blood slides, noise reduction and illumination correction, erythrocyte segmentation, feature selection followed by machine classification. Amongst three-image segmentation algorithms (namely, rule-based, Chan-Vese-based and marker-controlled watershed methods), marker-controlled watershed technique provides better boundary detection of erythrocytes specially in overlapping situations. Microscopic features at intensity, texture and morphology levels are extracted to discriminate infected and noninfected erythrocytes. In order to achieve subgroup of potential features, feature selection techniques, namely, F-statistic and information gain criteria are considered here for ranking. Finally, five different classifiers, namely, Naive Bayes, multilayer perceptron neural network, logistic regression, classification and regression tree (CART), RBF neural network have been trained and tested by 888 erythrocytes (infected and noninfected) for each features' subset. Performance evaluation of the proposed methodology shows that multilayer perceptron network provides higher accuracy for malaria-infected erythrocytes recognition and infected stage classification. Results show that top 90 features ranked by F-statistic (specificity: 98.64%, sensitivity: 100%, PPV: 99.73% and overall accuracy: 96.84%) and top 60 features ranked by information gain provides better results (specificity: 97.29%, sensitivity: 100%, PPV: 99.46% and overall accuracy: 96.73%) for malaria-infected stage classification. © 2014 The Authors Journal of Microscopy © 2014 Royal Microscopical Society.
Deep Recurrent Neural Networks for Supernovae Classification
NASA Astrophysics Data System (ADS)
Charnock, Tom; Moss, Adam
2017-03-01
We apply deep recurrent neural networks, which are capable of learning complex sequential information, to classify supernovae (code available at https://github.com/adammoss/supernovae). The observational time and filter fluxes are used as inputs to the network, but since the inputs are agnostic, additional data such as host galaxy information can also be included. Using the Supernovae Photometric Classification Challenge (SPCC) data, we find that deep networks are capable of learning about light curves, however the performance of the network is highly sensitive to the amount of training data. For a training size of 50% of the representational SPCC data set (around 104 supernovae) we obtain a type-Ia versus non-type-Ia classification accuracy of 94.7%, an area under the Receiver Operating Characteristic curve AUC of 0.986 and an SPCC figure-of-merit F 1 = 0.64. When using only the data for the early-epoch challenge defined by the SPCC, we achieve a classification accuracy of 93.1%, AUC of 0.977, and F 1 = 0.58, results almost as good as with the whole light curve. By employing bidirectional neural networks, we can acquire impressive classification results between supernovae types I, II and III at an accuracy of 90.4% and AUC of 0.974. We also apply a pre-trained model to obtain classification probabilities as a function of time and show that it can give early indications of supernovae type. Our method is competitive with existing algorithms and has applications for future large-scale photometric surveys.
Multimodal Classification of Mild Cognitive Impairment Based on Partial Least Squares.
Wang, Pingyue; Chen, Kewei; Yao, Li; Hu, Bin; Wu, Xia; Zhang, Jiacai; Ye, Qing; Guo, Xiaojuan
2016-08-10
In recent years, increasing attention has been given to the identification of the conversion of mild cognitive impairment (MCI) to Alzheimer's disease (AD). Brain neuroimaging techniques have been widely used to support the classification or prediction of MCI. The present study combined magnetic resonance imaging (MRI), 18F-fluorodeoxyglucose PET (FDG-PET), and 18F-florbetapir PET (florbetapir-PET) to discriminate MCI converters (MCI-c, individuals with MCI who convert to AD) from MCI non-converters (MCI-nc, individuals with MCI who have not converted to AD in the follow-up period) based on the partial least squares (PLS) method. Two types of PLS models (informed PLS and agnostic PLS) were built based on 64 MCI-c and 65 MCI-nc from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The results showed that the three-modality informed PLS model achieved better classification accuracy of 81.40%, sensitivity of 79.69%, and specificity of 83.08% compared with the single-modality model, and the three-modality agnostic PLS model also achieved better classification compared with the two-modality model. Moreover, combining the three modalities with clinical test score (ADAS-cog), the agnostic PLS model (independent data: florbetapir-PET; dependent data: FDG-PET and MRI) achieved optimal accuracy of 86.05%, sensitivity of 81.25%, and specificity of 90.77%. In addition, the comparison of PLS, support vector machine (SVM), and random forest (RF) showed greater diagnostic power of PLS. These results suggested that our multimodal PLS model has the potential to discriminate MCI-c from the MCI-nc and may therefore be helpful in the early diagnosis of AD.
Identifying Autism from Resting-State fMRI Using Long Short-Term Memory Networks.
Dvornek, Nicha C; Ventola, Pamela; Pelphrey, Kevin A; Duncan, James S
2017-09-01
Functional magnetic resonance imaging (fMRI) has helped characterize the pathophysiology of autism spectrum disorders (ASD) and carries promise for producing objective biomarkers for ASD. Recent work has focused on deriving ASD biomarkers from resting-state functional connectivity measures. However, current efforts that have identified ASD with high accuracy were limited to homogeneous, small datasets, while classification results for heterogeneous, multi-site data have shown much lower accuracy. In this paper, we propose the use of recurrent neural networks with long short-term memory (LSTMs) for classification of individuals with ASD and typical controls directly from the resting-state fMRI time-series. We used the entire large, multi-site Autism Brain Imaging Data Exchange (ABIDE) I dataset for training and testing the LSTM models. Under a cross-validation framework, we achieved classification accuracy of 68.5%, which is 9% higher than previously reported methods that used fMRI data from the whole ABIDE cohort. Finally, we presented interpretation of the trained LSTM weights, which highlight potential functional networks and regions that are known to be implicated in ASD.
Identifying Autism from Resting-State fMRI Using Long Short-Term Memory Networks
Dvornek, Nicha C.; Ventola, Pamela; Pelphrey, Kevin A.; Duncan, James S.
2017-01-01
Functional magnetic resonance imaging (fMRI) has helped characterize the pathophysiology of autism spectrum disorders (ASD) and carries promise for producing objective biomarkers for ASD. Recent work has focused on deriving ASD biomarkers from resting-state functional connectivity measures. However, current efforts that have identified ASD with high accuracy were limited to homogeneous, small datasets, while classification results for heterogeneous, multi-site data have shown much lower accuracy. In this paper, we propose the use of recurrent neural networks with long short-term memory (LSTMs) for classification of individuals with ASD and typical controls directly from the resting-state fMRI time-series. We used the entire large, multi-site Autism Brain Imaging Data Exchange (ABIDE) I dataset for training and testing the LSTM models. Under a cross-validation framework, we achieved classification accuracy of 68.5%, which is 9% higher than previously reported methods that used fMRI data from the whole ABIDE cohort. Finally, we presented interpretation of the trained LSTM weights, which highlight potential functional networks and regions that are known to be implicated in ASD. PMID:29104967
Hyperspectral imaging with wavelet transform for classification of colon tissue biopsy samples
NASA Astrophysics Data System (ADS)
Masood, Khalid
2008-08-01
Automatic classification of medical images is a part of our computerised medical imaging programme to support the pathologists in their diagnosis. Hyperspectral data has found its applications in medical imagery. Its usage is increasing significantly in biopsy analysis of medical images. In this paper, we present a histopathological analysis for the classification of colon biopsy samples into benign and malignant classes. The proposed study is based on comparison between 3D spectral/spatial analysis and 2D spatial analysis. Wavelet textural features in the wavelet domain are used in both these approaches for classification of colon biopsy samples. Experimental results indicate that the incorporation of wavelet textural features using a support vector machine, in 2D spatial analysis, achieve best classification accuracy.
Findeisen, Peter; Peccerella, Teresa; Post, Stefan; Wenz, Frederik; Neumaier, Michael
2008-04-01
Serum is a difficult matrix for the identification of biomarkers by mass spectrometry (MS). This is due to high-abundance proteins and their complex processing by a multitude of endogenous proteases making rigorous standardisation difficult. Here, we have investigated the use of defined exogenous reporter peptides as substrates for disease-specific proteases with respect to improved standardisation and disease classification accuracy. A recombinant N-terminal fragment of the Adenomatous Polyposis Coli (APC) protein was digested with trypsin to yield a peptide mixture for subsequent Reporter Peptide Spiking (RPS) of serum. Different preanalytical handling of serum samples was simulated by storage of serum samples for up to 6 h at ambient temperature, followed by RPS, further incubation under standardised conditions and testing for stability of protease-generated MS profiles. To demonstrate the superior classification accuracy achieved by RPS, a pilot profiling experiment was performed using serum specimens from pancreatic cancer patients (n = 50) and healthy controls (n = 50). After RPS six different peak categories could be defined, two of which (categories C and D) are modulated by endogenous proteases. These latter are relevant for improved classification accuracy as shown by enhanced disease-specific classification from 78% to 87% in unspiked and spiked samples, respectively. Peaks of these categories presented with unchanged signal intensities regardless of preanalytical conditions. The use of RPS generally improved the signal intensities of protease-generated peptide peaks. RPS circumvents preanalytical variabilities and improves classification accuracies. Our approach will be helpful to introduce MS-based proteomic profiling into routine laboratory testing.
Mapping Winter Wheat with Multi-Temporal SAR and Optical Images in an Urban Agricultural Region
Zhou, Tao; Pan, Jianjun; Zhang, Peiyu; Wei, Shanbao; Han, Tao
2017-01-01
Winter wheat is the second largest food crop in China. It is important to obtain reliable winter wheat acreage to guarantee the food security for the most populous country in the world. This paper focuses on assessing the feasibility of in-season winter wheat mapping and investigating potential classification improvement by using SAR (Synthetic Aperture Radar) images, optical images, and the integration of both types of data in urban agricultural regions with complex planting structures in Southern China. Both SAR (Sentinel-1A) and optical (Landsat-8) data were acquired, and classification using different combinations of Sentinel-1A-derived information and optical images was performed using a support vector machine (SVM) and a random forest (RF) method. The interference coherence and texture images were obtained and used to assess the effect of adding them to the backscatter intensity images on the classification accuracy. The results showed that the use of four Sentinel-1A images acquired before the jointing period of winter wheat can provide satisfactory winter wheat classification accuracy, with an F1 measure of 87.89%. The combination of SAR and optical images for winter wheat mapping achieved the best F1 measure–up to 98.06%. The SVM was superior to RF in terms of the overall accuracy and the kappa coefficient, and was faster than RF, while the RF classifier was slightly better than SVM in terms of the F1 measure. In addition, the classification accuracy can be effectively improved by adding the texture and coherence images to the backscatter intensity data. PMID:28587066
NASA Astrophysics Data System (ADS)
Sakuma, Jun; Wright, Rebecca N.
Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. The emphasis of existing studies in privacy-preserving classification has primarily been put on the design of privacy-preserving versions of particular data mining algorithms, However, in classification problems, preprocessing and postprocessing— such as model selection or attribute selection—play a prominent role in achieving higher classification accuracy. In this paper, we show generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machine and attribute selection in naive Bayes classification.
NASA Astrophysics Data System (ADS)
Qu, Haicheng; Liang, Xuejian; Liang, Shichao; Liu, Wanjun
2018-01-01
Many methods of hyperspectral image classification have been proposed recently, and the convolutional neural network (CNN) achieves outstanding performance. However, spectral-spatial classification of CNN requires an excessively large model, tremendous computations, and complex network, and CNN is generally unable to use the noisy bands caused by water-vapor absorption. A dimensionality-varied CNN (DV-CNN) is proposed to address these issues. There are four stages in DV-CNN and the dimensionalities of spectral-spatial feature maps vary with the stages. DV-CNN can reduce the computation and simplify the structure of the network. All feature maps are processed by more kernels in higher stages to extract more precise features. DV-CNN also improves the classification accuracy and enhances the robustness to water-vapor absorption bands. The experiments are performed on data sets of Indian Pines and Pavia University scene. The classification performance of DV-CNN is compared with state-of-the-art methods, which contain the variations of CNN, traditional, and other deep learning methods. The experiment of performance analysis about DV-CNN itself is also carried out. The experimental results demonstrate that DV-CNN outperforms state-of-the-art methods for spectral-spatial classification and it is also robust to water-vapor absorption bands. Moreover, reasonable parameters selection is effective to improve classification accuracy.
Wavelet-based multicomponent denoising on GPU to improve the classification of hyperspectral images
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco; Mouriño, J. C.
2017-10-01
Supervised classification allows handling a wide range of remote sensing hyperspectral applications. Enhancing the spatial organization of the pixels over the image has proven to be beneficial for the interpretation of the image content, thus increasing the classification accuracy. Denoising in the spatial domain of the image has been shown as a technique that enhances the structures in the image. This paper proposes a multi-component denoising approach in order to increase the classification accuracy when a classification method is applied. It is computed on multicore CPUs and NVIDIA GPUs. The method combines feature extraction based on a 1Ddiscrete wavelet transform (DWT) applied in the spectral dimension followed by an Extended Morphological Profile (EMP) and a classifier (SVM or ELM). The multi-component noise reduction is applied to the EMP just before the classification. The denoising recursively applies a separable 2D DWT after which the number of wavelet coefficients is reduced by using a threshold. Finally, inverse 2D-DWT filters are applied to reconstruct the noise free original component. The computational cost of the classifiers as well as the cost of the whole classification chain is high but it is reduced achieving real-time behavior for some applications through their computation on NVIDIA multi-GPU platforms.
Machine learning approaches to diagnosis and laterality effects in semantic dementia discourse.
Garrard, Peter; Rentoumi, Vassiliki; Gesierich, Benno; Miller, Bruce; Gorno-Tempini, Maria Luisa
2014-06-01
Advances in automatic text classification have been necessitated by the rapid increase in the availability of digital documents. Machine learning (ML) algorithms can 'learn' from data: for instance a ML system can be trained on a set of features derived from written texts belonging to known categories, and learn to distinguish between them. Such a trained system can then be used to classify unseen texts. In this paper, we explore the potential of the technique to classify transcribed speech samples along clinical dimensions, using vocabulary data alone. We report the accuracy with which two related ML algorithms [naive Bayes Gaussian (NBG) and naive Bayes multinomial (NBM)] categorized picture descriptions produced by: 32 semantic dementia (SD) patients versus 10 healthy, age-matched controls; and SD patients with left- (n = 21) versus right-predominant (n = 11) patterns of temporal lobe atrophy. We used information gain (IG) to identify the vocabulary features that were most informative to each of these two distinctions. In the SD versus control classification task, both algorithms achieved accuracies of greater than 90%. In the right- versus left-temporal lobe predominant classification, NBM achieved a high level of accuracy (88%), but this was achieved by both NBM and NBG when the features used in the training set were restricted to those with high values of IG. The most informative features for the patient versus control task were low frequency content words, generic terms and components of metanarrative statements. For the right versus left task the number of informative lexical features was too small to support any specific inferences. An enriched feature set, including values derived from Quantitative Production Analysis (QPA) may shed further light on this little understood distinction. Copyright © 2013 Elsevier Ltd. All rights reserved.
Forest tree species clssification based on airborne hyper-spectral imagery
NASA Astrophysics Data System (ADS)
Dian, Yuanyong; Li, Zengyuan; Pang, Yong
2013-10-01
Forest precision classification products were the basic data for surveying of forest resource, updating forest subplot information, logging and design of forest. However, due to the diversity of stand structure, complexity of the forest growth environment, it's difficult to discriminate forest tree species using multi-spectral image. The airborne hyperspectral images can achieve the high spatial and spectral resolution imagery of forest canopy, so it will good for tree species level classification. The aim of this paper was to test the effective of combining spatial and spectral features in airborne hyper-spectral image classification. The CASI hyper spectral image data were acquired from Liangshui natural reserves area. Firstly, we use the MNF (minimum noise fraction) transform method for to reduce the hyperspectral image dimensionality and highlighting variation. And secondly, we use the grey level co-occurrence matrix (GLCM) to extract the texture features of forest tree canopy from the hyper-spectral image, and thirdly we fused the texture and the spectral features of forest canopy to classify the trees species using support vector machine (SVM) with different kernel functions. The results showed that when using the SVM classifier, MNF and texture-based features combined with linear kernel function can achieve the best overall accuracy which was 85.92%. It was also confirm that combine the spatial and spectral information can improve the accuracy of tree species classification.
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too.
Application of Template Matching for Improving Classification of Urban Railroad Point Clouds
Arastounia, Mostafa; Oude Elberink, Sander
2016-01-01
This study develops an integrated data-driven and model-driven approach (template matching) that clusters the urban railroad point clouds into three classes of rail track, contact cable, and catenary cable. The employed dataset covers 630 m of the Dutch urban railroad corridors in which there are four rail tracks, two contact cables, and two catenary cables. The dataset includes only geometrical information (three dimensional (3D) coordinates of the points) with no intensity data and no RGB data. The obtained results indicate that all objects of interest are successfully classified at the object level with no false positives and no false negatives. The results also show that an average 97.3% precision and an average 97.7% accuracy at the point cloud level are achieved. The high precision and high accuracy of the rail track classification (both greater than 96%) at the point cloud level stems from the great impact of the employed template matching method on excluding the false positives. The cables also achieve quite high average precision (96.8%) and accuracy (98.4%) due to their high sampling and isolated position in the railroad corridor. PMID:27973452
Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin
2015-01-01
Machine learning algorithms play an important role in computer science research. Recent advancement in sensor data collection in clinical sciences lead to a complex, heterogeneous data processing, and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, development of Knowledge-based systems to support clinicians in decision-making is important. However, it is necessary to perform experimental work to compare performances of different machine learning methods to help to select appropriate method for a specific characteristic of data sets. This paper compares classification performance of three popular machine learning methods i.e., case-based reasoning, neutral networks and support vector machine to diagnose stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms other two methods in terms of classification accuracy. Case-based reasoning has achieved 80% and 86% accuracy to classify stress using finger temperature and heart rate variability. On contrary, both neural network and support vector machine have achieved less than 80% accuracy by using both physiological signals.
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo
2015-05-01
An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
A new classification scheme of plastic wastes based upon recycling labels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Özkan, Kemal, E-mail: kozkan@ogu.edu.tr; Ergin, Semih, E-mail: sergin@ogu.edu.tr; Işık, Şahin, E-mail: sahini@ogu.edu.tr
Highlights: • PET, HPDE or PP types of plastics are considered. • An automated classification of plastic bottles based on the feature extraction and classification methods is performed. • The decision mechanism consists of PCA, Kernel PCA, FLDA, SVD and Laplacian Eigenmaps methods. • SVM is selected to achieve the classification task and majority voting technique is used. - Abstract: Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize thesemore » materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, the photographs of the plastic bottles have been taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, the morphological image operations are implemented. These operations are edge detection, noise removal, hole removing, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of the combinations of erosion and dilation. The effect of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastics are considered due to higher existence ratio of them than the other plastic types in the world. The decision mechanism consists of five different feature extraction methods including as Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher’s Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP) and uses a simple experimental setup with a camera and homogenous backlighting. Due to the giving global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP.« less
An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices
ERIC Educational Resources Information Center
Wyse, Adam E.; Hao, Shiqi
2012-01-01
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Automated sleep scoring and sleep apnea detection in children
NASA Astrophysics Data System (ADS)
Baraglia, David P.; Berryman, Matthew J.; Coussens, Scott W.; Pamula, Yvonne; Kennedy, Declan; Martin, A. James; Abbott, Derek
2005-12-01
This paper investigates the automated detection of a patient's breathing rate and heart rate from their skin conductivity as well as sleep stage scoring and breathing event detection from their EEG. The software developed for these tasks is tested on data sets obtained from the sleep disorders unit at the Adelaide Women's and Children's Hospital. The sleep scoring and breathing event detection tasks used neural networks to achieve signal classification. The Fourier transform and the Higuchi fractal dimension were used to extract features for input to the neural network. The filtered skin conductivity appeared visually to bear a similarity to the breathing and heart rate signal, but a more detailed evaluation showed the relation was not consistent. Sleep stage classification was achieved with and accuracy of around 65% with some stages being accurately scored and others poorly scored. The two breathing events hypopnea and apnea were scored with varying degrees of accuracy with the highest scores being around 75% and 30%.
Major Depression Detection from EEG Signals Using Kernel Eigen-Filter-Bank Common Spatial Patterns.
Liao, Shih-Cheng; Wu, Chien-Te; Huang, Hao-Chuan; Cheng, Wei-Teng; Liu, Yi-Hung
2017-06-14
Major depressive disorder (MDD) has become a leading contributor to the global burden of disease; however, there are currently no reliable biological markers or physiological measurements for efficiently and effectively dissecting the heterogeneity of MDD. Here we propose a novel method based on scalp electroencephalography (EEG) signals and a robust spectral-spatial EEG feature extractor called kernel eigen-filter-bank common spatial pattern (KEFB-CSP). The KEFB-CSP first filters the multi-channel raw EEG signals into a set of frequency sub-bands covering the range from theta to gamma bands, then spatially transforms the EEG signals of each sub-band from the original sensor space to a new space where the new signals (i.e., CSPs) are optimal for the classification between MDD and healthy controls, and finally applies the kernel principal component analysis (kernel PCA) to transform the vector containing the CSPs from all frequency sub-bands to a lower-dimensional feature vector called KEFB-CSP. Twelve patients with MDD and twelve healthy controls participated in this study, and from each participant we collected 54 resting-state EEGs of 6 s length (5 min and 24 s in total). Our results show that the proposed KEFB-CSP outperforms other EEG features including the powers of EEG frequency bands, and fractal dimension, which had been widely applied in previous EEG-based depression detection studies. The results also reveal that the 8 electrodes from the temporal areas gave higher accuracies than other scalp areas. The KEFB-CSP was able to achieve an average EEG classification accuracy of 81.23% in single-trial analysis when only the 8-electrode EEGs of the temporal area and a support vector machine (SVM) classifier were used. We also designed a voting-based leave-one-participant-out procedure to test the participant-independent individual classification accuracy. The voting-based results show that the mean classification accuracy of about 80% can be achieved by the KEFP-CSP feature and the SVM classifier with only several trials, and this level of accuracy seems to become stable as more trials (i.e., <7 trials) are used. These findings therefore suggest that the proposed method has a great potential for developing an efficient (required only a few 6-s EEG signals from the 8 electrodes over the temporal) and effective (~80% classification accuracy) EEG-based brain-computer interface (BCI) system which may, in the future, help psychiatrists provide individualized and effective treatments for MDD patients.
NASA Astrophysics Data System (ADS)
Liu, Yang; Wang, Jiang; Cai, Lihui; Chen, Yingyuan; Qin, Yingmei
2018-03-01
As a pattern of cross-frequency coupling (CFC), phase-amplitude coupling (PAC) depicts the interaction between the phase and amplitude of distinct frequency bands from the same signal, and has been proved to be closely related to the brain’s cognitive and memory activities. This work utilized PAC and support vector machine (SVM) classifier to identify the epileptic seizures from electroencephalogram (EEG) data. The entropy-based modulation index (MI) matrixes are used to express the strength of PAC, from which we extracted features as the input for classifier. Based on the Bonn database, which contains five datasets of EEG segments obtained from healthy volunteers and epileptic subjects, a 100% classification accuracy is achieved for identifying seizure ictal from healthy data, and an accuracy of 97.67% is reached in the classification of ictal EEG signals from inter-ictal EEGs. Based on the CHB-MIT database which is a group of continuously recorded epileptic EEGs by scalp electrodes, a 97.50% classification accuracy is obtained and a raising sign of MI value is found at 6s before seizure onset. The classification performance in this work is effective, and PAC can be considered as a useful tool for detecting and predicting the epileptic seizures and providing reference for clinical diagnosis.
Mander, Luke; Li, Mao; Mio, Washington; Fowlkes, Charless C; Punyasena, Surangi W
2013-11-07
Taxonomic identification of pollen and spores uses inherently qualitative descriptions of morphology. Consequently, identifications are restricted to categories that can be reliably classified by multiple analysts, resulting in the coarse taxonomic resolution of the pollen and spore record. Grass pollen represents an archetypal example; it is not routinely identified below family level. To address this issue, we developed quantitative morphometric methods to characterize surface ornamentation and classify grass pollen grains. This produces a means of quantifying morphological features that are traditionally described qualitatively. We used scanning electron microscopy to image 240 specimens of pollen from 12 species within the grass family (Poaceae). We classified these species by developing algorithmic features that quantify the size and density of sculptural elements on the pollen surface, and measure the complexity of the ornamentation they form. These features yielded a classification accuracy of 77.5%. In comparison, a texture descriptor based on modelling the statistical distribution of brightness values in image patches yielded a classification accuracy of 85.8%, and seven human subjects achieved accuracies between 68.33 and 81.67%. The algorithmic features we developed directly relate to biologically meaningful features of grass pollen morphology, and could facilitate direct interpretation of unsupervised classification results from fossil material.
NASA Astrophysics Data System (ADS)
Zhang, Xu; Li, Yun; Chen, Xiang; Li, Guanglin; Zev Rymer, William; Zhou, Ping
2013-08-01
Objective. This study investigates the effect of the involuntary motor activity of paretic-spastic muscles on the classification of surface electromyography (EMG) signals. Approach. Two data collection sessions were designed for 8 stroke subjects to voluntarily perform 11 functional movements using their affected forearm and hand at relatively slow and fast speeds. For each stroke subject, the degree of involuntary motor activity present in the voluntary surface EMG recordings was qualitatively described from such slow and fast experimental protocols. Myoelectric pattern recognition analysis was performed using different combinations of voluntary surface EMG data recorded from the slow and fast sessions. Main results. Across all tested stroke subjects, our results revealed that when involuntary surface EMG is absent or present in both the training and testing datasets, high accuracies (>96%, >98%, respectively, averaged over all the subjects) can be achieved in the classification of different movements using surface EMG signals from paretic muscles. When involuntary surface EMG was solely involved in either the training or testing datasets, the classification accuracies were dramatically reduced (<89%, <85%, respectively). However, if both the training and testing datasets contained EMG signals with the presence and absence of involuntary EMG interference, high accuracies were still achieved (>97%). Significance. The findings of this study can be used to guide the appropriate design and implementation of myoelectric pattern recognition based systems or devices toward promoting robot-aided therapy for stroke rehabilitation.
Automated detection of neovascularization for proliferative diabetic retinopathy screening.
Roychowdhury, Sohini; Koozekanani, Dara D; Parhi, Keshab K
2016-08-01
Neovascularization is the primary manifestation of proliferative diabetic retinopathy (PDR) that can lead to acquired blindness. This paper presents a novel method that classifies neovascularizations in the 1-optic disc (OD) diameter region (NVD) and elsewhere (NVE) separately to achieve low false positive rates of neovascularization classification. First, the OD region and blood vessels are extracted. Next, the major blood vessel segments in the 1-OD diameter region are classified for NVD, and minor blood vessel segments elsewhere are classified for NVE. For NVD and NVE classifications, optimal region-based feature sets of 10 and 6 features, respectively, are used. The proposed method achieves classification sensitivity, specificity and accuracy for NVD and NVE of 74%, 98.2%, 87.6%, and 61%, 97.5%, 92.1%, respectively. Also, the proposed method achieves 86.4% sensitivity and 76% specificity for screening images with PDR from public and local data sets. Thus, the proposed NVD and NVE detection methods can play a key role in automated screening and prioritization of patients with diabetic retinopathy.
Comparison of Classification Methods for P300 Brain-Computer Interface on Disabled Subjects
Manyakov, Nikolay V.; Chumerin, Nikolay; Combaz, Adrien; Van Hulle, Marc M.
2011-01-01
We report on tests with a mind typing paradigm based on a P300 brain-computer interface (BCI) on a group of amyotrophic lateral sclerosis (ALS), middle cerebral artery (MCA) stroke, and subarachnoid hemorrhage (SAH) patients, suffering from motor and speech disabilities. We investigate the achieved typing accuracy given the individual patient's disorder, and how it correlates with the type of classifier used. We considered 7 types of classifiers, linear as well as nonlinear ones, and found that, overall, one type of linear classifier yielded a higher classification accuracy. In addition to the selection of the classifier, we also suggest and discuss a number of recommendations to be considered when building a P300-based typing system for disabled subjects. PMID:21941530
Schizophrenia classification using functional network features
NASA Astrophysics Data System (ADS)
Rish, Irina; Cecchi, Guillermo A.; Heuton, Kyle
2012-03-01
This paper focuses on discovering statistical biomarkers (features) that are predictive of schizophrenia, with a particular focus on topological properties of fMRI functional networks. We consider several network properties, such as node (voxel) strength, clustering coefficients, local efficiency, as well as just a subset of pairwise correlations. While all types of features demonstrate highly significant statistical differences in several brain areas, and close to 80% classification accuracy, the most remarkable results of 93% accuracy are achieved by using a small subset of only a dozen of most-informative (lowest p-value) correlation features. Our results suggest that voxel-level correlations and functional network features derived from them are highly informative about schizophrenia and can be used as statistical biomarkers for the disease.
Kayagil, Turan A; Bai, Ou; Henriquez, Craig S; Lin, Peter; Furlani, Stephen J; Vorbach, Sherry; Hallett, Mark
2009-05-06
Brain-computer interfaces (BCI) use electroencephalography (EEG) to interpret user intention and control an output device accordingly. We describe a novel BCI method to use a signal from five EEG channels (comprising one primary channel with four additional channels used to calculate its Laplacian derivation) to provide two-dimensional (2-D) control of a cursor on a computer screen, with simple threshold-based binary classification of band power readings taken over pre-defined time windows during subject hand movement. We tested the paradigm with four healthy subjects, none of whom had prior BCI experience. Each subject played a game wherein he or she attempted to move a cursor to a target within a grid while avoiding a trap. We also present supplementary results including one healthy subject using motor imagery, one primary lateral sclerosis (PLS) patient, and one healthy subject using a single EEG channel without Laplacian derivation. For the four healthy subjects using real hand movement, the system provided accurate cursor control with little or no required user training. The average accuracy of the cursor movement was 86.1% (SD 9.8%), which is significantly better than chance (p = 0.0015). The best subject achieved a control accuracy of 96%, with only one incorrect bit classification out of 47. The supplementary results showed that control can be achieved under the respective experimental conditions, but with reduced accuracy. The binary method provides naïve subjects with real-time control of a cursor in 2-D using dichotomous classification of synchronous EEG band power readings from a small number of channels during hand movement. The primary strengths of our method are simplicity of hardware and software, and high accuracy when used by untrained subjects.
An embedded implementation based on adaptive filter bank for brain-computer interface systems.
Belwafi, Kais; Romain, Olivier; Gannouni, Sofien; Ghaffari, Fakhreddine; Djemal, Ridha; Ouni, Bouraoui
2018-07-15
Brain-computer interface (BCI) is a new communication pathway for users with neurological deficiencies. The implementation of a BCI system requires complex electroencephalography (EEG) signal processing including filtering, feature extraction and classification algorithms. Most of current BCI systems are implemented on personal computers. Therefore, there is a great interest in implementing BCI on embedded platforms to meet system specifications in terms of time response, cost effectiveness, power consumption, and accuracy. This article presents an embedded-BCI (EBCI) system based on a Stratix-IV field programmable gate array. The proposed system relays on the weighted overlap-add (WOLA) algorithm to perform dynamic filtering of EEG-signals by analyzing the event-related desynchronization/synchronization (ERD/ERS). The EEG-signals are classified, using the linear discriminant analysis algorithm, based on their spatial features. The proposed system performs fast classification within a time delay of 0.430 s/trial, achieving an average accuracy of 76.80% according to an offline approach and 80.25% using our own recording. The estimated power consumption of the prototype is approximately 0.7 W. Results show that the proposed EBCI system reduces the overall classification error rate for the three datasets of the BCI-competition by 5% compared to other similar implementations. Moreover, experiment shows that the proposed system maintains a high accuracy rate with a short processing time, a low power consumption, and a low cost. Performing dynamic filtering of EEG-signals using WOLA increases the recognition rate of ERD/ERS patterns of motor imagery brain activity. This approach allows to develop a complete prototype of a EBCI system that achieves excellent accuracy rates. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.
2018-01-01
In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We were processing results in three different ways - separately from LIBS measurements, from Raman measurements, and we also merged data from both mentioned methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. As for LIBS, the classification accuracy ranged from 45% to 100%, as for Raman Spectroscopy from 50% to 100% and in case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in obtained chemical information while using these two methods.
NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment
Koroušić Seljak, Barbara
2017-01-01
Automatic food image recognition systems are alleviating the process of food-intake estimation and dietary assessment. However, due to the nature of food images, their recognition is a particularly challenging task, which is why traditional approaches in the field have achieved a low classification accuracy. Deep neural networks have outperformed such solutions, and we present a novel approach to the problem of food and drink image detection and recognition that uses a newly-defined deep convolutional neural network architecture, called NutriNet. This architecture was tuned on a recognition dataset containing 225,953 512 × 512 pixel images of 520 different food and drink items from a broad spectrum of food groups, on which we achieved a classification accuracy of 86.72%, along with an accuracy of 94.47% on a detection dataset containing 130,517 images. We also performed a real-world test on a dataset of self-acquired images, combined with images from Parkinson’s disease patients, all taken using a smartphone camera, achieving a top-five accuracy of 55%, which is an encouraging result for real-world images. Additionally, we tested NutriNet on the University of Milano-Bicocca 2016 (UNIMIB2016) food image dataset, on which we improved upon the provided baseline recognition result. An online training component was implemented to continually fine-tune the food and drink recognition model on new images. The model is being used in practice as part of a mobile app for the dietary assessment of Parkinson’s disease patients. PMID:28653995
Comparison of seven protocols to identify fecal contamination sources using Escherichia coli
Stoeckel, D.M.; Mathes, M.V.; Hyer, K.E.; Hagedorn, C.; Kator, H.; Lukasik, J.; O'Brien, T. L.; Fenger, T.W.; Samadpour, M.; Strickler, K.M.; Wiggins, B.A.
2004-01-01
Microbial source tracking (MST) uses various approaches to classify fecal-indicator microorganisms to source hosts. Reproducibility, accuracy, and robustness of seven phenotypic and genotypic MST protocols were evaluated by use of Escherichia coli from an eight-host library of known-source isolates and a separate, blinded challenge library. In reproducibility tests, measuring each protocol's ability to reclassify blinded replicates, only one (pulsed-field gel electrophoresis; PFGE) correctly classified all test replicates to host species; three protocols classified 48-62% correctly, and the remaining three classified fewer than 25% correctly. In accuracy tests, measuring each protocol's ability to correctly classify new isolates, ribotyping with EcoRI and PvuII approached 100% correct classification but only 6% of isolates were classified; four of the other six protocols (antibiotic resistance analysis, PFGE, and two repetitive-element PCR protocols) achieved better than random accuracy rates when 30-100% of challenge isolates were classified. In robustness tests, measuring each protocol's ability to recognize isolates from nonlibrary hosts, three protocols correctly classified 33-100% of isolates as "unknown origin," whereas four protocols classified all isolates to a source category. A relevance test, summarizing interpretations for a hypothetical water sample containing 30 challenge isolates, indicated that false-positive classifications would hinder interpretations for most protocols. Study results indicate that more representation in known-source libraries and better classification accuracy would be needed before field application. Thorough reliability assessment of classification results is crucial before and during application of MST protocols.
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-01-01
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-09-24
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure.
Study design requirements for RNA sequencing-based breast cancer diagnostics.
Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias
2016-02-01
Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.
NASA Astrophysics Data System (ADS)
Bassa, Zaakirah; Bob, Urmilla; Szantoi, Zoltan; Ismail, Riyad
2016-01-01
In recent years, the popularity of tree-based ensemble methods for land cover classification has increased significantly. Using WorldView-2 image data, we evaluate the potential of the oblique random forest algorithm (oRF) to classify a highly heterogeneous protected area. In contrast to the random forest (RF) algorithm, the oRF algorithm builds multivariate trees by learning the optimal split using a supervised model. The oRF binary algorithm is adapted to a multiclass land cover and land use application using both the "one-against-one" and "one-against-all" combination approaches. Results show that the oRF algorithms are capable of achieving high classification accuracies (>80%). However, there was no statistical difference in classification accuracies obtained by the oRF algorithms and the more popular RF algorithm. For all the algorithms, user accuracies (UAs) and producer accuracies (PAs) >80% were recorded for most of the classes. Both the RF and oRF algorithms poorly classified the indigenous forest class as indicated by the low UAs and PAs. Finally, the results from this study advocate and support the utility of the oRF algorithm for land cover and land use mapping of protected areas using WorldView-2 image data.
Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos
2017-04-13
Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform in diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence of pain perceptions. We select 52 adult subjects for the study divided into three groups: control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance with diffusion tensor to see white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gather data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce features of the first dataset and classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to perform a classification of migraine group. Moreover we implement a committee method to improve the classification accuracy based on feature selection algorithms. When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% when using the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, 93 to 94% in boosting. The features that were determined to be most useful for classification included are related with the pain, analgesics and left uncinate brain (connected with the pain and emotions). The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
Yao, Dongren; Calhoun, Vince D; Fu, Zening; Du, Yuhui; Sui, Jing
2018-05-15
Discriminating Alzheimer's disease (AD) from its prodromal form, mild cognitive impairment (MCI), is a significant clinical problem that may facilitate early diagnosis and intervention, in which a more challenging issue is to classify MCI subtypes, i.e., those who eventually convert to AD (cMCI) versus those who do not (MCI). To solve this difficult 4-way classification problem (AD, MCI, cMCI and healthy controls), a competition was hosted by Kaggle to invite the scientific community to apply their machine learning approaches on pre-processed sets of T1-weighted magnetic resonance images (MRI) data and the demographic information from the international Alzheimer's disease neuroimaging initiative (ADNI) database. This paper summarizes our competition results. We first proposed a hierarchical process by turning the 4-way classification into five binary classification problems. A new feature selection technology based on relative importance was also proposed, aiming to identify a more informative and concise subset from 426 sMRI morphometric and 3 demographic features, to ensure each binary classifier to achieve its highest accuracy. As a result, about 2% of the original features were selected to build a new feature space, which can achieve the final four-way classification with a 54.38% accuracy on testing data through hierarchical grouping, higher than several alternative methods in comparison. More importantly, the selected discriminative features such as hippocampal volume, parahippocampal surface area, and medial orbitofrontal thickness, etc. as well as the MMSE score, are reasonable and consistent with those reported in AD/MCI deficits. In summary, the proposed method provides a new framework for multi-way classification using hierarchical grouping and precise feature selection. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Selim, Serdar; Sonmez, Namik Kemal; Onur, Isin; Coslu, Mesut
2017-10-01
Connection of similar landscape patches with ecological corridors supports habitat quality of these patches, increases urban ecological quality, and constitutes an important living and expansion area for wild life. Furthermore, habitat connectivity provided by urban green areas is supporting biodiversity in urban areas. In this study, possible ecological connections between landscape patches, which were achieved by using Expert classification technique and modeled with probabilistic connection index. Firstly, the reflection responses of plants to various bands are used as data in hypotheses. One of the important features of this method is being able to use more than one image at the same time in the formation of the hypothesis. For this reason, before starting the application of the Expert classification, the base images are prepared. In addition to the main image, the hypothesis conditions were also created for each class with the NDVI image which is commonly used in the vegetation researches. Besides, the results of the previously conducted supervised classification were taken into account. We applied this classification method by using the raster imagery with user-defined variables. Hereupon, to provide ecological connections of the tree cover which was achieved from the classification, we used Probabilistic Connection (PC) index. The probabilistic connection model which is used for landscape planning and conservation studies via detecting and prioritization critical areas for ecological connection characterizes the possibility of direct connection between habitats. As a result we obtained over % 90 total accuracy in accuracy assessment analysis. We provided ecological connections with PC index and we created inter-connected green spaces system. Thus, we offered and implicated green infrastructure system model takes place in the agenda of recent years.
NASA Astrophysics Data System (ADS)
Modiri, M.; Salehabadi, A.; Mohebbi, M.; Hashemi, A. M.; Masumi, M.
2015-12-01
The use of UAV in the application of photogrammetry to obtain cover images and achieve the main objectives of the photogrammetric mapping has been a boom in the region. The images taken from REGGIOLO region in the province of, Italy Reggio -Emilia by UAV with non-metric camera Canon Ixus and with an average height of 139.42 meters were used to classify urban feature. Using the software provided SURE and cover images of the study area, to produce dense point cloud, DSM and Artvqvtv spatial resolution of 10 cm was prepared. DTM area using Adaptive TIN filtering algorithm was developed. NDSM area was prepared with using the difference between DSM and DTM and a separate features in the image stack. In order to extract features, using simultaneous occurrence matrix features mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation for each of the RGB band image was used Orthophoto area. Classes used to classify urban problems, including buildings, trees and tall vegetation, grass and vegetation short, paved road and is impervious surfaces. Class consists of impervious surfaces such as pavement conditions, the cement, the car, the roof is stored. In order to pixel-based classification and selection of optimal features of classification was GASVM pixel basis. In order to achieve the classification results with higher accuracy and spectral composition informations, texture, and shape conceptual image featureOrthophoto area was fencing. The segmentation of multi-scale segmentation method was used.it belonged class. Search results using the proposed classification of urban feature, suggests the suitability of this method of classification complications UAV is a city using images. The overall accuracy and kappa coefficient method proposed in this study, respectively, 47/93% and 84/91% was.
Continuous robust sound event classification using time-frequency features and deep learning
Song, Yan; Xiao, Wei; Phan, Huy
2017-01-01
The automatic detection and recognition of sound events by computers is a requirement for a number of emerging sensing and human computer interaction technologies. Recent advances in this field have been achieved by machine learning classifiers working in conjunction with time-frequency feature representations. This combination has achieved excellent accuracy for classification of discrete sounds. The ability to recognise sounds under real-world noisy conditions, called robust sound event classification, is an especially challenging task that has attracted recent research attention. Another aspect of real-word conditions is the classification of continuous, occluded or overlapping sounds, rather than classification of short isolated sound recordings. This paper addresses the classification of noise-corrupted, occluded, overlapped, continuous sound recordings. It first proposes a standard evaluation task for such sounds based upon a common existing method for evaluating isolated sound classification. It then benchmarks several high performing isolated sound classifiers to operate with continuous sound data by incorporating an energy-based event detection front end. Results are reported for each tested system using the new task, to provide the first analysis of their performance for continuous sound event detection. In addition it proposes and evaluates a novel Bayesian-inspired front end for the segmentation and detection of continuous sound recordings prior to classification. PMID:28892478
Continuous robust sound event classification using time-frequency features and deep learning.
McLoughlin, Ian; Zhang, Haomin; Xie, Zhipeng; Song, Yan; Xiao, Wei; Phan, Huy
2017-01-01
The automatic detection and recognition of sound events by computers is a requirement for a number of emerging sensing and human computer interaction technologies. Recent advances in this field have been achieved by machine learning classifiers working in conjunction with time-frequency feature representations. This combination has achieved excellent accuracy for classification of discrete sounds. The ability to recognise sounds under real-world noisy conditions, called robust sound event classification, is an especially challenging task that has attracted recent research attention. Another aspect of real-word conditions is the classification of continuous, occluded or overlapping sounds, rather than classification of short isolated sound recordings. This paper addresses the classification of noise-corrupted, occluded, overlapped, continuous sound recordings. It first proposes a standard evaluation task for such sounds based upon a common existing method for evaluating isolated sound classification. It then benchmarks several high performing isolated sound classifiers to operate with continuous sound data by incorporating an energy-based event detection front end. Results are reported for each tested system using the new task, to provide the first analysis of their performance for continuous sound event detection. In addition it proposes and evaluates a novel Bayesian-inspired front end for the segmentation and detection of continuous sound recordings prior to classification.
Hilbert, Kevin; Lueken, Ulrike; Muehlhan, Markus; Beesdo-Baum, Katja
2017-03-01
Generalized anxiety disorder (GAD) is difficult to recognize and hard to separate from major depression (MD) in clinical settings. Biomarkers might support diagnostic decisions. This study used machine learning on multimodal biobehavioral data from a sample of GAD, MD and healthy subjects to differentiate subjects with a disorder from healthy subjects (case-classification) and to differentiate GAD from MD (disorder-classification). Subjects with GAD ( n = 19), MD without GAD ( n = 14), and healthy comparison subjects ( n = 24) were included. The sample was matched regarding age, sex, handedness and education and free of psychopharmacological medication. Binary support vector machines were used within a nested leave-one-out cross-validation framework. Clinical questionnaires, cortisol release, gray matter (GM), and white matter (WM) volumes were used as input data separately and in combination. Questionnaire data were well-suited for case-classification but not disorder-classification (accuracies: 96.40%, p < .001; 56.58%, p > .22). The opposite pattern was found for imaging data (case-classification GM/WM: 58.71%, p = .09/43.18%, p > .66; disorder-classification GM/WM: 68.05%, p = .034/58.27%, p > .15) and for cortisol data (38.02%, p = .84; 74.60%, p = .009). All data combined achieved 90.10% accuracy ( p < .001) for case-classification and 67.46% accuracy ( p = .0268) for disorder-classification. In line with previous evidence, classification of GAD was difficult using clinical questionnaire data alone. Particularly cortisol and GM volume data were able to provide incremental value for the classification of GAD. Findings suggest that neurobiological biomarkers are a useful target for further research to delineate their potential contribution to diagnostic processes.
Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification.
Wen, Zaidao; Hou, Biao; Jiao, Licheng
2017-05-03
Linear synthesis model based dictionary learning framework has achieved remarkable performances in image classification in the last decade. Behaved as a generative feature model, it however suffers from some intrinsic deficiencies. In this paper, we propose a novel parametric nonlinear analysis cosparse model (NACM) with which a unique feature vector will be much more efficiently extracted. Additionally, we derive a deep insight to demonstrate that NACM is capable of simultaneously learning the task adapted feature transformation and regularization to encode our preferences, domain prior knowledge and task oriented supervised information into the features. The proposed NACM is devoted to the classification task as a discriminative feature model and yield a novel discriminative nonlinear analysis operator learning framework (DNAOL). The theoretical analysis and experimental performances clearly demonstrate that DNAOL will not only achieve the better or at least competitive classification accuracies than the state-of-the-art algorithms but it can also dramatically reduce the time complexities in both training and testing phases.
Pirooznia, Mehdi; Deng, Youping
2006-12-12
Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.
Moncada-Torres, A; Leuenberger, K; Gonzenbach, R; Luft, A; Gassert, R
2014-07-01
Miniature, wearable sensor modules are a promising technology to monitor activities of daily living (ADL) over extended periods of time. To assure both user compliance and meaningful results, the selection and placement site of sensors requires careful consideration. We investigated these aspects for the classification of 16 ADL in 6 healthy subjects under laboratory conditions using ReSense, our custom-made inertial measurement unit enhanced with a barometric pressure sensor used to capture activity-related altitude changes. Subjects wore a module on each wrist and ankle, and one on the trunk. Activities comprised whole body movements as well as gross and dextrous upper-limb activities. Wrist-module data outperformed the other locations for the three activity groups. Specifically, overall classification accuracy rates of almost 93% and more than 95% were achieved for the repeated holdout and user-specific validation methods, respectively, for all 16 activities. Including the altitude profile resulted in a considerable improvement of up to 20% in the classification accuracy for stair ascent and descent. The gyroscopes provided no useful information for activity classification under this scheme. The proposed sensor setting could allow for robust long-term activity monitoring with high compliance in different patient populations.
Classification of deadlift biomechanics with wearable inertial measurement units.
O'Reilly, Martin A; Whelan, Darragh F; Ward, Tomas E; Delahunt, Eamonn; Caulfield, Brian M
2017-06-14
The deadlift is a compound full-body exercise that is fundamental in resistance training, rehabilitation programs and powerlifting competitions. Accurate quantification of deadlift biomechanics is important to reduce the risk of injury and ensure training and rehabilitation goals are achieved. This study sought to develop and evaluate deadlift exercise technique classification systems utilising Inertial Measurement Units (IMUs), recording at 51.2Hz, worn on the lumbar spine, both thighs and both shanks. It also sought to compare classification quality when these IMUs are worn in combination and in isolation. Two datasets of IMU deadlift data were collected. Eighty participants first completed deadlifts with acceptable technique and 5 distinct, deliberately induced deviations from acceptable form. Fifty-five members of this group also completed a fatiguing protocol (3-Repition Maximum test) to enable the collection of natural deadlift deviations. For both datasets, universal and personalised random-forests classifiers were developed and evaluated. Personalised classifiers outperformed universal classifiers in accuracy, sensitivity and specificity in the binary classification of acceptable or aberrant technique and in the multi-label classification of specific deadlift deviations. Whilst recent research has favoured universal classifiers due to the reduced overhead in setting them up for new system users, this work demonstrates that such techniques may not be appropriate for classifying deadlift technique due to the poor accuracy achieved. However, personalised classifiers perform very well in assessing deadlift technique, even when using data derived from a single lumbar-worn IMU to detect specific naturally occurring technique mistakes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Classification of ASKAP Vast Radio Light Curves
NASA Technical Reports Server (NTRS)
Rebbapragada, Umaa; Lo, Kitty; Wagstaff, Kiri L.; Reed, Colorado; Murphy, Tara; Thompson, David R.
2012-01-01
The VAST survey is a wide-field survey that observes with unprecedented instrument sensitivity (0.5 mJy or lower) and repeat cadence (a goal of 5 seconds) that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Given the unprecedented observing characteristics of VAST, it is important to estimate source classification performance, and determine best practices prior to the launch of ASKAP's BETA in 2012. The goal of this study is to identify light curve characterization and classification algorithms that are best suited for archival VAST light curve classification. We perform our experiments on light curve simulations of eight source types and achieve best case performance of approximately 90% accuracy. We note that classification performance is most influenced by light curve characterization rather than classifier algorithm.
Land cover mapping after the tsunami event over Nanggroe Aceh Darussalam (NAD) province, Indonesia
NASA Astrophysics Data System (ADS)
Lim, H. S.; MatJafri, M. Z.; Abdullah, K.; Alias, A. N.; Mohd. Saleh, N.; Wong, C. J.; Surbakti, M. S.
2008-03-01
Remote sensing offers an important means of detecting and analyzing temporal changes occurring in our landscape. This research used remote sensing to quantify land use/land cover changes at the Nanggroe Aceh Darussalam (Nad) province, Indonesia on a regional scale. The objective of this paper is to assess the changed produced from the analysis of Landsat TM data. A Landsat TM image was used to develop land cover classification map for the 27 March 2005. Four supervised classifications techniques (Maximum Likelihood, Minimum Distance-to- Mean, Parallelepiped and Parallelepiped with Maximum Likelihood Classifier Tiebreaker classifier) were performed to the satellite image. Training sites and accuracy assessment were needed for supervised classification techniques. The training sites were established using polygons based on the colour image. High detection accuracy (>80%) and overall Kappa (>0.80) were achieved by the Parallelepiped with Maximum Likelihood Classifier Tiebreaker classifier in this study. This preliminary study has produced a promising result. This indicates that land cover mapping can be carried out using remote sensing classification method of the satellite digital imagery.
Better physical activity classification using smartphone acceleration sensor.
Arif, Muhammad; Bilal, Mohsin; Kattan, Ahmed; Ahamed, S Iqbal
2014-09-01
Obesity is becoming one of the serious problems for the health of worldwide population. Social interactions on mobile phones and computers via internet through social e-networks are one of the major causes of lack of physical activities. For the health specialist, it is important to track the record of physical activities of the obese or overweight patients to supervise weight loss control. In this study, acceleration sensor present in the smartphone is used to monitor the physical activity of the user. Physical activities including Walking, Jogging, Sitting, Standing, Walking upstairs and Walking downstairs are classified. Time domain features are extracted from the acceleration data recorded by smartphone during different physical activities. Time and space complexity of the whole framework is done by optimal feature subset selection and pruning of instances. Classification results of six physical activities are reported in this paper. Using simple time domain features, 99 % classification accuracy is achieved. Furthermore, attributes subset selection is used to remove the redundant features and to minimize the time complexity of the algorithm. A subset of 30 features produced more than 98 % classification accuracy for the six physical activities.
Thyroid Nodule Classification in Ultrasound Images by Fine-Tuning Deep Convolutional Neural Network.
Chi, Jianning; Walia, Ekta; Babyn, Paul; Wang, Jimmy; Groot, Gary; Eramian, Mark
2017-08-01
With many thyroid nodules being incidentally detected, it is important to identify as many malignant nodules as possible while excluding those that are highly likely to be benign from fine needle aspiration (FNA) biopsies or surgeries. This paper presents a computer-aided diagnosis (CAD) system for classifying thyroid nodules in ultrasound images. We use deep learning approach to extract features from thyroid ultrasound images. Ultrasound images are pre-processed to calibrate their scale and remove the artifacts. A pre-trained GoogLeNet model is then fine-tuned using the pre-processed image samples which leads to superior feature extraction. The extracted features of the thyroid ultrasound images are sent to a Cost-sensitive Random Forest classifier to classify the images into "malignant" and "benign" cases. The experimental results show the proposed fine-tuned GoogLeNet model achieves excellent classification performance, attaining 98.29% classification accuracy, 99.10% sensitivity and 93.90% specificity for the images in an open access database (Pedraza et al. 16), while 96.34% classification accuracy, 86% sensitivity and 99% specificity for the images in our local health region database.
Deep Multi-Task Learning for Tree Genera Classification
NASA Astrophysics Data System (ADS)
Ko, C.; Kang, J.; Sohn, G.
2018-05-01
The goal for our paper is to classify tree genera using airborne Light Detection and Ranging (LiDAR) data with Convolution Neural Network (CNN) - Multi-task Network (MTN) implementation. Unlike Single-task Network (STN) where only one task is assigned to the learning outcome, MTN is a deep learning architect for learning a main task (classification of tree genera) with other tasks (in our study, classification of coniferous and deciduous) simultaneously, with shared classification features. The main contribution of this paper is to improve classification accuracy from CNN-STN to CNN-MTN. This is achieved by introducing a concurrence loss (Lcd) to the designed MTN. This term regulates the overall network performance by minimizing the inconsistencies between the two tasks. Results show that we can increase the classification accuracy from 88.7 % to 91.0 % (from STN to MTN). The second goal of this paper is to solve the problem of small training sample size by multiple-view data generation. The motivation of this goal is to address one of the most common problems in implementing deep learning architecture, the insufficient number of training data. We address this problem by simulating training dataset with multiple-view approach. The promising results from this paper are providing a basis for classifying a larger number of dataset and number of classes in the future.
Real-time and simultaneous control of artificial limbs based on pattern recognition algorithms.
Ortiz-Catalan, Max; Håkansson, Bo; Brånemark, Rickard
2014-07-01
The prediction of simultaneous limb motions is a highly desirable feature for the control of artificial limbs. In this work, we investigate different classification strategies for individual and simultaneous movements based on pattern recognition of myoelectric signals. Our results suggest that any classifier can be potentially employed in the prediction of simultaneous movements if arranged in a distributed topology. On the other hand, classifiers inherently capable of simultaneous predictions, such as the multi-layer perceptron (MLP), were found to be more cost effective, as they can be successfully employed in their simplest form. In the prediction of individual movements, the one-vs-one (OVO) topology was found to improve classification accuracy across different classifiers and it was therefore used to benchmark the benefits of simultaneous control. As opposed to previous work reporting only offline accuracy, the classification performance and the resulting controllability are evaluated in real time using the motion test and target achievement control (TAC) test, respectively. We propose a simultaneous classification strategy based on MLP that outperformed a top classifier for individual movements (LDA-OVO), thus improving the state-of-the-art classification approach. Furthermore, all the presented classification strategies and data collected in this study are freely available in BioPatRec, an open source platform for the development of advanced prosthetic control strategies.
Advanced Steel Microstructural Classification by Deep Learning Methods.
Azimi, Seyed Majid; Britz, Dominik; Engstler, Michael; Fritz, Mario; Mücklich, Frank
2018-02-01
The inner structure of a material is called microstructure. It stores the genesis of a material and determines all its physical and chemical properties. While microstructural characterization is widely spread and well known, the microstructural classification is mostly done manually by human experts, which gives rise to uncertainties due to subjectivity. Since the microstructure could be a combination of different phases or constituents with complex substructures its automatic classification is very challenging and only a few prior studies exist. Prior works focused on designed and engineered features by experts and classified microstructures separately from the feature extraction step. Recently, Deep Learning methods have shown strong performance in vision applications by learning the features from data together with the classification step. In this work, we propose a Deep Learning method for microstructural classification in the examples of certain microstructural constituents of low carbon steel. This novel method employs pixel-wise segmentation via Fully Convolutional Neural Network (FCNN) accompanied by a max-voting scheme. Our system achieves 93.94% classification accuracy, drastically outperforming the state-of-the-art method of 48.89% accuracy. Beyond the strong performance of our method, this line of research offers a more robust and first of all objective way for the difficult task of steel quality appreciation.
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex. PMID:27500640
NASA Astrophysics Data System (ADS)
Adelabu, Samuel; Mutanga, Onisimo; Adam, Elhadi; Cho, Moses Azong
2013-01-01
Classification of different tree species in semiarid areas can be challenging as a result of the change in leaf structure and orientation due to soil moisture constraints. Tree species mapping is, however, a key parameter for forest management in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine leaning algorithms with limited training samples.We performed classification using random forest (RF) and support vector machines (SVM) based on EnMap box. The overall accuracies for classifying the five tree species was 88.75 and 85% for both SVM and RF, respectively. We also demonstrated that the new red-edge band in the RapidEye sensor has the potential for classifying tree species in semiarid environments when integrated with other standard bands. Similarly, we observed that where there are limited training samples, SVM is preferred over RF. Finally, we demonstrated that the two accuracy measures of quantity and allocation disagreement are simpler and more helpful for the vast majority of remote sensing classification process than the kappa coefficient. Overall, high species classification can be achieved using strategically located RapidEye bands integrated with advanced processing algorithms.
Lee, Ga-Young; Kim, Jeonghun; Kim, Ju Han; Kim, Kiwoong; Seong, Joon-Kyung
2014-01-01
Mobile healthcare applications are becoming a growing trend. Also, the prevalence of dementia in modern society is showing a steady growing trend. Among degenerative brain diseases that cause dementia, Alzheimer disease (AD) is the most common. The purpose of this study was to identify AD patients using magnetic resonance imaging in the mobile environment. We propose an incremental classification for mobile healthcare systems. Our classification method is based on incremental learning for AD diagnosis and AD prediction using the cortical thickness data and hippocampus shape. We constructed a classifier based on principal component analysis and linear discriminant analysis. We performed initial learning and mobile subject classification. Initial learning is the group learning part in our server. Our smartphone agent implements the mobile classification and shows various results. With use of cortical thickness data analysis alone, the discrimination accuracy was 87.33% (sensitivity 96.49% and specificity 64.33%). When cortical thickness data and hippocampal shape were analyzed together, the achieved accuracy was 87.52% (sensitivity 96.79% and specificity 63.24%). In this paper, we presented a classification method based on online learning for AD diagnosis by employing both cortical thickness data and hippocampal shape analysis data. Our method was implemented on smartphone devices and discriminated AD patients for normal group.
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex.
Multi-level discriminative dictionary learning with application to large scale image classification.
Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua
2015-10-01
The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.
Low-back electromyography (EMG) data-driven load classification for dynamic lifting tasks.
Totah, Deema; Ojeda, Lauro; Johnson, Daniel D; Gates, Deanna; Mower Provost, Emily; Barton, Kira
2018-01-01
Numerous devices have been designed to support the back during lifting tasks. To improve the utility of such devices, this research explores the use of preparatory muscle activity to classify muscle loading and initiate appropriate device activation. The goal of this study was to determine the earliest time window that enabled accurate load classification during a dynamic lifting task. Nine subjects performed thirty symmetrical lifts, split evenly across three weight conditions (no-weight, 10-lbs and 24-lbs), while low-back muscle activity data was collected. Seven descriptive statistics features were extracted from 100 ms windows of data. A multinomial logistic regression (MLR) classifier was trained and tested, employing leave-one subject out cross-validation, to classify lifted load values. Dimensionality reduction was achieved through feature cross-correlation analysis and greedy feedforward selection. The time of full load support by the subject was defined as load-onset. Regions of highest average classification accuracy started at 200 ms before until 200 ms after load-onset with average accuracies ranging from 80% (±10%) to 81% (±7%). The average recall for each class ranged from 69-92%. These inter-subject classification results indicate that preparatory muscle activity can be leveraged to identify the intent to lift a weight up to 100 ms prior to load-onset. The high accuracies shown indicate the potential to utilize intent classification for assistive device applications. Active assistive devices, e.g. exoskeletons, could prevent back injury by off-loading low-back muscles. Early intent classification allows more time for actuators to respond and integrate seamlessly with the user.
Yilmaz, E; Kayikcioglu, T; Kayipmaz, S
2017-07-01
In this article, we propose a decision support system for effective classification of dental periapical cyst and keratocystic odontogenic tumor (KCOT) lesions obtained via cone beam computed tomography (CBCT). CBCT has been effectively used in recent years for diagnosing dental pathologies and determining their boundaries and content. Unlike other imaging techniques, CBCT provides detailed and distinctive information about the pathologies by enabling a three-dimensional (3D) image of the region to be displayed. We employed 50 CBCT 3D image dataset files as the full dataset of our study. These datasets were identified by experts as periapical cyst and KCOT lesions according to the clinical, radiographic and histopathologic features. Segmentation operations were performed on the CBCT images using viewer software that we developed. Using the tools of this software, we marked the lesional volume of interest and calculated and applied the order statistics and 3D gray-level co-occurrence matrix for each CBCT dataset. A feature vector of the lesional region, including 636 different feature items, was created from those statistics. Six classifiers were used for the classification experiments. The Support Vector Machine (SVM) classifier achieved the best classification performance with 100% accuracy, and 100% F-score (F1) scores as a result of the experiments in which a ten-fold cross validation method was used with a forward feature selection algorithm. SVM achieved the best classification performance with 96.00% accuracy, and 96.00% F1 scores in the experiments in which a split sample validation method was used with a forward feature selection algorithm. SVM additionally achieved the best performance of 94.00% accuracy, and 93.88% F1 in which a leave-one-out (LOOCV) method was used with a forward feature selection algorithm. Based on the results, we determined that periapical cyst and KCOT lesions can be classified with a high accuracy with the models that we built using the new dataset selected for this study. The studies mentioned in this article, along with the selected 3D dataset, 3D statistics calculated from the dataset, and performance results of the different classifiers, comprise an important contribution to the field of computer-aided diagnosis of dental apical lesions. Copyright © 2017 Elsevier B.V. All rights reserved.
Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.
2016-01-01
Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832
NASA Astrophysics Data System (ADS)
Amit, Guy; Ben-Ari, Rami; Hadad, Omer; Monovich, Einat; Granot, Noa; Hashoul, Sharbell
2017-03-01
Diagnostic interpretation of breast MRI studies requires meticulous work and a high level of expertise. Computerized algorithms can assist radiologists by automatically characterizing the detected lesions. Deep learning approaches have shown promising results in natural image classification, but their applicability to medical imaging is limited by the shortage of large annotated training sets. In this work, we address automatic classification of breast MRI lesions using two different deep learning approaches. We propose a novel image representation for dynamic contrast enhanced (DCE) breast MRI lesions, which combines the morphological and kinetics information in a single multi-channel image. We compare two classification approaches for discriminating between benign and malignant lesions: training a designated convolutional neural network and using a pre-trained deep network to extract features for a shallow classifier. The domain-specific trained network provided higher classification accuracy, compared to the pre-trained model, with an area under the ROC curve of 0.91 versus 0.81, and an accuracy of 0.83 versus 0.71. Similar accuracy was achieved in classifying benign lesions, malignant lesions, and normal tissue images. The trained network was able to improve accuracy by using the multi-channel image representation, and was more robust to reductions in the size of the training set. A small-size convolutional neural network can learn to accurately classify findings in medical images using only a few hundred images from a few dozen patients. With sufficient data augmentation, such a network can be trained to outperform a pre-trained out-of-domain classifier. Developing domain-specific deep-learning models for medical imaging can facilitate technological advancements in computer-aided diagnosis.
Ground-based cloud classification by learning stable local binary patterns
NASA Astrophysics Data System (ADS)
Wang, Yu; Shi, Cunzhao; Wang, Chunheng; Xiao, Baihua
2018-07-01
Feature selection and extraction is the first step in implementing pattern classification. The same is true for ground-based cloud classification. Histogram features based on local binary patterns (LBPs) are widely used to classify texture images. However, the conventional uniform LBP approach cannot capture all the dominant patterns in cloud texture images, thereby resulting in low classification performance. In this study, a robust feature extraction method by learning stable LBPs is proposed based on the averaged ranks of the occurrence frequencies of all rotation invariant patterns defined in the LBPs of cloud images. The proposed method is validated with a ground-based cloud classification database comprising five cloud types. Experimental results demonstrate that the proposed method achieves significantly higher classification accuracy than the uniform LBP, local texture patterns (LTP), dominant LBP (DLBP), completed LBP (CLTP) and salient LBP (SaLBP) methods in this cloud image database and under different noise conditions. And the performance of the proposed method is comparable with that of the popular deep convolutional neural network (DCNN) method, but with less computation complexity. Furthermore, the proposed method also achieves superior performance on an independent test data set.
Youn, Su Hyun; Sim, Taeyong; Choi, Ahnryul; Song, Jinsung; Shin, Ki Young; Lee, Il Kwon; Heo, Hyun Mu; Lee, Daeweon; Mun, Joung Hwan
2015-06-01
Ultrasonic surgical units (USUs) have the advantage of minimizing tissue damage during surgeries that require tissue dissection by reducing problems such as coagulation and unwanted carbonization, but the disadvantage of requiring manual adjustment of power output according to the target tissue. In order to overcome this limitation, it is necessary to determine the properties of in vivo tissues automatically. We propose a multi-classifier that can accurately classify tissues based on the unique impedance of each tissue. For this purpose, a multi-classifier was built based on single classifiers with high classification rates, and the classification accuracy of the proposed model was compared with that of single classifiers for various electrode types (Type-I: 6 mm invasive; Type-II: 3 mm invasive; Type-III: surface). The sensitivity and positive predictive value (PPV) of the multi-classifier by cross checks were determined. According to the 10-fold cross validation results, the classification accuracy of the proposed model was significantly higher (p<0.05 or <0.01) than that of existing single classifiers for all electrode types. In particular, the classification accuracy of the proposed model was highest when the 3mm invasive electrode (Type-II) was used (sensitivity=97.33-100.00%; PPV=96.71-100.00%). The results of this study are an important contribution to achieving automatic optimal output power adjustment of USUs according to the properties of individual tissues. Copyright © 2015 Elsevier Ltd. All rights reserved.
Bricher, Phillippa K.; Lucieer, Arko; Shaw, Justine; Terauds, Aleks; Bergstrom, Dana M.
2013-01-01
Monitoring changes in the distribution and density of plant species often requires accurate and high-resolution baseline maps of those species. Detecting such change at the landscape scale is often problematic, particularly in remote areas. We examine a new technique to improve accuracy and objectivity in mapping vegetation, combining species distribution modelling and satellite image classification on a remote sub-Antarctic island. In this study, we combine spectral data from very high resolution WorldView-2 satellite imagery and terrain variables from a high resolution digital elevation model to improve mapping accuracy, in both pixel- and object-based classifications. Random forest classification was used to explore the effectiveness of these approaches on mapping the distribution of the critically endangered cushion plant Azorella macquariensis Orchard (Apiaceae) on sub-Antarctic Macquarie Island. Both pixel- and object-based classifications of the distribution of Azorella achieved very high overall validation accuracies (91.6–96.3%, κ = 0.849–0.924). Both two-class and three-class classifications were able to accurately and consistently identify the areas where Azorella was absent, indicating that these maps provide a suitable baseline for monitoring expected change in the distribution of the cushion plants. Detecting such change is critical given the threats this species is currently facing under altering environmental conditions. The method presented here has applications to monitoring a range of species, particularly in remote and isolated environments. PMID:23940805
Epithelial cancer detection by oblique-incidence optical spectroscopy
NASA Astrophysics Data System (ADS)
Garcia-Uribe, Alejandro; Balareddy, Karthik C.; Zou, Jun; Wang, Kenneth K.; Duvic, Madeleine; Wang, Lihong V.
2009-02-01
This paper presents a study on non-invasive detection of two common epithelial cancers (skin and esophagus) based on oblique incidence diffuse reflectance spectroscopy (OIDRS). An OIDRS measurement system, which combines fiber optics and MEMS technologies, was developed. In our pilot studies, a total number of 137 cases have been measured in-vivo for skin cancer detection and a total number of 20 biopsy samples have been measured ex-vivo for esophageal cancer detection. To automatically differentiate the cancerous cases from benign ones, a statistical software classification program was also developed. An overall classification accuracy of 90% and 100% has been achieved for skin and esophageal cancer classification, respectively.
Supervised segmentation of microelectrode recording artifacts using power spectral density.
Bakstein, Eduard; Schneider, Jakub; Sieger, Tomas; Novak, Daniel; Wild, Jiri; Jech, Robert
2015-08-01
Appropriate detection of clean signal segments in extracellular microelectrode recordings (MER) is vital for maintaining high signal-to-noise ratio in MER studies. Existing alternatives to manual signal inspection are based on unsupervised change-point detection. We present a method of supervised MER artifact classification, based on power spectral density (PSD) and evaluate its performance on a database of 95 labelled MER signals. The proposed method yielded test-set accuracy of 90%, which was close to the accuracy of annotation (94%). The unsupervised methods achieved accuracy of about 77% on both training and testing data.
A Brain-Machine Interface Based on ERD/ERS for an Upper-Limb Exoskeleton Control.
Tang, Zhichuan; Sun, Shouqian; Zhang, Sanyuan; Chen, Yumiao; Li, Chao; Chen, Shi
2016-12-02
To recognize the user's motion intention, brain-machine interfaces (BMI) usually decode movements from cortical activity to control exoskeletons and neuroprostheses for daily activities. The aim of this paper is to investigate whether self-induced variations of the electroencephalogram (EEG) can be useful as control signals for an upper-limb exoskeleton developed by us. A BMI based on event-related desynchronization/synchronization (ERD/ERS) is proposed. In the decoder-training phase, we investigate the offline classification performance of left versus right hand and left hand versus both feet by using motor execution (ME) or motor imagery (MI). The results indicate that the accuracies of ME sessions are higher than those of MI sessions, and left hand versus both feet paradigm achieves a better classification performance, which would be used in the online-control phase. In the online-control phase, the trained decoder is tested in two scenarios (wearing or without wearing the exoskeleton). The MI and ME sessions wearing the exoskeleton achieve mean classification accuracy of 84.29% ± 2.11% and 87.37% ± 3.06%, respectively. The present study demonstrates that the proposed BMI is effective to control the upper-limb exoskeleton, and provides a practical method by non-invasive EEG signal associated with human natural behavior for clinical applications.
Breast density characterization using texton distributions.
Petroudi, Styliani; Brady, Michael
2011-01-01
Breast density has been shown to be one of the most significant risks for developing breast cancer, with women with dense breasts at four to six times higher risk. The Breast Imaging Reporting and Data System (BI-RADS) has a four class classification scheme that describes the different breast densities. However, there is great inter and intra observer variability among clinicians in reporting a mammogram's density class. This work presents a novel texture classification method and its application for the development of a completely automated breast density classification system. The new method represents the mammogram using textons, which can be thought of as the building blocks of texture under the operational definition of Leung and Malik as clustered filter responses. The new proposed method characterizes the mammographic appearance of the different density patterns by evaluating the texton spatial dependence matrix (TDSM) in the breast region's corresponding texton map. The TSDM is a texture model that captures both statistical and structural texture characteristics. The normalized TSDM matrices are evaluated for mammograms from the different density classes and corresponding texture models are established. Classification is achieved using a chi-square distance measure. The fully automated TSDM breast density classification method is quantitatively evaluated on mammograms from all density classes from the Oxford Mammogram Database. The incorporation of texton spatial dependencies allows for classification accuracy reaching over 82%. The breast density classification accuracy is better using texton TSDM compared to simple texton histograms.
NASA Astrophysics Data System (ADS)
Yang, Huijuan; Guan, Cuntai; Sui Geok Chua, Karen; San Chok, See; Wang, Chuan Chu; Kok Soon, Phua; Tang, Christina Ka Yin; Keng Ang, Kai
2014-06-01
Objective. Detection of motor imagery of hand/arm has been extensively studied for stroke rehabilitation. This paper firstly investigates the detection of motor imagery of swallow (MI-SW) and motor imagery of tongue protrusion (MI-Ton) in an attempt to find a novel solution for post-stroke dysphagia rehabilitation. Detection of MI-SW from a simple yet relevant modality such as MI-Ton is then investigated, motivated by the similarity in activation patterns between tongue movements and swallowing and there being fewer movement artifacts in performing tongue movements compared to swallowing. Approach. Novel features were extracted based on the coefficients of the dual-tree complex wavelet transform to build multiple training models for detecting MI-SW. The session-to-session classification accuracy was boosted by adaptively selecting the training model to maximize the ratio of between-classes distances versus within-class distances, using features of training and evaluation data. Main results. Our proposed method yielded averaged cross-validation (CV) classification accuracies of 70.89% and 73.79% for MI-SW and MI-Ton for ten healthy subjects, which are significantly better than the results from existing methods. In addition, averaged CV accuracies of 66.40% and 70.24% for MI-SW and MI-Ton were obtained for one stroke patient, demonstrating the detectability of MI-SW and MI-Ton from the idle state. Furthermore, averaged session-to-session classification accuracies of 72.08% and 70% were achieved for ten healthy subjects and one stroke patient using the MI-Ton model. Significance. These results and the subjectwise strong correlations in classification accuracies between MI-SW and MI-Ton demonstrated the feasibility of detecting MI-SW from MI-Ton models.
A Novel Energy-Efficient Approach for Human Activity Recognition
Zheng, Lingxiang; Wu, Dihong; Ruan, Xiaoyang; Weng, Shaolin; Tang, Biyu; Lu, Hai; Shi, Haibin
2017-01-01
In this paper, we propose a novel energy-efficient approach for mobile activity recognition system (ARS) to detect human activities. The proposed energy-efficient ARS, using low sampling rates, can achieve high recognition accuracy and low energy consumption. A novel classifier that integrates hierarchical support vector machine and context-based classification (HSVMCC) is presented to achieve a high accuracy of activity recognition when the sampling rate is less than the activity frequency, i.e., the Nyquist sampling theorem is not satisfied. We tested the proposed energy-efficient approach with the data collected from 20 volunteers (14 males and six females) and the average recognition accuracy of around 96.0% was achieved. Results show that using a low sampling rate of 1Hz can save 17.3% and 59.6% of energy compared with the sampling rates of 5 Hz and 50 Hz. The proposed low sampling rate approach can greatly reduce the power consumption while maintaining high activity recognition accuracy. The composition of power consumption in online ARS is also investigated in this paper. PMID:28885560
NASA Astrophysics Data System (ADS)
Ma, Weiwei; Gong, Cailan; Hu, Yong; Meng, Peng; Xu, Feifei
2013-08-01
Hyperspectral data, consisting of hundreds of spectral bands with a high spectral resolution, enables acquisition of continuous spectral characteristic curves, and therefore have served as a powerful tool for vegetation classification. The difficulty of using hyperspectral data is that they are usually redundant, strongly correlated and subject to Hughes phenomenon where classification accuracy increases gradually in the beginning as the number of spectral bands or dimensions increases, but decreases dramatically when the band number reaches some value. In recent years,some algorithms have been proposed to overcome the Hughes phenomenon in classification, such as selecting several bands from full bands, PCA- and MNF-based feature transformations. Up to date, however, few studies have been conducted to investigate the turning point of Hughes phenomenon (i.e., the point at which the classification accuracy begins to decline). In this paper, we firstly analyze reasons for occurrence of Hughes phenomenon, and then based on the Mahalanobis classifier, classify the ground spectrum of several grasslands which were recorded in September 2012 using FieldSpec3 spectrometer in the regions around Qinghai Lake,a important pasturing area in the north of China. Before classification, we extract features from hyperspectral data by bands selecting and PCA- based feature transformations, and In the process of classification, we analyze how the correlation coefficient between wavebands, the number of waveband channels and the number of principal components affect the classification result. The results show that Hushes phenomenon may occur when the correlation coefficient between wavebands is greater than 94%,the number of wavebands is greater than 6, or the number of principal components is greater than 6. Best classification result can be achieved (overall accuracy of grasslands 90%) if the number of wavebands equals to 3 (the band positions are 370nm, 509nm and 886nm respectively) or the number of principal components ranges from 4 to 6.
Classification Accuracy Increase Using Multisensor Data Fusion
NASA Astrophysics Data System (ADS)
Makarau, A.; Palubinskas, G.; Reinartz, P.
2011-09-01
The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.) but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the confusion of materials such as different roofs, pavements, roads, etc. and therefore may provide wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but their low spatial resolution (comparing to multispectral data) restrict their usage for many applications. Another improvement can be achieved by fusion approaches of multisensory data since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of multisource data combination following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allow for different types of urban objects to be classified into predefined classes of interest with increased accuracy. The comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided and the numerical evaluation of the method in comparison to other established methods illustrates the advantage in the classification accuracy for many classes such as buildings, low vegetation, sport objects, forest, roads, rail roads, etc.
User-Independent Motion State Recognition Using Smartphone Sensors
Gu, Fuqiang; Kealy, Allison; Khoshelham, Kourosh; Shang, Jianga
2015-01-01
The recognition of locomotion activities (e.g., walking, running, still) is important for a wide range of applications like indoor positioning, navigation, location-based services, and health monitoring. Recently, there has been a growing interest in activity recognition using accelerometer data. However, when utilizing only acceleration-based features, it is difficult to differentiate varying vertical motion states from horizontal motion states especially when conducting user-independent classification. In this paper, we also make use of the newly emerging barometer built in modern smartphones, and propose a novel feature called pressure derivative from the barometer readings for user motion state recognition, which is proven to be effective for distinguishing vertical motion states and does not depend on specific users’ data. Seven types of motion states are defined and six commonly-used classifiers are compared. In addition, we utilize the motion state history and the characteristics of people’s motion to improve the classification accuracies of those classifiers. Experimental results show that by using the historical information and human’s motion characteristics, we can achieve user-independent motion state classification with an accuracy of up to 90.7%. In addition, we analyze the influence of the window size and smartphone pose on the accuracy. PMID:26690163
User-Independent Motion State Recognition Using Smartphone Sensors.
Gu, Fuqiang; Kealy, Allison; Khoshelham, Kourosh; Shang, Jianga
2015-12-04
The recognition of locomotion activities (e.g., walking, running, still) is important for a wide range of applications like indoor positioning, navigation, location-based services, and health monitoring. Recently, there has been a growing interest in activity recognition using accelerometer data. However, when utilizing only acceleration-based features, it is difficult to differentiate varying vertical motion states from horizontal motion states especially when conducting user-independent classification. In this paper, we also make use of the newly emerging barometer built in modern smartphones, and propose a novel feature called pressure derivative from the barometer readings for user motion state recognition, which is proven to be effective for distinguishing vertical motion states and does not depend on specific users' data. Seven types of motion states are defined and six commonly-used classifiers are compared. In addition, we utilize the motion state history and the characteristics of people's motion to improve the classification accuracies of those classifiers. Experimental results show that by using the historical information and human's motion characteristics, we can achieve user-independent motion state classification with an accuracy of up to 90.7%. In addition, we analyze the influence of the window size and smartphone pose on the accuracy.
Efficient and self-adaptive in-situ learning in multilayer memristor neural networks.
Li, Can; Belkin, Daniel; Li, Yunning; Yan, Peng; Hu, Miao; Ge, Ning; Jiang, Hao; Montgomery, Eric; Lin, Peng; Wang, Zhongrui; Song, Wenhao; Strachan, John Paul; Barnell, Mark; Wu, Qing; Williams, R Stanley; Yang, J Joshua; Xia, Qiangfei
2018-06-19
Memristors with tunable resistance states are emerging building blocks of artificial neural networks. However, in situ learning on a large-scale multiple-layer memristor network has yet to be demonstrated because of challenges in device property engineering and circuit integration. Here we monolithically integrate hafnium oxide-based memristors with a foundry-made transistor array into a multiple-layer neural network. We experimentally demonstrate in situ learning capability and achieve competitive classification accuracy on a standard machine learning dataset, which further confirms that the training algorithm allows the network to adapt to hardware imperfections. Our simulation using the experimental parameters suggests that a larger network would further increase the classification accuracy. The memristor neural network is a promising hardware platform for artificial intelligence with high speed-energy efficiency.
Kruskal-Wallis-based computationally efficient feature selection for face recognition.
Ali Khan, Sajid; Hussain, Ayyaz; Basit, Abdul; Akram, Sheeraz
2014-01-01
Face recognition in today's technological world, and face recognition applications attain much more importance. Most of the existing work used frontal face images to classify face image. However these techniques fail when applied on real world face images. The proposed technique effectively extracts the prominent facial features. Most of the features are redundant and do not contribute to representing face. In order to eliminate those redundant features, computationally efficient algorithm is used to select the more discriminative face features. Extracted features are then passed to classification step. In the classification step, different classifiers are ensemble to enhance the recognition accuracy rate as single classifier is unable to achieve the high accuracy. Experiments are performed on standard face database images and results are compared with existing techniques.
Younghak Shin; Balasingham, Ilangko
2017-07-01
Colonoscopy is a standard method for screening polyps by highly trained physicians. Miss-detected polyps in colonoscopy are potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches named hand-craft feature method and convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand craft feature extraction and support vector machine (SVM) method is adopted for classification. For CNN approach, three convolution and pooling based deep learning framework is used for classification purpose. The proposed framework is evaluated using three public polyp databases. From the experimental results, we have shown that the CNN based deep learning framework shows better classification performance than the hand-craft feature based methods. It achieves over 90% of classification accuracy, sensitivity, specificity and precision.
Combining multiple features for color texture classification
NASA Astrophysics Data System (ADS)
Cusano, Claudio; Napoletano, Paolo; Schettini, Raimondo
2016-11-01
The analysis of color and texture has a long history in image analysis and computer vision. These two properties are often considered as independent, even though they are strongly related in images of natural objects and materials. Correlation between color and texture information is especially relevant in the case of variable illumination, a condition that has a crucial impact on the effectiveness of most visual descriptors. We propose an ensemble of hand-crafted image descriptors designed to capture different aspects of color textures. We show that the use of these descriptors in a multiple classifiers framework makes it possible to achieve a very high classification accuracy in classifying texture images acquired under different lighting conditions. A powerful alternative to hand-crafted descriptors is represented by features obtained with deep learning methods. We also show how the proposed combining strategy hand-crafted and convolutional neural networks features can be used together to further improve the classification accuracy. Experimental results on a food database (raw food texture) demonstrate the effectiveness of the proposed strategy.
VO2 estimation using 6-axis motion sensor with sports activity classification.
Nagata, Takashi; Nakamura, Naoteru; Miyatake, Masato; Yuuki, Akira; Yomo, Hiroyuki; Kawabata, Takashi; Hara, Shinsuke
2016-08-01
In this paper, we focus on oxygen consumption (VO2) estimation using 6-axis motion sensor (3-axis accelerometer and 3-axis gyroscope) for people playing sports with diverse intensities. The VO2 estimated with a small motion sensor can be used to calculate the energy expenditure, however, its accuracy depends on the intensities of various types of activities. In order to achieve high accuracy over a wide range of intensities, we employ an estimation framework that first classifies activities with a simple machine-learning based classification algorithm. We prepare different coefficients of linear regression model for different types of activities, which are determined with training data obtained by experiments. The best-suited model is used for each type of activity when VO2 is estimated. The accuracy of the employed framework depends on the trade-off between the degradation due to classification errors and improvement brought by applying separate, optimum model to VO2 estimation. Taking this trade-off into account, we evaluate the accuracy of the employed estimation framework by using a set of experimental data consisting of VO2 and motion data of people with a wide range of intensities of exercises, which were measured by a VO2 meter and motion sensor, respectively. Our numerical results show that the employed framework can improve the estimation accuracy in comparison to a reference method that uses a common regression model for all types of activities.
Refining Time-Activity Classification of Human Subjects Using the Global Positioning System.
Hu, Maogui; Li, Wei; Li, Lianfa; Houston, Douglas; Wu, Jun
2016-01-01
Detailed spatial location information is important in accurately estimating personal exposure to air pollution. Global Position System (GPS) has been widely used in tracking personal paths and activities. Previous researchers have developed time-activity classification models based on GPS data, most of them were developed for specific regions. An adaptive model for time-location classification can be widely applied to air pollution studies that use GPS to track individual level time-activity patterns. Time-activity data were collected for seven days using GPS loggers and accelerometers from thirteen adult participants from Southern California under free living conditions. We developed an automated model based on random forests to classify major time-activity patterns (i.e. indoor, outdoor-static, outdoor-walking, and in-vehicle travel). Sensitivity analysis was conducted to examine the contribution of the accelerometer data and the supplemental spatial data (i.e. roadway and tax parcel data) to the accuracy of time-activity classification. Our model was evaluated using both leave-one-fold-out and leave-one-subject-out methods. Maximum speeds in averaging time intervals of 7 and 5 minutes, and distance to primary highways with limited access were found to be the three most important variables in the classification model. Leave-one-fold-out cross-validation showed an overall accuracy of 99.71%. Sensitivities varied from 84.62% (outdoor walking) to 99.90% (indoor). Specificities varied from 96.33% (indoor) to 99.98% (outdoor static). The exclusion of accelerometer and ambient light sensor variables caused a slight loss in sensitivity for outdoor walking, but little loss in overall accuracy. However, leave-one-subject-out cross-validation showed considerable loss in sensitivity for outdoor static and outdoor walking conditions. The random forests classification model can achieve high accuracy for the four major time-activity categories. The model also performed well with just GPS, road and tax parcel data. However, caution is warranted when generalizing the model developed from a small number of subjects to other populations.
Lundervold, Astri J; Bøe, Tormod; Lundervold, Arvid
2017-01-01
Inattention in childhood is associated with academic problems later in life. The contribution of specific aspects of inattentive behaviour is, however, less known. We investigated feature importance of primary school teachers' reports on nine aspects of inattentive behaviour, gender and age in predicting future academic achievement. Primary school teachers of n = 2491 children (7-9 years) rated nine items reflecting different aspects of inattentive behaviour in 2002. A mean academic achievement score from the previous semester in high school (2012) was available for each youth from an official school register. All scores were at a categorical level. Feature importances were assessed by using multinominal logistic regression, classification and regression trees analysis, and a random forest algorithm. Finally, a comprehensive pattern classification procedure using k-fold cross-validation was implemented. Overall, inattention was rated as more severe in boys, who also obtained lower academic achievement scores in high school than girls. Problems related to sustained attention and distractibility were together with age and gender defined as the most important features to predict future achievement scores. Using these four features as input to a collection of classifiers employing k-fold cross-validation for prediction of academic achievement level, we obtained classification accuracy, precision and recall that were clearly better than chance levels. Primary school teachers' reports of problems related to sustained attention and distractibility were identified as the two most important features of inattentive behaviour predicting academic achievement in high school. Identification and follow-up procedures of primary school children showing these characteristics should be prioritised to prevent future academic failure.
A multiple-point spatially weighted k-NN method for object-based classification
NASA Astrophysics Data System (ADS)
Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.
2016-10-01
Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.
NASA Astrophysics Data System (ADS)
Liu, Tao; Abd-Elrahman, Amr
2018-05-01
Deep convolutional neural network (DCNN) requires massive training datasets to trigger its image classification power, while collecting training samples for remote sensing application is usually an expensive process. When DCNN is simply implemented with traditional object-based image analysis (OBIA) for classification of Unmanned Aerial systems (UAS) orthoimage, its power may be undermined if the number training samples is relatively small. This research aims to develop a novel OBIA classification approach that can take advantage of DCNN by enriching the training dataset automatically using multi-view data. Specifically, this study introduces a Multi-View Object-based classification using Deep convolutional neural network (MODe) method to process UAS images for land cover classification. MODe conducts the classification on multi-view UAS images instead of directly on the orthoimage, and gets the final results via a voting procedure. 10-fold cross validation results show the mean overall classification accuracy increasing substantially from 65.32%, when DCNN was applied on the orthoimage to 82.08% achieved when MODe was implemented. This study also compared the performances of the support vector machine (SVM) and random forest (RF) classifiers with DCNN under traditional OBIA and the proposed multi-view OBIA frameworks. The results indicate that the advantage of DCNN over traditional classifiers in terms of accuracy is more obvious when these classifiers were applied with the proposed multi-view OBIA framework than when these classifiers were applied within the traditional OBIA framework.
NASA Astrophysics Data System (ADS)
Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.
2012-01-01
The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimensions with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive in terms of both classification accuracy and computational costs/times with classical LDA. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. From a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, consisting of uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principle component analysis (PCA), and classical LDA. Based on a 7-dimension time domain and time-scale feature vectors, these methods achieved respectively 95.2% and 93.2% classification accuracy by using a linear discriminant classifier.
Convolutional neural networks for vibrational spectroscopic data analysis.
Acquarelli, Jacopo; van Laarhoven, Twan; Gerretzen, Jan; Tran, Thanh N; Buydens, Lutgarde M C; Marchiori, Elena
2017-02-15
In this work we show that convolutional neural networks (CNNs) can be efficiently used to classify vibrational spectroscopic data and identify important spectral regions. CNNs are the current state-of-the-art in image classification and speech recognition and can learn interpretable representations of the data. These characteristics make CNNs a good candidate for reducing the need for preprocessing and for highlighting important spectral regions, both of which are crucial steps in the analysis of vibrational spectroscopic data. Chemometric analysis of vibrational spectroscopic data often relies on preprocessing methods involving baseline correction, scatter correction and noise removal, which are applied to the spectra prior to model building. Preprocessing is a critical step because even in simple problems using 'reasonable' preprocessing methods may decrease the performance of the final model. We develop a new CNN based method and provide an accompanying publicly available software. It is based on a simple CNN architecture with a single convolutional layer (a so-called shallow CNN). Our method outperforms standard classification algorithms used in chemometrics (e.g. PLS) in terms of accuracy when applied to non-preprocessed test data (86% average accuracy compared to the 62% achieved by PLS), and it achieves better performance even on preprocessed test data (96% average accuracy compared to the 89% achieved by PLS). For interpretability purposes, our method includes a procedure for finding important spectral regions, thereby facilitating qualitative interpretation of results. Copyright © 2016 Elsevier B.V. All rights reserved.
Radio Galaxy Zoo: compact and extended radio source classification with deep learning
NASA Astrophysics Data System (ADS)
Lukic, V.; Brüggen, M.; Banfield, J. K.; Wong, O. I.; Rudnick, L.; Norris, R. P.; Simmons, B.
2018-05-01
Machine learning techniques have been increasingly useful in astronomical applications over the last few years, for example in the morphological classification of galaxies. Convolutional neural networks have proven to be highly effective in classifying objects in image data. In the context of radio-interferometric imaging in astronomy, we looked for ways to identify multiple components of individual sources. To this effect, we design a convolutional neural network to differentiate between different morphology classes using sources from the Radio Galaxy Zoo (RGZ) citizen science project. In this first step, we focus on exploring the factors that affect the performance of such neural networks, such as the amount of training data, number and nature of layers, and the hyperparameters. We begin with a simple experiment in which we only differentiate between two extreme morphologies, using compact and multiple-component extended sources. We found that a three-convolutional layer architecture yielded very good results, achieving a classification accuracy of 97.4 per cent on a test data set. The same architecture was then tested on a four-class problem where we let the network classify sources into compact and three classes of extended sources, achieving a test accuracy of 93.5 per cent. The best-performing convolutional neural network set-up has been verified against RGZ Data Release 1 where a final test accuracy of 94.8 per cent was obtained, using both original and augmented images. The use of sigma clipping does not offer a significant benefit overall, except in cases with a small number of training images.
Brain amyloidosis ascertainment from cognitive, imaging, and peripheral blood protein measures
Hwang, Kristy S.; Avila, David; Elashoff, David; Kohannim, Omid; Teng, Edmond; Sokolow, Sophie; Jack, Clifford R.; Jagust, William J.; Shaw, Leslie; Trojanowski, John Q.; Weiner, Michael W.; Thompson, Paul M.
2015-01-01
Background: The goal of this study was to identify a clinical biomarker signature of brain amyloidosis in the Alzheimer's Disease Neuroimaging Initiative 1 (ADNI1) mild cognitive impairment (MCI) cohort. Methods: We developed a multimodal biomarker classifier for predicting brain amyloidosis using cognitive, imaging, and peripheral blood protein ADNI1 MCI data. We used CSF β-amyloid 1–42 (Aβ42) ≤192 pg/mL as proxy measure for Pittsburgh compound B (PiB)-PET standard uptake value ratio ≥1.5. We trained our classifier in the subcohort with CSF Aβ42 but no PiB-PET data and tested its performance in the subcohort with PiB-PET but no CSF Aβ42 data. We also examined the utility of our biomarker signature for predicting disease progression from MCI to Alzheimer dementia. Results: The CSF training classifier selected Mini-Mental State Examination, Trails B, Auditory Verbal Learning Test delayed recall, education, APOE genotype, interleukin 6 receptor, clusterin, and ApoE protein, and achieved leave-one-out accuracy of 85% (area under the curve [AUC] = 0.8). The PiB testing classifier achieved an AUC of 0.72, and when classifier self-tuning was allowed, AUC = 0.74. The 36-month disease-progression classifier achieved AUC = 0.75 and accuracy = 71%. Conclusions: Automated classifiers based on cognitive and peripheral blood protein variables can identify the presence of brain amyloidosis with a modest level of accuracy. Such methods could have implications for clinical trial design and enrollment in the near future. Classification of evidence: This study provides Class II evidence that a classification algorithm based on cognitive, imaging, and peripheral blood protein measures identifies patients with brain amyloid on PiB-PET with moderate accuracy (sensitivity 68%, specificity 78%). PMID:25609767
Besga, Ariadna; Gonzalez, Itxaso; Echeburua, Enrique; Savio, Alexandre; Ayerdi, Borja; Chyzhyk, Darya; Madrigal, Jose L M; Leza, Juan C; Graña, Manuel; Gonzalez-Pinto, Ana Maria
2015-01-01
Late onset bipolar disorder (LOBD) is often difficult to distinguish from degenerative dementias, such as Alzheimer disease (AD), due to comorbidities and common cognitive symptoms. Moreover, LOBD prevalence in the elder population is not negligible and it is increasing. Both pathologies share pathophysiological neuroinflammation features. Improvements in differential diagnosis of LOBD and AD will help to select the best personalized treatment. The aim of this study is to assess the relative significance of clinical observations, neuropsychological tests, and specific blood plasma biomarkers (inflammatory and neurotrophic), separately and combined, in the differential diagnosis of LOBD versus AD. It was carried out evaluating the accuracy achieved by classification-based computer-aided diagnosis (CAD) systems based on these variables. A sample of healthy controls (HC) (n = 26), AD patients (n = 37), and LOBD patients (n = 32) was recruited at the Alava University Hospital. Clinical observations, neuropsychological tests, and plasma biomarkers were measured at recruitment time. We applied multivariate machine learning classification methods to discriminate subjects from HC, AD, and LOBD populations in the study. We analyzed, for each classification contrast, feature sets combining clinical observations, neuropsychological measures, and biological markers, including inflammation biomarkers. Furthermore, we analyzed reduced feature sets containing variables with significative differences determined by a Welch's t-test. Furthermore, a battery of classifier architectures were applied, encompassing linear and non-linear Support Vector Machines (SVM), Random Forests (RF), Classification and regression trees (CART), and their performance was evaluated in a leave-one-out (LOO) cross-validation scheme. Post hoc analysis of Gini index in CART classifiers provided a measure of each variable importance. Welch's t-test found one biomarker (Malondialdehyde) with significative differences (p < 0.001) in LOBD vs. AD contrast. Classification results with the best features are as follows: discrimination of HC vs. AD patients reaches accuracy 97.21% and AUC 98.17%. Discrimination of LOBD vs. AD patients reaches accuracy 90.26% and AUC 89.57%. Discrimination of HC vs LOBD patients achieves accuracy 95.76% and AUC 88.46%. It is feasible to build CAD systems for differential diagnosis of LOBD and AD on the basis of a reduced set of clinical variables. Clinical observations provide the greatest discrimination. Neuropsychological tests are improved by the addition of biomarkers, and both contribute significantly to improve the overall predictive performance.
NASA Astrophysics Data System (ADS)
Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.
2011-12-01
Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and Ramsar wetland conservation site, from 30-m Landsat TM scene at the peak of spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes 900-9000 m2); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines) and c) two hierarchical levels of vegetation classification, a generalized 3-class set and more detailed 6-class set. We found that classification benefited from object-based approach which allowed including object shape, texture and context descriptors in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy. The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because individual classes differed in scales at which they were best discriminated from others. Main classification challenges included a) presence of C3 grasses in C4-grass areas, particularly following harvesting of C4 reeds and b) mixtures of emergent, floating and submerged aquatic plants at sub-object and sub-pixel scales. We conclude that OBIA with advanced statistical classifiers offers useful instruments for landscape vegetation analyses, and that spatial scale considerations are critical in mapping PFTs, while multi-scale comparisons can be used to guide class selection. Future work will further apply fuzzy classification and field-collected spectral data for PFT analysis and compare results with MODIS PFT products.
Assessment of the Thematic Accuracy of Land Cover Maps
NASA Astrophysics Data System (ADS)
Höhle, J.
2015-08-01
Several land cover maps are generated from aerial imagery and assessed by different approaches. The test site is an urban area in Europe for which six classes (`building', `hedge and bush', `grass', `road and parking lot', `tree', `wall and car port') had to be derived. Two classification methods were applied (`Decision Tree' and `Support Vector Machine') using only two attributes (height above ground and normalized difference vegetation index) which both are derived from the images. The assessment of the thematic accuracy applied a stratified design and was based on accuracy measures such as user's and producer's accuracy, and kappa coefficient. In addition, confidence intervals were computed for several accuracy measures. The achieved accuracies and confidence intervals are thoroughly analysed and recommendations are derived from the gained experiences. Reliable reference values are obtained using stereovision, false-colour image pairs, and positioning to the checkpoints with 3D coordinates. The influence of the training areas on the results is studied. Cross validation has been tested with a few reference points in order to derive approximate accuracy measures. The two classification methods perform equally for five classes. Trees are classified with a much better accuracy and a smaller confidence interval by means of the decision tree method. Buildings are classified by both methods with an accuracy of 99% (95% CI: 95%-100%) using independent 3D checkpoints. The average width of the confidence interval of six classes was 14% of the user's accuracy.
Foliar and woody materials discriminated using terrestrial LiDAR in a mixed natural forest
NASA Astrophysics Data System (ADS)
Zhu, Xi; Skidmore, Andrew K.; Darvishzadeh, Roshanak; Niemann, K. Olaf; Liu, Jing; Shi, Yifang; Wang, Tiejun
2018-02-01
Separation of foliar and woody materials using remotely sensed data is crucial for the accurate estimation of leaf area index (LAI) and woody biomass across forest stands. In this paper, we present a new method to accurately separate foliar and woody materials using terrestrial LiDAR point clouds obtained from ten test sites in a mixed forest in Bavarian Forest National Park, Germany. Firstly, we applied and compared an adaptive radius near-neighbor search algorithm with a fixed radius near-neighbor search method in order to obtain both radiometric and geometric features derived from terrestrial LiDAR point clouds. Secondly, we used a random forest machine learning algorithm to classify foliar and woody materials and examined the impact of understory and slope on the classification accuracy. An average overall accuracy of 84.4% (Kappa = 0.75) was achieved across all experimental plots. The adaptive radius near-neighbor search method outperformed the fixed radius near-neighbor search method. The classification accuracy was significantly higher when the combination of both radiometric and geometric features was utilized. The analysis showed that increasing slope and understory coverage had a significant negative effect on the overall classification accuracy. Our results suggest that the utilization of the adaptive radius near-neighbor search method coupling both radiometric and geometric features has the potential to accurately discriminate foliar and woody materials from terrestrial LiDAR data in a mixed natural forest.
Automatic Fault Characterization via Abnormality-Enhanced Classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bronevetsky, G; Laguna, I; de Supinski, B R
Enterprise and high-performance computing systems are growing extremely large and complex, employing hundreds to hundreds of thousands of processors and software/hardware stacks built by many people across many organizations. As the growing scale of these machines increases the frequency of faults, system complexity makes these faults difficult to detect and to diagnose. Current system management techniques, which focus primarily on efficient data access and query mechanisms, require system administrators to examine the behavior of various system services manually. Growing system complexity is making this manual process unmanageable: administrators require more effective management tools that can detect faults and help tomore » identify their root causes. System administrators need timely notification when a fault is manifested that includes the type of fault, the time period in which it occurred and the processor on which it originated. Statistical modeling approaches can accurately characterize system behavior. However, the complex effects of system faults make these tools difficult to apply effectively. This paper investigates the application of classification and clustering algorithms to fault detection and characterization. We show experimentally that naively applying these methods achieves poor accuracy. Further, we design novel techniques that combine classification algorithms with information on the abnormality of application behavior to improve detection and characterization accuracy. Our experiments demonstrate that these techniques can detect and characterize faults with 65% accuracy, compared to just 5% accuracy for naive approaches.« less
Log Defect Recognition Using CT-images and Neural Net Classifiers
Daniel L. Schmoldt; Pei Li; A. Lynn Abbott
1995-01-01
Although several approaches have been introduced to automatically identify internal log defects using computed tomography (CT) imagery, most of these have been feasibility efforts and consequently have had several limitations: (1) reports of classification accuracy are largely subjective, not statistical, (2) there has been no attempt to achieve real-time operation,...
Zhou, Zhen; Wang, Jian-Bao; Zang, Yu-Feng; Pan, Gang
2018-01-01
Classification approaches have been increasingly applied to differentiate patients and normal controls using resting-state functional magnetic resonance imaging data (RS-fMRI). Although most previous classification studies have reported promising accuracy within individual datasets, achieving high levels of accuracy with multiple datasets remains challenging for two main reasons: high dimensionality, and high variability across subjects. We used two independent RS-fMRI datasets (n = 31, 46, respectively) both with eyes closed (EC) and eyes open (EO) conditions. For each dataset, we first reduced the number of features to a small number of brain regions with paired t-tests, using the amplitude of low frequency fluctuation (ALFF) as a metric. Second, we employed a new method for feature extraction, named the PAIR method, examining EC and EO as paired conditions rather than independent conditions. Specifically, for each dataset, we obtained EC minus EO (EC—EO) maps of ALFF from half of subjects (n = 15 for dataset-1, n = 23 for dataset-2) and obtained EO—EC maps from the other half (n = 16 for dataset-1, n = 23 for dataset-2). A support vector machine (SVM) method was used for classification of EC RS-fMRI mapping and EO mapping. The mean classification accuracy of the PAIR method was 91.40% for dataset-1, and 92.75% for dataset-2 in the conventional frequency band of 0.01–0.08 Hz. For cross-dataset validation, we applied the classifier from dataset-1 directly to dataset-2, and vice versa. The mean accuracy of cross-dataset validation was 94.93% for dataset-1 to dataset-2 and 90.32% for dataset-2 to dataset-1 in the 0.01–0.08 Hz range. For the UNPAIR method, classification accuracy was substantially lower (mean 69.89% for dataset-1 and 82.97% for dataset-2), and was much lower for cross-dataset validation (64.69% for dataset-1 to dataset-2 and 64.98% for dataset-2 to dataset-1) in the 0.01–0.08 Hz range. In conclusion, for within-group design studies (e.g., paired conditions or follow-up studies), we recommend the PAIR method for feature extraction. In addition, dimensionality reduction with strong prior knowledge of specific brain regions should also be considered for feature selection in neuroimaging studies. PMID:29375288
Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim
2015-07-30
Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.
A motion-classification strategy based on sEMG-EEG signal combination for upper-limb amputees.
Li, Xiangxin; Samuel, Oluwarotimi Williams; Zhang, Xu; Wang, Hui; Fang, Peng; Li, Guanglin
2017-01-07
Most of the modern motorized prostheses are controlled with the surface electromyography (sEMG) recorded on the residual muscles of amputated limbs. However, the residual muscles are usually limited, especially after above-elbow amputations, which would not provide enough sEMG for the control of prostheses with multiple degrees of freedom. Signal fusion is a possible approach to solve the problem of insufficient control commands, where some non-EMG signals are combined with sEMG signals to provide sufficient information for motion intension decoding. In this study, a motion-classification method that combines sEMG and electroencephalography (EEG) signals were proposed and investigated, in order to improve the control performance of upper-limb prostheses. Four transhumeral amputees without any form of neurological disease were recruited in the experiments. Five motion classes including hand-open, hand-close, wrist-pronation, wrist-supination, and no-movement were specified. During the motion performances, sEMG and EEG signals were simultaneously acquired from the skin surface and scalp of the amputees, respectively. The two types of signals were independently preprocessed and then combined as a parallel control input. Four time-domain features were extracted and fed into a classifier trained by the Linear Discriminant Analysis (LDA) algorithm for motion recognition. In addition, channel selections were performed by using the Sequential Forward Selection (SFS) algorithm to optimize the performance of the proposed method. The classification performance achieved by the fusion of sEMG and EEG signals was significantly better than that obtained by single signal source of either sEMG or EEG. An increment of more than 14% in classification accuracy was achieved when using a combination of 32-channel sEMG and 64-channel EEG. Furthermore, based on the SFS algorithm, two optimized electrode arrangements (10-channel sEMG + 10-channel EEG, 10-channel sEMG + 20-channel EEG) were obtained with classification accuracies of 84.2 and 87.0%, respectively, which were about 7.2 and 10% higher than the accuracy by using only 32-channel sEMG input. This study demonstrated the feasibility of fusing sEMG and EEG signals towards improving motion classification accuracy for above-elbow amputees, which might enhance the control performances of multifunctional myoelectric prostheses in clinical application. The study was approved by the ethics committee of Institutional Review Board of Shenzhen Institutes of Advanced Technology, and the reference number is SIAT-IRB-150515-H0077.
Gemignani, Jessica; Middell, Eike; Barbour, Randall L; Graber, Harry L; Blankertz, Benjamin
2018-04-04
The statistical analysis of functional near infrared spectroscopy (fNIRS) data based on the general linear model (GLM) is often made difficult by serial correlations, high inter-subject variability of the hemodynamic response, and the presence of motion artifacts. In this work we propose to extract information on the pattern of hemodynamic activations without using any a priori model for the data, by classifying the channels as 'active' or 'not active' with a multivariate classifier based on linear discriminant analysis (LDA). This work is developed in two steps. First we compared the performance of the two analyses, using a synthetic approach in which simulated hemodynamic activations were combined with either simulated or real resting-state fNIRS data. This procedure allowed for exact quantification of the classification accuracies of GLM and LDA. In the case of real resting-state data, the correlations between classification accuracy and demographic characteristics were investigated by means of a Linear Mixed Model. In the second step, to further characterize the reliability of the newly proposed analysis method, we conducted an experiment in which participants had to perform a simple motor task and data were analyzed with the LDA-based classifier as well as with the standard GLM analysis. The results of the simulation study show that the LDA-based method achieves higher classification accuracies than the GLM analysis, and that the LDA results are more uniform across different subjects and, in contrast to the accuracies achieved by the GLM analysis, have no significant correlations with any of the demographic characteristics. Findings from the real-data experiment are consistent with the results of the real-plus-simulation study, in that the GLM-analysis results show greater inter-subject variability than do the corresponding LDA results. The results obtained suggest that the outcome of GLM analysis is highly vulnerable to violations of theoretical assumptions, and that therefore a data-driven approach such as that provided by the proposed LDA-based method is to be favored.
Low-back electromyography (EMG) data-driven load classification for dynamic lifting tasks
Ojeda, Lauro; Johnson, Daniel D.; Gates, Deanna; Mower Provost, Emily; Barton, Kira
2018-01-01
Objective Numerous devices have been designed to support the back during lifting tasks. To improve the utility of such devices, this research explores the use of preparatory muscle activity to classify muscle loading and initiate appropriate device activation. The goal of this study was to determine the earliest time window that enabled accurate load classification during a dynamic lifting task. Methods Nine subjects performed thirty symmetrical lifts, split evenly across three weight conditions (no-weight, 10-lbs and 24-lbs), while low-back muscle activity data was collected. Seven descriptive statistics features were extracted from 100 ms windows of data. A multinomial logistic regression (MLR) classifier was trained and tested, employing leave-one subject out cross-validation, to classify lifted load values. Dimensionality reduction was achieved through feature cross-correlation analysis and greedy feedforward selection. The time of full load support by the subject was defined as load-onset. Results Regions of highest average classification accuracy started at 200 ms before until 200 ms after load-onset with average accuracies ranging from 80% (±10%) to 81% (±7%). The average recall for each class ranged from 69–92%. Conclusion These inter-subject classification results indicate that preparatory muscle activity can be leveraged to identify the intent to lift a weight up to 100 ms prior to load-onset. The high accuracies shown indicate the potential to utilize intent classification for assistive device applications. Significance Active assistive devices, e.g. exoskeletons, could prevent back injury by off-loading low-back muscles. Early intent classification allows more time for actuators to respond and integrate seamlessly with the user. PMID:29447252
Wishart Deep Stacking Network for Fast POLSAR Image Classification.
Jiao, Licheng; Liu, Fang
2016-05-11
Inspired by the popular deep learning architecture - Deep Stacking Network (DSN), a specific deep model for polarimetric synthetic aperture radar (POLSAR) image classification is proposed in this paper, which is named as Wishart Deep Stacking Network (W-DSN). First of all, a fast implementation of Wishart distance is achieved by a special linear transformation, which speeds up the classification of POLSAR image and makes it possible to use this polarimetric information in the following Neural Network (NN). Then a single-hidden-layer neural network based on the fast Wishart distance is defined for POLSAR image classification, which is named as Wishart Network (WN) and improves the classification accuracy. Finally, a multi-layer neural network is formed by stacking WNs, which is in fact the proposed deep learning architecture W-DSN for POLSAR image classification and improves the classification accuracy further. In addition, the structure of WN can be expanded in a straightforward way by adding hidden units if necessary, as well as the structure of the W-DSN. As a preliminary exploration on formulating specific deep learning architecture for POLSAR image classification, the proposed methods may establish a simple but clever connection between POLSAR image interpretation and deep learning. The experiment results tested on real POLSAR image show that the fast implementation of Wishart distance is very efficient (a POLSAR image with 768000 pixels can be classified in 0.53s), and both the single-hidden-layer architecture WN and the deep learning architecture W-DSN for POLSAR image classification perform well and work efficiently.
Li, Zhaohua; Wang, Yuduo; Quan, Wenxiang; Wu, Tongning; Lv, Bin
2015-02-15
Based on near-infrared spectroscopy (NIRS), recent converging evidence has been observed that patients with schizophrenia exhibit abnormal functional activities in the prefrontal cortex during a verbal fluency task (VFT). Therefore, some studies have attempted to employ NIRS measurements to differentiate schizophrenia patients from healthy controls with different classification methods. However, no systematic evaluation was conducted to compare their respective classification performances on the same study population. In this study, we evaluated the classification performance of four classification methods (including linear discriminant analysis, k-nearest neighbors, Gaussian process classifier, and support vector machines) on an NIRS-aided schizophrenia diagnosis. We recruited a large sample of 120 schizophrenia patients and 120 healthy controls and measured the hemoglobin response in the prefrontal cortex during the VFT using a multichannel NIRS system. Features for classification were extracted from three types of NIRS data in each channel. We subsequently performed a principal component analysis (PCA) for feature selection prior to comparison of the different classification methods. We achieved a maximum accuracy of 85.83% and an overall mean accuracy of 83.37% using a PCA-based feature selection on oxygenated hemoglobin signals and support vector machine classifier. This is the first comprehensive evaluation of different classification methods for the diagnosis of schizophrenia based on different types of NIRS signals. Our results suggested that, using the appropriate classification method, NIRS has the potential capacity to be an effective objective biomarker for the diagnosis of schizophrenia. Copyright © 2014 Elsevier B.V. All rights reserved.
Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment
Mukherjee, Rashmi; Manohar, Dhiraj Dhane; Das, Dev Kumar; Achar, Arun; Mitra, Analava; Chakraborty, Chandan
2014-01-01
The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images grabbed by normal digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the “S” component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793). PMID:25114925
Peng, Xiang; King, Irwin
2008-01-01
The Biased Minimax Probability Machine (BMPM) constructs a classifier which deals with the imbalanced learning tasks. It provides a worst-case bound on the probability of misclassification of future data points based on reliable estimates of means and covariance matrices of the classes from the training data samples, and achieves promising performance. In this paper, we develop a novel yet critical extension training algorithm for BMPM that is based on Second-Order Cone Programming (SOCP). Moreover, we apply the biased classification model to medical diagnosis problems to demonstrate its usefulness. By removing some crucial assumptions in the original solution to this model, we make the new method more accurate and robust. We outline the theoretical derivatives of the biased classification model, and reformulate it into an SOCP problem which could be efficiently solved with global optima guarantee. We evaluate our proposed SOCP-based BMPM (BMPMSOCP) scheme in comparison with traditional solutions on medical diagnosis tasks where the objectives are to focus on improving the sensitivity (the accuracy of the more important class, say "ill" samples) instead of the overall accuracy of the classification. Empirical results have shown that our method is more effective and robust to handle imbalanced classification problems than traditional classification approaches, and the original Fractional Programming-based BMPM (BMPMFP).
LDA boost classification: boosting by topics
NASA Astrophysics Data System (ADS)
Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li
2012-12-01
AdaBoost is an efficacious classification algorithm especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems; so it affects classification performance seriously. This article proposed a novel classification algorithm called LDABoost based on boosting ideology which uses Latent Dirichlet Allocation (LDA) to modeling the feature space. Instead of using words or phrase, LDABoost use latent topics as the features. In this way, the feature dimension is significantly reduced. Improved Naïve Bayes (NB) is designed as the weaker classifier which keeps the efficiency advantage of classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration in this article is proposed for improving the accuracy by integrating weak classifiers into strong classifier in a more rational way. Mutual Information is used as metrics of weights allocation. The voting information and the categorization decision made by basis classifiers are fully utilized for generating the strong classifier. Experimental results reveals LDABoost making categorization in a low-dimensional space, it has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than different versions of AdaBoost, TC algorithms based on support vector machine and Neural Networks.
Lewicke, Aaron; Sazonov, Edward; Corwin, Michael J; Neuman, Michael; Schuckers, Stephanie
2008-01-01
Reliability of classification performance is important for many biomedical applications. A classification model which considers reliability in the development of the model such that unreliable segments are rejected would be useful, particularly, in large biomedical data sets. This approach is demonstrated in the development of a technique to reliably determine sleep and wake using only the electrocardiogram (ECG) of infants. Typically, sleep state scoring is a time consuming task in which sleep states are manually derived from many physiological signals. The method was tested with simultaneous 8-h ECG and polysomnogram (PSG) determined sleep scores from 190 infants enrolled in the collaborative home infant monitoring evaluation (CHIME) study. Learning vector quantization (LVQ) neural network, multilayer perceptron (MLP) neural network, and support vector machines (SVMs) are tested as the classifiers. After systematic rejection of difficult to classify segments, the models can achieve 85%-87% correct classification while rejecting only 30% of the data. This corresponds to a Kappa statistic of 0.65-0.68. With rejection, accuracy improves by about 8% over a model without rejection. Additionally, the impact of the PSG scored indeterminate state epochs is analyzed. The advantages of a reliable sleep/wake classifier based only on ECG include high accuracy, simplicity of use, and low intrusiveness. Reliability of the classification can be built directly in the model, such that unreliable segments are rejected.
The influence of multispectral scanner spatial resolution on forest feature classification
NASA Technical Reports Server (NTRS)
Sadowski, F. G.; Malila, W. A.; Sarno, J. E.; Nalepka, R. F.
1977-01-01
Inappropriate spatial resolution and corresponding data processing techniques may be major causes for non-optimal forest classification results frequently achieved from multispectral scanner (MSS) data. Procedures and results of empirical investigations are studied to determine the influence of MSS spatial resolution on the classification of forest features into levels of detail or hierarchies of information that might be appropriate for nationwide forest surveys and detailed in-place inventories. Two somewhat different, but related studies are presented. The first consisted of establishing classification accuracies for several hierarchies of features as spatial resolution was progressively coarsened from (2 meters) squared to (64 meters) squared. The second investigated the capabilities for specialized processing techniques to improve upon the results of conventional processing procedures for both coarse and fine resolution data.
Xu, Xiayu; Ding, Wenxiang; Abràmoff, Michael D; Cao, Ruofan
2017-04-01
Retinal artery and vein classification is an important task for the automatic computer-aided diagnosis of various eye diseases and systemic diseases. This paper presents an improved supervised artery and vein classification method in retinal image. Intra-image regularization and inter-subject normalization is applied to reduce the differences in feature space. Novel features, including first-order and second-order texture features, are utilized to capture the discriminating characteristics of arteries and veins. The proposed method was tested on the DRIVE dataset and achieved an overall accuracy of 0.923. This retinal artery and vein classification algorithm serves as a potentially important tool for the early diagnosis of various diseases, including diabetic retinopathy and cardiovascular diseases. Copyright © 2017 Elsevier B.V. All rights reserved.
Variance approximations for assessments of classification accuracy
R. L. Czaplewski
1994-01-01
Variance approximations are derived for the weighted and unweighted kappa statistics, the conditional kappa statistic, and conditional probabilities. These statistics are useful to assess classification accuracy, such as accuracy of remotely sensed classifications in thematic maps when compared to a sample of reference classifications made in the field. Published...
Classification of Focal and Non Focal Epileptic Seizures Using Multi-Features and SVM Classifier.
Sriraam, N; Raghu, S
2017-09-02
Identifying epileptogenic zones prior to surgery is an essential and crucial step in treating patients having pharmacoresistant focal epilepsy. Electroencephalogram (EEG) is a significant measurement benchmark to assess patients suffering from epilepsy. This paper investigates the application of multi-features derived from different domains to recognize the focal and non focal epileptic seizures obtained from pharmacoresistant focal epilepsy patients from Bern Barcelona database. From the dataset, five different classification tasks were formed. Total 26 features were extracted from focal and non focal EEG. Significant features were selected using Wilcoxon rank sum test by setting p-value (p < 0.05) and z-score (-1.96 > z > 1.96) at 95% significance interval. Hypothesis was made that the effect of removing outliers improves the classification accuracy. Turkey's range test was adopted for pruning outliers from feature set. Finally, 21 features were classified using optimized support vector machine (SVM) classifier with 10-fold cross validation. Bayesian optimization technique was adopted to minimize the cross-validation loss. From the simulation results, it was inferred that the highest sensitivity, specificity, and classification accuracy of 94.56%, 89.74%, and 92.15% achieved respectively and found to be better than the state-of-the-art approaches. Further, it was observed that the classification accuracy improved from 80.2% with outliers to 92.15% without outliers. The classifier performance metrics ensures the suitability of the proposed multi-features with optimized SVM classifier. It can be concluded that the proposed approach can be applied for recognition of focal EEG signals to localize epileptogenic zones.
Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo
2017-01-01
Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. However, discrimination between visually healthy and infected bunches proved to be challenging for a few samples, perhaps due to colonized berries or sparse mycelia hidden within the bunch or airborne conidia on the berries that were detected by qPCR. An advanced approach to hyperspectral image classification based on combined spatial and spectral image features, potentially applicable to many available hyperspectral sensor technologies, has been developed and validated to improve the detection of powdery mildew infection levels of Chardonnay grape bunches. The spatial-spectral approach improved especially the detection of light infection levels compared with pixel-wise spectral data analysis. This approach is expected to improve the speed and accuracy of disease detection once the thresholds for fungal biomass detected by hyperspectral imaging are established; it can also facilitate monitoring in plant phenotyping of grapevine and additional crops.
Bøe, Tormod; Lundervold, Arvid
2017-01-01
Inattention in childhood is associated with academic problems later in life. The contribution of specific aspects of inattentive behaviour is, however, less known. We investigated feature importance of primary school teachers’ reports on nine aspects of inattentive behaviour, gender and age in predicting future academic achievement. Primary school teachers of n = 2491 children (7–9 years) rated nine items reflecting different aspects of inattentive behaviour in 2002. A mean academic achievement score from the previous semester in high school (2012) was available for each youth from an official school register. All scores were at a categorical level. Feature importances were assessed by using multinominal logistic regression, classification and regression trees analysis, and a random forest algorithm. Finally, a comprehensive pattern classification procedure using k-fold cross-validation was implemented. Overall, inattention was rated as more severe in boys, who also obtained lower academic achievement scores in high school than girls. Problems related to sustained attention and distractibility were together with age and gender defined as the most important features to predict future achievement scores. Using these four features as input to a collection of classifiers employing k-fold cross-validation for prediction of academic achievement level, we obtained classification accuracy, precision and recall that were clearly better than chance levels. Primary school teachers’ reports of problems related to sustained attention and distractibility were identified as the two most important features of inattentive behaviour predicting academic achievement in high school. Identification and follow-up procedures of primary school children showing these characteristics should be prioritised to prevent future academic failure. PMID:29182663
Automatic sub-pixel coastline extraction based on spectral mixture analysis using EO-1 Hyperion data
NASA Astrophysics Data System (ADS)
Hong, Zhonghua; Li, Xuesu; Han, Yanling; Zhang, Yun; Wang, Jing; Zhou, Ruyan; Hu, Kening
2018-06-01
Many megacities (such as Shanghai) are located in coastal areas, therefore, coastline monitoring is critical for urban security and urban development sustainability. A shoreline is defined as the intersection between coastal land and a water surface and features seawater edge movements as tides rise and fall. Remote sensing techniques have increasingly been used for coastline extraction; however, traditional hard classification methods are performed only at the pixel-level and extracting subpixel accuracy using soft classification methods is both challenging and time consuming due to the complex features in coastal regions. This paper presents an automatic sub-pixel coastline extraction method (ASPCE) from high-spectral satellite imaging that performs coastline extraction based on spectral mixture analysis and, thus, achieves higher accuracy. The ASPCE method consists of three main components: 1) A Water- Vegetation-Impervious-Soil (W-V-I-S) model is first presented to detect mixed W-V-I-S pixels and determine the endmember spectra in coastal regions; 2) The linear spectral mixture unmixing technique based on Fully Constrained Least Squares (FCLS) is applied to the mixed W-V-I-S pixels to estimate seawater abundance; and 3) The spatial attraction model is used to extract the coastline. We tested this new method using EO-1 images from three coastal regions in China: the South China Sea, the East China Sea, and the Bohai Sea. The results showed that the method is accurate and robust. Root mean square error (RMSE) was utilized to evaluate the accuracy by calculating the distance differences between the extracted coastline and the digitized coastline. The classifier's performance was compared with that of the Multiple Endmember Spectral Mixture Analysis (MESMA), Mixture Tuned Matched Filtering (MTMF), Sequential Maximum Angle Convex Cone (SMACC), Constrained Energy Minimization (CEM), and one classical Normalized Difference Water Index (NDWI). The results from the three test sites indicated that the proposed ASPCE method extracted coastlines more efficiently than did the compared methods, and its coastline extraction accuracy corresponded closely to the digitized coastline, with 0.39 pixels, 0.40 pixels, and 0.35 pixels in the three test regions, showing that the ASPCE method achieves an accuracy below 12.0 m (0.40 pixels). Moreover, in the quantitative accuracy assessment for the three test sites, the ASPCE method shows the best performance in coastline extraction, achieving a 0.35 pixel-level at the Bohai Sea, China test site. Therefore, the proposed ASPCE method can extract coastline more accurately than can the hard classification methods or other spectral unmixing methods.
Dai, Shengfa; Wei, Qingguo
2017-01-01
Common spatial pattern algorithm is widely used to estimate spatial filters in motor imagery based brain-computer interfaces. However, use of a large number of channels will make common spatial pattern tend to over-fitting and the classification of electroencephalographic signals time-consuming. To overcome these problems, it is necessary to choose an optimal subset of the whole channels to save computational time and improve the classification accuracy. In this paper, a novel method named backtracking search optimization algorithm is proposed to automatically select the optimal channel set for common spatial pattern. Each individual in the population is a N-dimensional vector, with each component representing one channel. A population of binary codes generate randomly in the beginning, and then channels are selected according to the evolution of these codes. The number and positions of 1's in the code denote the number and positions of chosen channels. The objective function of backtracking search optimization algorithm is defined as the combination of classification error rate and relative number of channels. Experimental results suggest that higher classification accuracy can be achieved with much fewer channels compared to standard common spatial pattern with whole channels.
Low-power wearable respiratory sound sensing.
Oletic, Dinko; Arsenali, Bruno; Bilas, Vedran
2014-04-09
Building upon the findings from the field of automated recognition of respiratory sound patterns, we propose a wearable wireless sensor implementing on-board respiratory sound acquisition and classification, to enable continuous monitoring of symptoms, such as asthmatic wheezing. Low-power consumption of such a sensor is required in order to achieve long autonomy. Considering that the power consumption of its radio is kept minimal if transmitting only upon (rare) occurrences of wheezing, we focus on optimizing the power consumption of the digital signal processor (DSP). Based on a comprehensive review of asthmatic wheeze detection algorithms, we analyze the computational complexity of common features drawn from short-time Fourier transform (STFT) and decision tree classification. Four algorithms were implemented on a low-power TMS320C5505 DSP. Their classification accuracies were evaluated on a dataset of prerecorded respiratory sounds in two operating scenarios of different detection fidelities. The execution times of all algorithms were measured. The best classification accuracy of over 92%, while occupying only 2.6% of the DSP's processing time, is obtained for the algorithm featuring the time-frequency tracking of shapes of crests originating from wheezing, with spectral features modeled using energy.
Ramsey, Elijah W.; Nelson, Gene A.; Sapkota, Sijan
1998-01-01
A progressive classification of a marsh and forest system using Landsat Thematic Mapper (TM), color infrared (CIR) photograph, and ERS-1 synthetic aperture radar (SAR) data improved classification accuracy when compared to classification using solely TM reflective band data. The classification resulted in a detailed identification of differences within a nearly monotypic black needlerush marsh. Accuracy percentages of these classes were surprisingly high given the complexities of classification. The detailed classification resulted in a more accurate portrayal of the marsh transgressive sequence than was obtainable with TM data alone. Individual sensor contribution to the improved classification was compared to that using only the six reflective TM bands. Individually, the green reflective CIR and SAR data identified broad categories of water, marsh, and forest. In combination with TM, SAR and the green CIR band each improved overall accuracy by about 3% and 15% respectively. The SAR data improved the TM classification accuracy mostly in the marsh classes. The green CIR data also improved the marsh classification accuracy and accuracies in some water classes. The final combination of all sensor data improved almost all class accuracies from 2% to 70% with an overall improvement of about 20% over TM data alone. Not only was the identification of vegetation types improved, but the spatial detail of the classification approached 10 m in some areas.
Identification of different bacterial species in biofilms using confocal Raman microscopy
NASA Astrophysics Data System (ADS)
Beier, Brooke D.; Quivey, Robert G.; Berger, Andrew J.
2010-11-01
Confocal Raman microspectroscopy is used to discriminate between different species of bacteria grown in biofilms. Tests are performed using two bacterial species, Streptococcus sanguinis and Streptococcus mutans, which are major components of oral plaque and of particular interest due to their association with healthy and cariogenic plaque, respectively. Dehydrated biofilms of these species are studied as a simplified model of dental plaque. A prediction model based on principal component analysis and logistic regression is calibrated using pure biofilms of each species and validated on pure biofilms grown months later, achieving 96% accuracy in prospective classification. When biofilms of the two species are partially mixed together, Raman-based identifications are achieved within ~2 μm of the boundaries between species with 97% accuracy. This combination of spatial resolution and predication accuracy should be suitable for forming images of species distributions within intact two-species biofilms.
Kwon, Yea-Hoon; Shin, Sae-Byuk; Kim, Shin-Dug
2018-04-30
The purpose of this study is to improve human emotional classification accuracy using a convolution neural networks (CNN) model and to suggest an overall method to classify emotion based on multimodal data. We improved classification performance by combining electroencephalogram (EEG) and galvanic skin response (GSR) signals. GSR signals are preprocessed using by the zero-crossing rate. Sufficient EEG feature extraction can be obtained through CNN. Therefore, we propose a suitable CNN model for feature extraction by tuning hyper parameters in convolution filters. The EEG signal is preprocessed prior to convolution by a wavelet transform while considering time and frequency simultaneously. We use a database for emotion analysis using the physiological signals open dataset to verify the proposed process, achieving 73.4% accuracy, showing significant performance improvement over the current best practice models.
A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance
2013-01-01
Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928
A neuro-fuzzy approach in the classification of students' academic performance.
Do, Quang Hung; Chen, Jeng-Fung
2013-01-01
Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions.
Seizure classification in EEG signals utilizing Hilbert-Huang transform
2011-01-01
Background Classification method capable of recognizing abnormal activities of the brain functionality are either brain imaging or brain signal analysis. The abnormal activity of interest in this study is characterized by a disturbance caused by changes in neuronal electrochemical activity that results in abnormal synchronous discharges. The method aims at helping physicians discriminate between healthy and seizure electroencephalographic (EEG) signals. Method Discrimination in this work is achieved by analyzing EEG signals obtained from freely accessible databases. MATLAB has been used to implement and test the proposed classification algorithm. The analysis in question presents a classification of normal and ictal activities using a feature relied on Hilbert-Huang Transform. Through this method, information related to the intrinsic functions contained in the EEG signal has been extracted to track the local amplitude and the frequency of the signal. Based on this local information, weighted frequencies are calculated and a comparison between ictal and seizure-free determinant intrinsic functions is then performed. Methods of comparison used are the t-test and the Euclidean clustering. Results The t-test results in a P-value < 0.02 and the clustering leads to accurate (94%) and specific (96%) results. The proposed method is also contrasted against the Multivariate Empirical Mode Decomposition that reaches 80% accuracy. Comparison results strengthen the contribution of this paper not only from the accuracy point of view but also with respect to its fast response and ease to use. Conclusion An original tool for EEG signal processing giving physicians the possibility to diagnose brain functionality abnormalities is presented in this paper. The proposed system bears the potential of providing several credible benefits such as fast diagnosis, high accuracy, good sensitivity and specificity, time saving and user friendly. Furthermore, the classification of mode mixing can be achieved using the extracted instantaneous information of every IMF, but it would be most likely a hard task if only the average value is used. Extra benefits of this proposed system include low cost, and ease of interface. All of that indicate the usefulness of the tool and its use as an efficient diagnostic tool. PMID:21609459
Seizure classification in EEG signals utilizing Hilbert-Huang transform.
Oweis, Rami J; Abdulhay, Enas W
2011-05-24
Classification method capable of recognizing abnormal activities of the brain functionality are either brain imaging or brain signal analysis. The abnormal activity of interest in this study is characterized by a disturbance caused by changes in neuronal electrochemical activity that results in abnormal synchronous discharges. The method aims at helping physicians discriminate between healthy and seizure electroencephalographic (EEG) signals. Discrimination in this work is achieved by analyzing EEG signals obtained from freely accessible databases. MATLAB has been used to implement and test the proposed classification algorithm. The analysis in question presents a classification of normal and ictal activities using a feature relied on Hilbert-Huang Transform. Through this method, information related to the intrinsic functions contained in the EEG signal has been extracted to track the local amplitude and the frequency of the signal. Based on this local information, weighted frequencies are calculated and a comparison between ictal and seizure-free determinant intrinsic functions is then performed. Methods of comparison used are the t-test and the Euclidean clustering. The t-test results in a P-value < 0.02 and the clustering leads to accurate (94%) and specific (96%) results. The proposed method is also contrasted against the Multivariate Empirical Mode Decomposition that reaches 80% accuracy. Comparison results strengthen the contribution of this paper not only from the accuracy point of view but also with respect to its fast response and ease to use. An original tool for EEG signal processing giving physicians the possibility to diagnose brain functionality abnormalities is presented in this paper. The proposed system bears the potential of providing several credible benefits such as fast diagnosis, high accuracy, good sensitivity and specificity, time saving and user friendly. Furthermore, the classification of mode mixing can be achieved using the extracted instantaneous information of every IMF, but it would be most likely a hard task if only the average value is used. Extra benefits of this proposed system include low cost, and ease of interface. All of that indicate the usefulness of the tool and its use as an efficient diagnostic tool.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
ERIC Educational Resources Information Center
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang
2015-01-01
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
A fuzzy neural network for intelligent data processing
NASA Astrophysics Data System (ADS)
Xie, Wei; Chu, Feng; Wang, Lipo; Lim, Eng Thiam
2005-03-01
In this paper, we describe an incrementally generated fuzzy neural network (FNN) for intelligent data processing. This FNN combines the features of initial fuzzy model self-generation, fast input selection, partition validation, parameter optimization and rule-base simplification. A small FNN is created from scratch -- there is no need to specify the initial network architecture, initial membership functions, or initial weights. Fuzzy IF-THEN rules are constantly combined and pruned to minimize the size of the network while maintaining accuracy; irrelevant inputs are detected and deleted, and membership functions and network weights are trained with a gradient descent algorithm, i.e., error backpropagation. Experimental studies on synthesized data sets demonstrate that the proposed Fuzzy Neural Network is able to achieve accuracy comparable to or higher than both a feedforward crisp neural network, i.e., NeuroRule, and a decision tree, i.e., C4.5, with more compact rule bases for most of the data sets used in our experiments. The FNN has achieved outstanding results for cancer classification based on microarray data. The excellent classification result for Small Round Blue Cell Tumors (SRBCTs) data set is shown. Compared with other published methods, we have used a much fewer number of genes for perfect classification, which will help researchers directly focus their attention on some specific genes and may lead to discovery of deep reasons of the development of cancers and discovery of drugs.
Dimitriadis, Stavros I; Salis, Christos; Linden, David
2018-04-01
Limitations of the manual scoring of polysomnograms, which include data from electroencephalogram (EEG), electro-oculogram (EOG), electrocardiogram (ECG) and electromyogram (EMG) channels have long been recognized. Manual staging is resource intensive and time consuming, and thus considerable effort must be spent to ensure inter-rater reliability. As a result, there is a great interest in techniques based on signal processing and machine learning for a completely Automatic Sleep Stage Classification (ASSC). In this paper, we present a single-EEG-sensor ASSC technique based on the dynamic reconfiguration of different aspects of cross-frequency coupling (CFC) estimated between predefined frequency pairs over 5 s epoch lengths. The proposed analytic scheme is demonstrated using the PhysioNet Sleep European Data Format (EDF) Database with repeat recordings from 20 healthy young adults. We validate our methodology in a second sleep dataset. We achieved very high classification sensitivity, specificity and accuracy of 96.2 ± 2.2%, 94.2 ± 2.3%, and 94.4 ± 2.2% across 20 folds, respectively, and also a high mean F1 score (92%, range 90-94%) when a multi-class Naive Bayes classifier was applied. High classification performance has been achieved also in the second sleep dataset. Our method outperformed the accuracy of previous studies not only on different datasets but also on the same database. Single-sensor ASSC makes the entire methodology appropriate for longitudinal monitoring using wearable EEG in real-world and laboratory-oriented environments. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Mapping forest tree species over large areas with partially cloudy Landsat imagery
NASA Astrophysics Data System (ADS)
Turlej, K.; Radeloff, V.
2017-12-01
Forests provide numerous services to natural systems and humankind, but which services forest provide depends greatly on their tree species composition. That makes it important to track not only changes in forest extent, something that remote sensing excels in, but also to map tree species. The main goal of our work was to map tree species with Landsat imagery, and to identify how to maximize mapping accuracy by including partially cloudy imagery. Our study area covered one Landsat footprint (26/28) in Northern Wisconsin, USA, with temperate and boreal forests. We selected this area because it contains numerous tree species and variable forest composition providing an ideal study area to test the limits of Landsat data. We quantified how species-level classification accuracy was affected by a) the number of acquisitions, b) the seasonal distribution of observations, and c) the amount of cloud contamination. We classified a single year stack of Landsat-7, and -8 images data with a decision tree algorithm to generate a map of dominant tree species at the pixel- and stand-level. We obtained three important results. First, we achieved producer's accuracies in the range 70-80% and user's accuracies in range 80-90% for the most abundant tree species in our study area. Second, classification accuracy improved with more acquisitions, when observations were available from all seasons, and is the best when images with up to 40% cloud cover are included. Finally, classifications for pure stands were 10 to 30 percentage points better than those for mixed stands. We conclude that including partially cloudy Landsat imagery allows to map forest tree species with accuracies that were previously only possible for rare years with many cloud-free observations. Our approach thus provides important information for both forest management and science.
Accuracy of automated classification of major depressive disorder as a function of symptom severity.
Ramasubbu, Rajamannar; Brown, Matthew R G; Cortese, Filmeno; Gaxiola, Ismael; Goodyear, Bradley; Greenshaw, Andrew J; Dursun, Serdar M; Greiner, Russell
2016-01-01
Growing evidence documents the potential of machine learning for developing brain based diagnostic methods for major depressive disorder (MDD). As symptom severity may influence brain activity, we investigated whether the severity of MDD affected the accuracies of machine learned MDD-vs-Control diagnostic classifiers. Forty-five medication-free patients with DSM-IV defined MDD and 19 healthy controls participated in the study. Based on depression severity as determined by the Hamilton Rating Scale for Depression (HRSD), MDD patients were sorted into three groups: mild to moderate depression (HRSD 14-19), severe depression (HRSD 20-23), and very severe depression (HRSD ≥ 24). We collected functional magnetic resonance imaging (fMRI) data during both resting-state and an emotional-face matching task. Patients in each of the three severity groups were compared against controls in separate analyses, using either the resting-state or task-based fMRI data. We use each of these six datasets with linear support vector machine (SVM) binary classifiers for identifying individuals as patients or controls. The resting-state fMRI data showed statistically significant classification accuracy only for the very severe depression group (accuracy 66%, p = 0.012 corrected), while mild to moderate (accuracy 58%, p = 1.0 corrected) and severe depression (accuracy 52%, p = 1.0 corrected) were only at chance. With task-based fMRI data, the automated classifier performed at chance in all three severity groups. Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.
Mehrang, Saeed; Pietilä, Julia; Korhonen, Ilkka
2018-02-22
Wrist-worn sensors have better compliance for activity monitoring compared to hip, waist, ankle or chest positions. However, wrist-worn activity monitoring is challenging due to the wide degree of freedom for the hand movements, as well as similarity of hand movements in different activities such as varying intensities of cycling. To strengthen the ability of wrist-worn sensors in detecting human activities more accurately, motion signals can be complemented by physiological signals such as optical heart rate (HR) based on photoplethysmography. In this paper, an activity monitoring framework using an optical HR sensor and a triaxial wrist-worn accelerometer is presented. We investigated a range of daily life activities including sitting, standing, household activities and stationary cycling with two intensities. A random forest (RF) classifier was exploited to detect these activities based on the wrist motions and optical HR. The highest overall accuracy of 89.6 ± 3.9% was achieved with a forest of a size of 64 trees and 13-s signal segments with 90% overlap. Removing the HR-derived features decreased the classification accuracy of high-intensity cycling by almost 7%, but did not affect the classification accuracies of other activities. A feature reduction utilizing the feature importance scores of RF was also carried out and resulted in a shrunken feature set of only 21 features. The overall accuracy of the classification utilizing the shrunken feature set was 89.4 ± 4.2%, which is almost equivalent to the above-mentioned peak overall accuracy.
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-11-22
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k -nearest neighbor ( k -NN) method produced the greatest accuracy. A geostatistically-weighted k -NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer's and user's accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas.
AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.
Yao, Hantao; Zhang, Shiliang; Yan, Chenggang; Zhang, Yongdong; Li, Jintao; Tian, Qi
Compared with traditional image classification, fine-grained visual categorization is a more challenging task, because it targets to classify objects belonging to the same species, e.g. , classify hundreds of birds or cars. In the past several years, researchers have made many achievements on this topic. However, most of them are heavily dependent on the artificial annotations, e.g., bounding boxes, part annotations, and so on . The requirement of artificial annotations largely hinders the scalability and application. Motivated to release such dependence, this paper proposes a robust and discriminative visual description named Automated Bi-level Description (AutoBD). "Bi-level" denotes two complementary part-level and object-level visual descriptions, respectively. AutoBD is "automated," because it only requires the image-level labels of training images and does not need any annotations for testing images. Compared with the part annotations labeled by the human, the image-level labels can be easily acquired, which thus makes AutoBD suitable for large-scale visual categorization. Specifically, the part-level description is extracted by identifying the local region saliently representing the visual distinctiveness. The object-level description is extracted from object bounding boxes generated with a co-localization algorithm. Although only using the image-level labels, AutoBD outperforms the recent studies on two public benchmark, i.e. , classification accuracy achieves 81.6% on CUB-200-2011 and 88.9% on Car-196, respectively. On the large-scale Birdsnap data set, AutoBD achieves the accuracy of 68%, which is currently the best performance to the best of our knowledge.Compared with traditional image classification, fine-grained visual categorization is a more challenging task, because it targets to classify objects belonging to the same species, e.g. , classify hundreds of birds or cars. In the past several years, researchers have made many achievements on this topic. However, most of them are heavily dependent on the artificial annotations, e.g., bounding boxes, part annotations, and so on . The requirement of artificial annotations largely hinders the scalability and application. Motivated to release such dependence, this paper proposes a robust and discriminative visual description named Automated Bi-level Description (AutoBD). "Bi-level" denotes two complementary part-level and object-level visual descriptions, respectively. AutoBD is "automated," because it only requires the image-level labels of training images and does not need any annotations for testing images. Compared with the part annotations labeled by the human, the image-level labels can be easily acquired, which thus makes AutoBD suitable for large-scale visual categorization. Specifically, the part-level description is extracted by identifying the local region saliently representing the visual distinctiveness. The object-level description is extracted from object bounding boxes generated with a co-localization algorithm. Although only using the image-level labels, AutoBD outperforms the recent studies on two public benchmark, i.e. , classification accuracy achieves 81.6% on CUB-200-2011 and 88.9% on Car-196, respectively. On the large-scale Birdsnap data set, AutoBD achieves the accuracy of 68%, which is currently the best performance to the best of our knowledge.
Fang, Chen; Li, Chunfei; Cabrerizo, Mercedes; Barreto, Armando; Andrian, Jean; Rishe, Naphtali; Loewenstein, David; Duara, Ranjan; Adjouadi, Malek
2018-04-12
Over the past few years, several approaches have been proposed to assist in the early diagnosis of Alzheimer's disease (AD) and its prodromal stage of mild cognitive impairment (MCI). Using multimodal biomarkers for this high-dimensional classification problem, the widely used algorithms include Support Vector Machines (SVM), Sparse Representation-based classification (SRC), Deep Belief Networks (DBN) and Random Forest (RF). These widely used algorithms continue to yield unsatisfactory performance for delineating the MCI participants from the cognitively normal control (CN) group. A novel Gaussian discriminant analysis-based algorithm is thus introduced to achieve a more effective and accurate classification performance than the aforementioned state-of-the-art algorithms. This study makes use of magnetic resonance imaging (MRI) data uniquely as input to two separate high-dimensional decision spaces that reflect the structural measures of the two brain hemispheres. The data used include 190 CN, 305 MCI and 133 AD subjects as part of the AD Big Data DREAM Challenge #1. Using 80% data for a 10-fold cross-validation, the proposed algorithm achieved an average F1 score of 95.89% and an accuracy of 96.54% for discriminating AD from CN; and more importantly, an average F1 score of 92.08% and an accuracy of 90.26% for discriminating MCI from CN. Then, a true test was implemented on the remaining 20% held-out test data. For discriminating MCI from CN, an accuracy of 80.61%, a sensitivity of 81.97% and a specificity of 78.38% were obtained. These results show significant improvement over existing algorithms for discriminating the subtle differences between MCI participants and the CN group.
NASA Astrophysics Data System (ADS)
Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying
2018-06-01
In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
[Accuracy improvement of spectral classification of crop using microwave backscatter data].
Jia, Kun; Li, Qiang-Zi; Tian, Yi-Chen; Wu, Bing-Fang; Zhang, Fei-Fei; Meng, Ji-Hua
2011-02-01
In the present study, VV polarization microwave backscatter data used for improving accuracies of spectral classification of crop is investigated. Classification accuracy using different classifiers based on the fusion data of HJ satellite multi-spectral and Envisat ASAR VV backscatter data are compared. The results indicate that fusion data can take full advantage of spectral information of HJ multi-spectral data and the structure sensitivity feature of ASAR VV polarization data. The fusion data enlarges the spectral difference among different classifications and improves crop classification accuracy. The classification accuracy using fusion data can be increased by 5 percent compared to the single HJ data. Furthermore, ASAR VV polarization data is sensitive to non-agrarian area of planted field, and VV polarization data joined classification can effectively distinguish the field border. VV polarization data associating with multi-spectral data used in crop classification enlarges the application of satellite data and has the potential of spread in the domain of agriculture.
Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston
2014-11-01
Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.
NASA Astrophysics Data System (ADS)
Chan, Heang-Ping; Helvie, Mark A.; Petrick, Nicholas; Sahiner, Berkman; Adler, Dorit D.; Blane, Caroline E.; Joynt, Lynn K.; Paramagul, Chintana; Roubidoux, Marilyn A.; Wilson, Todd E.; Hadjiiski, Lubomir M.; Goodsitt, Mitchell M.
1999-05-01
A receiver operating characteristic (ROC) experiment was conducted to evaluate the effects of pixel size on the characterization of mammographic microcalcifications. Digital mammograms were obtained by digitizing screen-film mammograms with a laser film scanner. One hundred twelve two-view mammograms with biopsy-proven microcalcifications were digitized at a pixel size of 35 micrometer X 35 micrometer. A region of interest (ROI) containing the microcalcifications was extracted from each image. ROI images with pixel sizes of 70 micrometers, 105 micrometers, and 140 micrometers were derived from the ROI of 35 micrometer pixel size by averaging 2 X 2, 3 X 3, and 4 X 4 neighboring pixels, respectively. The ROI images were printed on film with a laser imager. Seven MQSA-approved radiologists participated as observers. The likelihood of malignancy of the microcalcifications was rated on a 10-point confidence rating scale and analyzed with ROC methodology. The classification accuracy was quantified by the area, Az, under the ROC curve. The statistical significance of the differences in the Az values for different pixel sizes was estimated with the Dorfman-Berbaum-Metz (DBM) method for multi-reader, multi-case ROC data. It was found that five of the seven radiologists demonstrated a higher classification accuracy with the 70 micrometer or 105 micrometer images. The average Az also showed a higher classification accuracy in the range of 70 to 105 micrometer pixel size. However, the differences in A(subscript z/ between different pixel sizes did not achieve statistical significance. The low specificity of image features of microcalcifications an the large interobserver and intraobserver variabilities may have contributed to the relatively weak dependence of classification accuracy on pixel size.
Graph pyramids for protein function prediction
2015-01-01
Background Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Methods Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Results Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well the protein classification accuracy. Quantitatively, among 14,086 test sequences, on an average the proposed method misclassified only 21.1 sequences whereas baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data. PMID:26044522
Graph pyramids for protein function prediction.
Sandhan, Tushar; Yoo, Youngjun; Choi, Jin; Kim, Sun
2015-01-01
Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well the protein classification accuracy. Quantitatively, among 14,086 test sequences, on an average the proposed method misclassified only 21.1 sequences whereas baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data.
Thanh Noi, Phan; Kappas, Martin
2017-01-01
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909
Thanh Noi, Phan; Kappas, Martin
2017-12-22
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.
NASA Astrophysics Data System (ADS)
Legara, Erika Fille; Monterola, Christopher; Abundo, Cheryl
2011-01-01
We demonstrate an accurate procedure based on linear discriminant analysis that allows automatic authorship classification of opinion column articles. First, we extract the following stylometric features of 157 column articles from four authors: statistics on high frequency words, number of words per sentence, and number of sentences per paragraph. Then, by systematically ranking these features based on an effect size criterion, we show that we can achieve an average classification accuracy of 93% for the test set. In comparison, frequency size based ranking has an average accuracy of 80%. The highest possible average classification accuracy of our data merely relying on chance is ∼31%. By carrying out sensitivity analysis, we show that the effect size criterion is superior than frequency ranking because there exist low frequency words that significantly contribute to successful author discrimination. Consistent results are seen when the procedure is applied in classifying the undisputed Federalist papers of Alexander Hamilton and James Madison. To the best of our knowledge, the work is the first attempt in classifying opinion column articles, that by virtue of being shorter in length (as compared to novels or short stories), are more prone to over-fitting issues. The near perfect classification for the longer papers supports this claim. Our results provide an important insight on authorship attribution that has been overlooked in previous studies: that ranking discriminant variables based on word frequency counts is not necessarily an optimal procedure.
Du, Hua Qiang; Sun, Xiao Yan; Han, Ning; Mao, Fang Jie
2017-10-01
By synergistically using the object-based image analysis (OBIA) and the classification and regression tree (CART) methods, the distribution information, the indexes (including diameter at breast, tree height, and crown closure), and the aboveground carbon storage (AGC) of moso bamboo forest in Shanchuan Town, Anji County, Zhejiang Province were investigated. The results showed that the moso bamboo forest could be accurately delineated by integrating the multi-scale ima ge segmentation in OBIA technique and CART, which connected the image objects at various scales, with a pretty good producer's accuracy of 89.1%. The investigation of indexes estimated by regression tree model that was constructed based on the features extracted from the image objects reached normal or better accuracy, in which the crown closure model archived the best estimating accuracy of 67.9%. The estimating accuracy of diameter at breast and tree height was relatively low, which was consistent with conclusion that estimating diameter at breast and tree height using optical remote sensing could not achieve satisfactory results. Estimation of AGC reached relatively high accuracy, and accuracy of the region of high value achieved above 80%.
Object oriented classification of high resolution data for inventory of horticultural crops
NASA Astrophysics Data System (ADS)
Hebbar, R.; Ravishankar, H. M.; Trivedi, S.; Subramoniam, S. R.; Uday, R.; Dadhwal, V. K.
2014-11-01
High resolution satellite images are associated with large variance and thus, per pixel classifiers often result in poor accuracy especially in delineation of horticultural crops. In this context, object oriented techniques are powerful and promising methods for classification. In the present study, a semi-automatic object oriented feature extraction model has been used for delineation of horticultural fruit and plantation crops using Erdas Objective Imagine. Multi-resolution data from Resourcesat LISS-IV and Cartosat-1 have been used as source data in the feature extraction model. Spectral and textural information along with NDVI were used as inputs for generation of Spectral Feature Probability (SFP) layers using sample training pixels. The SFP layers were then converted into raster objects using threshold and clump function resulting in pixel probability layer. A set of raster and vector operators was employed in the subsequent steps for generating thematic layer in the vector format. This semi-automatic feature extraction model was employed for classification of major fruit and plantations crops viz., mango, banana, citrus, coffee and coconut grown under different agro-climatic conditions. In general, the classification accuracy of about 75-80 per cent was achieved for these crops using object based classification alone and the same was further improved using minimal visual editing of misclassified areas. A comparison of on-screen visual interpretation with object oriented approach showed good agreement. It was observed that old and mature plantations were classified more accurately while young and recently planted ones (3 years or less) showed poor classification accuracy due to mixed spectral signature, wider spacing and poor stands of plantations. The results indicated the potential use of object oriented approach for classification of high resolution data for delineation of horticultural fruit and plantation crops. The present methodology is applicable at local levels and future development is focused on up-scaling the methodology for generation of fruit and plantation crop maps at regional and national level which is important for creation of database for overall horticultural crop development.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunton, Steven
Optical systems provide valuable information for evaluating interactions and associations between organisms and MHK energy converters and for capturing potentially rare encounters between marine organisms and MHK device. The deluge of optical data from cabled monitoring packages makes expert review time-consuming and expensive. We propose algorithms and a processing framework to automatically extract events of interest from underwater video. The open-source software framework consists of background subtraction, filtering, feature extraction and hierarchical classification algorithms. This principle classification pipeline was validated on real-world data collected with an experimental underwater monitoring package. An event detection rate of 100% was achieved using robustmore » principal components analysis (RPCA), Fourier feature extraction and a support vector machine (SVM) binary classifier. The detected events were then further classified into more complex classes – algae | invertebrate | vertebrate, one species | multiple species of fish, and interest rank. Greater than 80% accuracy was achieved using a combination of machine learning techniques.« less
Karuppiah Ramachandran, Vignesh Raja; Alblas, Huibert J; Le, Duc V; Meratnia, Nirvana
2018-05-24
In the last decade, seizure prediction systems have gained a lot of attention because of their enormous potential to largely improve the quality-of-life of the epileptic patients. The accuracy of the prediction algorithms to detect seizure in real-world applications is largely limited because the brain signals are inherently uncertain and affected by various factors, such as environment, age, drug intake, etc., in addition to the internal artefacts that occur during the process of recording the brain signals. To deal with such ambiguity, researchers transitionally use active learning, which selects the ambiguous data to be annotated by an expert and updates the classification model dynamically. However, selecting the particular data from a pool of large ambiguous datasets to be labelled by an expert is still a challenging problem. In this paper, we propose an active learning-based prediction framework that aims to improve the accuracy of the prediction with a minimum number of labelled data. The core technique of our framework is employing the Bernoulli-Gaussian Mixture model (BGMM) to determine the feature samples that have the most ambiguity to be annotated by an expert. By doing so, our approach facilitates expert intervention as well as increasing medical reliability. We evaluate seven different classifiers in terms of the classification time and memory required. An active learning framework built on top of the best performing classifier is evaluated in terms of required annotation effort to achieve a high level of prediction accuracy. The results show that our approach can achieve the same accuracy as a Support Vector Machine (SVM) classifier using only 20 % of the labelled data and also improve the prediction accuracy even under the noisy condition.
NASA Technical Reports Server (NTRS)
Justice, C.; Townshend, J. (Principal Investigator)
1981-01-01
Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. By on procedure the entire area is clustered and by the other a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation is provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data and the monocluster block approach gave higher accuracies than clustering the entire area. The moncluster block approach was additionally the most economical in terms of computing time.
Detection of epileptic seizure in EEG signals using linear least squares preprocessing.
Roshan Zamir, Z
2016-09-01
An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be obstructed by detection of electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been developed based on artificial neural network, genetic programming, and wavelet transforms. Although the highest achieved accuracy for classification is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigations in performances consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with the classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. Numerical results suggest that these models are robust and efficient for detecting epileptic seizure. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Yang, Hao; Zhang, Junran; Jiang, Xiaomei; Liu, Fei
2018-04-01
In recent years, with the rapid development of machine learning techniques,the deep learning algorithm has been widely used in one-dimensional physiological signal processing. In this paper we used electroencephalography (EEG) signals based on deep belief network (DBN) model in open source frameworks of deep learning to identify emotional state (positive, negative and neutrals), then the results of DBN were compared with support vector machine (SVM). The EEG signals were collected from the subjects who were under different emotional stimuli, and DBN and SVM were adopted to identify the EEG signals with changes of different characteristics and different frequency bands. We found that the average accuracy of differential entropy (DE) feature by DBN is 89.12%±6.54%, which has a better performance than previous research based on the same data set. At the same time, the classification effects of DBN are better than the results from traditional SVM (the average classification accuracy of 84.2%±9.24%) and its accuracy and stability have a better trend. In three experiments with different time points, single subject can achieve the consistent results of classification by using DBN (the mean standard deviation is1.44%), and the experimental results show that the system has steady performance and good repeatability. According to our research, the characteristic of DE has a better classification result than other characteristics. Furthermore, the Beta band and the Gamma band in the emotional recognition model have higher classification accuracy. To sum up, the performances of classifiers have a promotion by using the deep learning algorithm, which has a reference for establishing a more accurate system of emotional recognition. Meanwhile, we can trace through the results of recognition to find out the brain regions and frequency band that are related to the emotions, which can help us to understand the emotional mechanism better. This study has a high academic value and practical significance, so further investigation still needs to be done.
Lee, David; Park, Sang-Hoon; Lee, Sang-Goog
2017-10-07
In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain-computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation-maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
Zheng, Haiyong; Wang, Ruchen; Yu, Zhibin; Wang, Nan; Gu, Zhaorui; Zheng, Bing
2017-12-28
Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of marine food chain. As the fundamental components of marine ecosystems, plankton is very sensitive to environment changes, and the study of plankton abundance and distribution is crucial, in order to understand environment changes and protect marine ecosystems. This study was carried out to develop an extensive applicable plankton classification system with high accuracy for the increasing number of various imaging devices. Literature shows that most plankton image classification systems were limited to only one specific imaging device and a relatively narrow taxonomic scope. The real practical system for automatic plankton classification is even non-existent and this study is partly to fill this gap. Inspired by the analysis of literature and development of technology, we focused on the requirements of practical application and proposed an automatic system for plankton image classification combining multiple view features via multiple kernel learning (MKL). For one thing, in order to describe the biomorphic characteristics of plankton more completely and comprehensively, we combined general features with robust features, especially by adding features like Inner-Distance Shape Context for morphological representation. For another, we divided all the features into different types from multiple views and feed them to multiple classifiers instead of only one by combining different kernel matrices computed from different types of features optimally via multiple kernel learning. Moreover, we also applied feature selection method to choose the optimal feature subsets from redundant features for satisfying different datasets from different imaging devices. We implemented our proposed classification system on three different datasets across more than 20 categories from phytoplankton to zooplankton. The experimental results validated that our system outperforms state-of-the-art plankton image classification systems in terms of accuracy and robustness. This study demonstrated automatic plankton image classification system combining multiple view features using multiple kernel learning. The results indicated that multiple view features combined by NLMKL using three kernel functions (linear, polynomial and Gaussian kernel functions) can describe and use information of features better so that achieve a higher classification accuracy.
A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification
Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong
2016-01-01
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs). PMID:26985826
A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification.
Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong
2016-01-01
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).
NASA Astrophysics Data System (ADS)
Pilarska, M.
2018-05-01
Airborne laser scanning (ALS) is a well-known and willingly used technology. One of the advantages of this technology is primarily its fast and accurate data registration. In recent years ALS is continuously developed. One of the latest achievements is multispectral ALS, which consists in obtaining simultaneously the data in more than one laser wavelength. In this article the results of the dual-wavelength ALS data classification are presented. The data were acquired with RIEGL VQ-1560i sensor, which is equipped with two laser scanners operating in different wavelengths: 532 nm and 1064 nm. Two classification approaches are presented in the article: classification, which is based on geometric relationships between points and classification, which mostly relies on the radiometric properties of registered objects. The overall accuracy of the geometric classification was 86 %, whereas for the radiometric classification it was 81 %. As a result, it can be assumed that the radiometric features which are provided by the multispectral ALS have potential to be successfully used in ALS point cloud classification.
a Novel Framework for Remote Sensing Image Scene Classification
NASA Astrophysics Data System (ADS)
Jiang, S.; Zhao, H.; Wu, W.; Tan, Q.
2018-04-01
High resolution remote sensing (HRRS) images scene classification aims to label an image with a specific semantic category. HRRS images contain more details of the ground objects and their spatial distribution patterns than low spatial resolution images. Scene classification can bridge the gap between low-level features and high-level semantics. It can be applied in urban planning, target detection and other fields. This paper proposes a novel framework for HRRS images scene classification. This framework combines the convolutional neural network (CNN) and XGBoost, which utilizes CNN as feature extractor and XGBoost as a classifier. Then, this framework is evaluated on two different HRRS images datasets: UC-Merced dataset and NWPU-RESISC45 dataset. Our framework achieved satisfying accuracies on two datasets, which is 95.57 % and 83.35 % respectively. From the experiments result, our framework has been proven to be effective for remote sensing images classification. Furthermore, we believe this framework will be more practical for further HRRS scene classification, since it costs less time on training stage.
NASA Astrophysics Data System (ADS)
Hancock, Matthew C.; Magnan, Jerry F.
2017-03-01
To determine the potential usefulness of quantified diagnostic image features as inputs to a CAD system, we investigate the predictive capabilities of statistical learning methods for classifying nodule malignancy, utilizing the Lung Image Database Consortium (LIDC) dataset, and only employ the radiologist-assigned diagnostic feature values for the lung nodules therein, as well as our derived estimates of the diameter and volume of the nodules from the radiologists' annotations. We calculate theoretical upper bounds on the classification accuracy that is achievable by an ideal classifier that only uses the radiologist-assigned feature values, and we obtain an accuracy of 85.74 (+/-1.14)% which is, on average, 4.43% below the theoretical maximum of 90.17%. The corresponding area-under-the-curve (AUC) score is 0.932 (+/-0.012), which increases to 0.949 (+/-0.007) when diameter and volume features are included, along with the accuracy to 88.08 (+/-1.11)%. Our results are comparable to those in the literature that use algorithmically-derived image-based features, which supports our hypothesis that lung nodules can be classified as malignant or benign using only quantified, diagnostic image features, and indicates the competitiveness of this approach. We also analyze how the classification accuracy depends on specific features, and feature subsets, and we rank the features according to their predictive power, statistically demonstrating the top four to be spiculation, lobulation, subtlety, and calcification.
Land cover classification of VHR airborne images for citrus grove identification
NASA Astrophysics Data System (ADS)
Amorós López, J.; Izquierdo Verdiguier, E.; Gómez Chova, L.; Muñoz Marí, J.; Rodríguez Barreiro, J. Z.; Camps Valls, G.; Calpe Maravilla, J.
Managing land resources using remote sensing techniques is becoming a common practice. However, data analysis procedures should satisfy the high accuracy levels demanded by users (public or private companies and governments) in order to be extensively used. This paper presents a multi-stage classification scheme to update the citrus Geographical Information System (GIS) of the Comunidad Valenciana region (Spain). Spain is the first citrus fruit producer in Europe and the fourth in the world. In particular, citrus fruits represent 67% of the agricultural production in this region, with a total production of 4.24 million tons (campaign 2006-2007). The citrus GIS inventory, created in 2001, needs to be regularly updated in order to monitor changes quickly enough, and allow appropriate policy making and citrus production forecasting. Automatic methods are proposed in this work to facilitate this update, whose processing scheme is summarized as follows. First, an object-oriented feature extraction process is carried out for each cadastral parcel from very high spatial resolution aerial images (0.5 m). Next, several automatic classifiers (decision trees, artificial neural networks, and support vector machines) are trained and combined to improve the final classification accuracy. Finally, the citrus GIS is automatically updated if a high enough level of confidence, based on the agreement between classifiers, is achieved. This is the case for 85% of the parcels and accuracy results exceed 94%. The remaining parcels are classified by expert photo-interpreters in order to guarantee the high accuracy demanded by policy makers.
Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?
NASA Astrophysics Data System (ADS)
Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof
2016-10-01
It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempts to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability definition of segmentation evaluation metrics as indicators of land cover classification accuracy was based on the linear correlation between the two. All unsupervised metrics that are not based on number of segments have a very strong correlation with all classification measures and are therefore reliable as indicators of land cover classification accuracy. On the other hand, correlation at supervised metrics is dependent on so many factors that it cannot be trusted as a reliable classification quality indicator. Algorithms for land cover classification studied in this paper are widely used; therefore, presented results are applicable to a wider area.
NASA Astrophysics Data System (ADS)
Jing, Ya-Bing; Liu, Chang-Wen; Bi, Feng-Rong; Bi, Xiao-Yang; Wang, Xia; Shao, Kang
2017-07-01
Numerous vibration-based techniques are rarely used in diesel engines fault diagnosis in a direct way, due to the surface vibration signals of diesel engines with the complex non-stationary and nonlinear time-varying features. To investigate the fault diagnosis of diesel engines, fractal correlation dimension, wavelet energy and entropy as features reflecting the diesel engine fault fractal and energy characteristics are extracted from the decomposed signals through analyzing vibration acceleration signals derived from the cylinder head in seven different states of valve train. An intelligent fault detector FastICA-SVM is applied for diesel engine fault diagnosis and classification. The results demonstrate that FastICA-SVM achieves higher classification accuracy and makes better generalization performance in small samples recognition. Besides, the fractal correlation dimension and wavelet energy and entropy as the special features of diesel engine vibration signal are considered as input vectors of classifier FastICA-SVM and could produce the excellent classification results. The proposed methodology improves the accuracy of feature extraction and the fault diagnosis of diesel engines.
Supervised Learning Applied to Air Traffic Trajectory Classification
NASA Technical Reports Server (NTRS)
Bosson, Christabelle S.; Nikoleris, Tasos
2018-01-01
Given the recent increase of interest in introducing new vehicle types and missions into the National Airspace System, a transition towards a more autonomous air traffic control system is required in order to enable and handle increased density and complexity. This paper presents an exploratory effort of the needed autonomous capabilities by exploring supervised learning techniques in the context of aircraft trajectories. In particular, it focuses on the application of machine learning algorithms and neural network models to a runway recognition trajectory-classification study. It investigates the applicability and effectiveness of various classifiers using datasets containing trajectory records for a month of air traffic. A feature importance and sensitivity analysis are conducted to challenge the chosen time-based datasets and the ten selected features. The study demonstrates that classification accuracy levels of 90% and above can be reached in less than 40 seconds of training for most machine learning classifiers when one track data point, described by the ten selected features at a particular time step, per trajectory is used as input. It also shows that neural network models can achieve similar accuracy levels but at higher training time costs.
Classification of hyperspectral imagery with neural networks: comparison to conventional tools
NASA Astrophysics Data System (ADS)
Merényi, Erzsébet; Farrand, William H.; Taranik, James V.; Minor, Timothy B.
2014-12-01
Efficient exploitation of hyperspectral imagery is of great importance in remote sensing. Artificial intelligence approaches have been receiving favorable reviews for classification of hyperspectral data because the complexity of such data challenges the limitations of many conventional methods. Artificial neural networks (ANNs) were shown to outperform traditional classifiers in many situations. However, studies that use the full spectral dimensionality of hyperspectral images to classify a large number of surface covers are scarce if non-existent. We advocate the need for methods that can handle the full dimensionality and a large number of classes to retain the discovery potential and the ability to discriminate classes with subtle spectral differences. We demonstrate that such a method exists in the family of ANNs. We compare the maximum likelihood, Mahalonobis distance, minimum distance, spectral angle mapper, and a hybrid ANN classifier for real hyperspectral AVIRIS data, using the full spectral resolution to map 23 cover types and using a small training set. Rigorous evaluation of the classification accuracies shows that the ANN outperforms the other methods and achieves ≈90% accuracy on test data.
Sleep state classification using pressure sensor mats.
Baran Pouyan, M; Nourani, M; Pompeo, M
2015-08-01
Sleep state detection is valuable in assessing patient's sleep quality and in-bed general behavior. In this paper, a novel classification approach of sleep states (sleep, pre-wake, wake) is proposed that uses only surface pressure sensors. In our method, a mobility metric is defined based on successive pressure body maps. Then, suitable statistical features are computed based on the mobility metric. Finally, a customized random forest classifier is employed to identify various classes including a new class for pre-wake state. Our algorithm achieves 96.1% and 88% accuracies for two (sleep, wake) and three (sleep, pre-wake, wake) class identification, respectively.
Study of USGS/NASA land use classification system. [computer analysis from LANDSAT data
NASA Technical Reports Server (NTRS)
Spann, G. W.
1975-01-01
The results of a computer mapping project using LANDSAT data and the USGS/NASA land use classification system are summarized. During the computer mapping portion of the project, accuracies of 67 percent to 79 percent were achieved using Level II of the classification system and a 4,000 acre test site centered on Douglasville, Georgia. Analysis of response to a questionaire circulated to actual and potential LANDSAT data users reveals several important findings: (1) there is a substantial desire for additional information related to LANDSAT capabilities; (2) a majority of the respondents feel computer mapping from LANDSAT data could aid present or future projects; and (3) the costs of computer mapping are substantially less than those of other methods.
Fine-grained leukocyte classification with deep residual learning for microscopic images.
Qin, Feiwei; Gao, Nannan; Peng, Yong; Wu, Zizhao; Shen, Shuying; Grudtsin, Artur
2018-08-01
Leukocyte classification and cytometry have wide applications in medical domain, previous researches usually exploit machine learning techniques to classify leukocytes automatically. However, constrained by the past development of machine learning techniques, for example, extracting distinctive features from raw microscopic images are difficult, the widely used SVM classifier only has relative few parameters to tune, these methods cannot efficiently handle fine-grained classification cases when the white blood cells have up to 40 categories. Based on deep learning theory, a systematic study is conducted on finer leukocyte classification in this paper. A deep residual neural network based leukocyte classifier is constructed at first, which can imitate the domain expert's cell recognition process, and extract salient features robustly and automatically. Then the deep neural network classifier's topology is adjusted according to the prior knowledge of white blood cell test. After that the microscopic image dataset with almost one hundred thousand labeled leukocytes belonging to 40 categories is built, and combined training strategies are adopted to make the designed classifier has good generalization ability. The proposed deep residual neural network based classifier was tested on microscopic image dataset with 40 leukocyte categories. It achieves top-1 accuracy of 77.80%, top-5 accuracy of 98.75% during the training procedure. The average accuracy on the test set is nearly 76.84%. This paper presents a fine-grained leukocyte classification method for microscopic images, based on deep residual learning theory and medical domain knowledge. Experimental results validate the feasibility and effectiveness of our approach. Extended experiments support that the fine-grained leukocyte classifier could be used in real medical applications, assist doctors in diagnosing diseases, reduce human power significantly. Copyright © 2018 Elsevier B.V. All rights reserved.
Ertosun, Mehmet Günhan; Rubin, Daniel L
2015-01-01
Brain glioma is the most common primary malignant brain tumors in adults with different pathologic subtypes: Lower Grade Glioma (LGG) Grade II, Lower Grade Glioma (LGG) Grade III, and Glioblastoma Multiforme (GBM) Grade IV. The survival and treatment options are highly dependent of this glioma grade. We propose a deep learning-based, modular classification pipeline for automated grading of gliomas using digital pathology images. Whole tissue digitized images of pathology slides obtained from The Cancer Genome Atlas (TCGA) were used to train our deep learning modules. Our modular pipeline provides diagnostic quality statistics, such as precision, sensitivity and specificity, of the individual deep learning modules, and (1) facilitates training given the limited data in this domain, (2) enables exploration of different deep learning structures for each module, (3) leads to developing less complex modules that are simpler to analyze, and (4) provides flexibility, permitting use of single modules within the framework or use of other modeling or machine learning applications, such as probabilistic graphical models or support vector machines. Our modular approach helps us meet the requirements of minimum accuracy levels that are demanded by the context of different decision points within a multi-class classification scheme. Convolutional Neural Networks are trained for each module for each sub-task with more than 90% classification accuracies on validation data set, and achieved classification accuracy of 96% for the task of GBM vs LGG classification, 71% for further identifying the grade of LGG into Grade II or Grade III on independent data set coming from new patients from the multi-institutional repository.
Intelligent feature selection techniques for pattern classification of Lamb wave signals
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hinders, Mark K.; Miller, Corey A.
2014-02-18
Lamb wave interaction with flaws is a complex, three-dimensional phenomenon, which often frustrates signal interpretation schemes based on mode arrival time shifts predicted by dispersion curves. As the flaw severity increases, scattering and mode conversion effects will often dominate the time-domain signals, obscuring available information about flaws because multiple modes may arrive on top of each other. Even for idealized flaw geometries the scattering and mode conversion behavior of Lamb waves is very complex. Here, multi-mode Lamb waves in a metal plate are propagated across a rectangular flat-bottom hole in a sequence of pitch-catch measurements corresponding to the double crossholemore » tomography geometry. The flaw is sequentially deepened, with the Lamb wave measurements repeated at each flaw depth. Lamb wave tomography reconstructions are used to identify which waveforms have interacted with the flaw and thereby carry information about its depth. Multiple features are extracted from each of the Lamb wave signals using wavelets, which are then fed to statistical pattern classification algorithms that identify flaw severity. In order to achieve the highest classification accuracy, an optimal feature space is required but it’s never known a priori which features are going to be best. For structural health monitoring we make use of the fact that physical flaws, such as corrosion, will only increase over time. This allows us to identify feature vectors which are topologically well-behaved by requiring that sequential classes “line up” in feature vector space. An intelligent feature selection routine is illustrated that identifies favorable class distributions in multi-dimensional feature spaces using computational homology theory. Betti numbers and formal classification accuracies are calculated for each feature space subset to establish a correlation between the topology of the class distribution and the corresponding classification accuracy.« less
Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi
2016-05-01
Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using an field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k -means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k -means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.
Assessment of pedophilia using hemodynamic brain response to sexual stimuli.
Ponseti, Jorge; Granert, Oliver; Jansen, Olav; Wolff, Stephan; Beier, Klaus; Neutze, Janina; Deuschl, Günther; Mehdorn, Hubertus; Siebner, Hartwig; Bosinski, Hartmut
2012-02-01
Accurately assessing sexual preference is important in the treatment of child sex offenders. Phallometry is the standard method to identify sexual preference; however, this measure has been criticized for its intrusiveness and limited reliability. To evaluate whether spatial response pattern to sexual stimuli as revealed by a change in the blood oxygen level-dependent signal facilitates the identification of pedophiles. During functional magnetic resonance imaging, pedophilic and nonpedophilic participants were briefly exposed to same- and opposite-sex images of nude children and adults. We calculated differences in blood oxygen level-dependent signals to child and adult sexual stimuli for each participant. The corresponding contrast images were entered into a group analysis to calculate whole-brain difference maps between groups. We calculated an expression value that corresponded to the group result for each participant. These expression values were submitted to 2 different classification algorithms: Fisher linear discriminant analysis and κ -nearest neighbor analysis. This classification procedure was cross-validated using the leave-one-out method. Section of Sexual Medicine, Medical School, Christian Albrechts University of Kiel, Kiel, Germany. We recruited 24 participants with pedophilia who were sexually attracted to either prepubescent girls (n = 11) or prepubescent boys (n = 13) and 32 healthy male controls who were sexually attracted to either adult women (n = 18) or adult men (n = 14). Sensitivity and specificity scores of the 2 classification algorithms. The highest classification accuracy was achieved by Fisher linear discriminant analysis, which showed a mean accuracy of 95% (100% specificity, 88% sensitivity). Functional brain response patterns to sexual stimuli contain sufficient information to identify pedophiles with high accuracy. The automatic classification of these patterns is a promising objective tool to clinically diagnose pedophilia.
Ertosun, Mehmet Günhan; Rubin, Daniel L.
2015-01-01
Brain glioma is the most common primary malignant brain tumors in adults with different pathologic subtypes: Lower Grade Glioma (LGG) Grade II, Lower Grade Glioma (LGG) Grade III, and Glioblastoma Multiforme (GBM) Grade IV. The survival and treatment options are highly dependent of this glioma grade. We propose a deep learning-based, modular classification pipeline for automated grading of gliomas using digital pathology images. Whole tissue digitized images of pathology slides obtained from The Cancer Genome Atlas (TCGA) were used to train our deep learning modules. Our modular pipeline provides diagnostic quality statistics, such as precision, sensitivity and specificity, of the individual deep learning modules, and (1) facilitates training given the limited data in this domain, (2) enables exploration of different deep learning structures for each module, (3) leads to developing less complex modules that are simpler to analyze, and (4) provides flexibility, permitting use of single modules within the framework or use of other modeling or machine learning applications, such as probabilistic graphical models or support vector machines. Our modular approach helps us meet the requirements of minimum accuracy levels that are demanded by the context of different decision points within a multi-class classification scheme. Convolutional Neural Networks are trained for each module for each sub-task with more than 90% classification accuracies on validation data set, and achieved classification accuracy of 96% for the task of GBM vs LGG classification, 71% for further identifying the grade of LGG into Grade II or Grade III on independent data set coming from new patients from the multi-institutional repository. PMID:26958289
Prahm, Cosima; Eckstein, Korbinian; Ortiz-Catalan, Max; Dorffner, Georg; Kaniusas, Eugenijus; Aszmann, Oskar C
2016-08-31
Controlling a myoelectric prosthesis for upper limbs is increasingly challenging for the user as more electrodes and joints become available. Motion classification based on pattern recognition with a multi-electrode array allows multiple joints to be controlled simultaneously. Previous pattern recognition studies are difficult to compare, because individual research groups use their own data sets. To resolve this shortcoming and to facilitate comparisons, open access data sets were analysed using components of BioPatRec and Netlab pattern recognition models. Performances of the artificial neural networks, linear models, and training program components were compared. Evaluation took place within the BioPatRec environment, a Matlab-based open source platform that provides feature extraction, processing and motion classification algorithms for prosthetic control. The algorithms were applied to myoelectric signals for individual and simultaneous classification of movements, with the aim of finding the best performing algorithm and network model. Evaluation criteria included classification accuracy and training time. Results in both the linear and the artificial neural network models demonstrated that Netlab's implementation using scaled conjugate training algorithm reached significantly higher accuracies than BioPatRec. It is concluded that the best movement classification performance would be achieved through integrating Netlab training algorithms in the BioPatRec environment so that future prosthesis training can be shortened and control made more reliable. Netlab was therefore included into the newest release of BioPatRec (v4.0).
Dissimilarity representations in lung parenchyma classification
NASA Astrophysics Data System (ADS)
Sørensen, Lauge; de Bruijne, Marleen
2009-02-01
A good problem representation is important for a pattern recognition system to be successful. The traditional approach to statistical pattern recognition is feature representation. More specifically, objects are represented by a number of features in a feature vector space, and classifiers are built in this representation. This is also the general trend in lung parenchyma classification in computed tomography (CT) images, where the features often are measures on feature histograms. Instead, we propose to build normal density based classifiers in dissimilarity representations for lung parenchyma classification. This allows for the classifiers to work on dissimilarities between objects, which might be a more natural way of representing lung parenchyma. In this context, dissimilarity is defined between CT regions of interest (ROI)s. ROIs are represented by their CT attenuation histogram and ROI dissimilarity is defined as a histogram dissimilarity measure between the attenuation histograms. In this setting, the full histograms are utilized according to the chosen histogram dissimilarity measure. We apply this idea to classification of different emphysema patterns as well as normal, healthy tissue. Two dissimilarity representation approaches as well as different histogram dissimilarity measures are considered. The approaches are evaluated on a set of 168 CT ROIs using normal density based classifiers all showing good performance. Compared to using histogram dissimilarity directly as distance in a emph{k} nearest neighbor classifier, which achieves a classification accuracy of 92.9%, the best dissimilarity representation based classifier is significantly better with a classification accuracy of 97.0% (text{emph{p" border="0" class="imgtopleft"> = 0.046).
Castro, Eduardo; Martínez-Ramón, Manel; Pearlson, Godfrey; Sui, Jing; Calhoun, Vince D.
2011-01-01
Pattern classification of brain imaging data can enable the automatic detection of differences in cognitive processes of specific groups of interest. Furthermore, it can also give neuroanatomical information related to the regions of the brain that are most relevant to detect these differences by means of feature selection procedures, which are also well-suited to deal with the high dimensionality of brain imaging data. This work proposes the application of recursive feature elimination using a machine learning algorithm based on composite kernels to the classification of healthy controls and patients with schizophrenia. This framework, which evaluates nonlinear relationships between voxels, analyzes whole-brain fMRI data from an auditory task experiment that is segmented into anatomical regions and recursively eliminates the uninformative ones based on their relevance estimates, thus yielding the set of most discriminative brain areas for group classification. The collected data was processed using two analysis methods: the general linear model (GLM) and independent component analysis (ICA). GLM spatial maps as well as ICA temporal lobe and default mode component maps were then input to the classifier. A mean classification accuracy of up to 95% estimated with a leave-two-out cross-validation procedure was achieved by doing multi-source data classification. In addition, it is shown that the classification accuracy rate obtained by using multi-source data surpasses that reached by using single-source data, hence showing that this algorithm takes advantage of the complimentary nature of GLM and ICA. PMID:21723948
Variations in the Intragene Methylation Profiles Hallmark Induced Pluripotency
Druzhkov, Pavel; Zolotykh, Nikolay; Meyerov, Iosif; Alsaedi, Ahmed; Shutova, Maria; Ivanchenko, Mikhail; Zaikin, Alexey
2015-01-01
We demonstrate the potential of differentiating embryonic and induced pluripotent stem cells by the regularized linear and decision tree machine learning classification algorithms, based on a number of intragene methylation measures. The resulting average accuracy of classification has been proven to be above 95%, which overcomes the earlier achievements. We propose a constructive and transparent method of feature selection based on classifier accuracy. Enrichment analysis reveals statistically meaningful presence of stemness group and cancer discriminating genes among the selected best classifying features. These findings stimulate the further research on the functional consequences of these differences in methylation patterns. The presented approach can be broadly used to discriminate the cells of different phenotype or in different state by their methylation profiles, identify groups of genes constituting multifeature classifiers, and assess enrichment of these groups by the sets of genes with a functionality of interest. PMID:26618180
seXY: a tool for sex inference from genotype arrays.
Qian, David C; Busam, Jonathan A; Xiao, Xiangjun; O'Mara, Tracy A; Eeles, Rosalind A; Schumacher, Frederick R; Phelan, Catherine M; Amos, Christopher I
2017-02-15
Checking concordance between reported sex and genotype-inferred sex is a crucial quality control measure in genome-wide association studies (GWAS). However, limited insights exist regarding the true accuracy of software that infer sex from genotype array data. We present seXY, a logistic regression model trained on both X chromosome heterozygosity and Y chromosome missingness, that consistently demonstrated >99.5% sex inference accuracy in cross-validation for 889 males and 5,361 females enrolled in prostate cancer and ovarian cancer GWAS. Compared to PLINK, one of the most popular tools for sex inference in GWAS that assesses only X chromosome heterozygosity, seXY achieved marginally better male classification and 3% more accurate female classification. https://github.com/Christopher-Amos-Lab/seXY. Christopher.I.Amos@dartmouth.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Multiple Hypotheses Image Segmentation and Classification With Application to Dietary Assessment
Zhu, Fengqing; Bosch, Marc; Khanna, Nitin; Boushey, Carol J.; Delp, Edward J.
2016-01-01
We propose a method for dietary assessment to automatically identify and locate food in a variety of images captured during controlled and natural eating events. Two concepts are combined to achieve this: a set of segmented objects can be partitioned into perceptually similar object classes based on global and local features; and perceptually similar object classes can be used to assess the accuracy of image segmentation. These ideas are implemented by generating multiple segmentations of an image to select stable segmentations based on the classifier’s confidence score assigned to each segmented image region. Automatic segmented regions are classified using a multichannel feature classification system. For each segmented region, multiple feature spaces are formed. Feature vectors in each of the feature spaces are individually classified. The final decision is obtained by combining class decisions from individual feature spaces using decision rules. We show improved accuracy of segmenting food images with classifier feedback. PMID:25561457
A SVM-based method for sentiment analysis in Persian language
NASA Astrophysics Data System (ADS)
Hajmohammadi, Mohammad Sadegh; Ibrahim, Roliana
2013-03-01
Persian language is the official language of Iran, Tajikistan and Afghanistan. Local online users often represent their opinions and experiences on the web with written Persian. Although the information in those reviews is valuable to potential consumers and sellers, the huge amount of web reviews make it difficult to give an unbiased evaluation to a product. In this paper, standard machine learning techniques SVM and naive Bayes are incorporated into the domain of online Persian Movie reviews to automatically classify user reviews as positive or negative and performance of these two classifiers is compared with each other in this language. The effects of feature presentations on classification performance are discussed. We find that accuracy is influenced by interaction between the classification models and the feature options. The SVM classifier achieves as well as or better accuracy than naive Bayes in Persian movie. Unigrams are proved better features than bigrams and trigrams in capturing Persian sentiment orientation.
Efficient HIK SVM learning for image classification.
Wu, Jianxin
2012-10-01
Histograms are used in almost every aspect of image processing and computer vision, from visual descriptors to image representations. Histogram intersection kernel (HIK) and support vector machine (SVM) classifiers are shown to be very effective in dealing with histograms. This paper presents contributions concerning HIK SVM for image classification. First, we propose intersection coordinate descent (ICD), a deterministic and scalable HIK SVM solver. ICD is much faster than, and has similar accuracies to, general purpose SVM solvers and other fast HIK SVM training methods. We also extend ICD to the efficient training of a broader family of kernels. Second, we show an important empirical observation that ICD is not sensitive to the C parameter in SVM, and we provide some theoretical analyses to explain this observation. ICD achieves high accuracies in many problems, using its default parameters. This is an attractive property for practitioners, because many image processing tasks are too large to choose SVM parameters using cross-validation.
Comparison of citrus orchard inventory using LISS-III and LISS-IV data
NASA Astrophysics Data System (ADS)
Singh, Niti; Chaudhari, K. N.; Manjunath, K. R.
2016-04-01
In India, in terms of area under cultivation, citrus is the third most cultivated fruit crop after Banana and Mango. Among citrus group, lime is one of the most important horticultural crops in India as the demand for its consumption is very high. Hence, preparing citrus crop inventories using remote sensing techniques would help in maintaining a record of its area and production statistics. This study shows how accurately citrus orchard can be classified using both IRS Resourcesat-2 LISS-III and LISS-IV data and depicts the optimum bio-widow for procuring satellite data to achieve high classification accuracy required for maintaining inventory of crop. Findings of the study show classification accuracy increased from 55% (using LISS-III) to 77% (using LISS-IV). Also, according to classified outputs and NDVI values obtained, April and May months were identified as optimum bio-window for citrus crop identification.
Three-dimensional object recognition using similar triangles and decision trees
NASA Technical Reports Server (NTRS)
Spirkovska, Lilly
1993-01-01
A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
Multiple hypotheses image segmentation and classification with application to dietary assessment.
Zhu, Fengqing; Bosch, Marc; Khanna, Nitin; Boushey, Carol J; Delp, Edward J
2015-01-01
We propose a method for dietary assessment to automatically identify and locate food in a variety of images captured during controlled and natural eating events. Two concepts are combined to achieve this: a set of segmented objects can be partitioned into perceptually similar object classes based on global and local features; and perceptually similar object classes can be used to assess the accuracy of image segmentation. These ideas are implemented by generating multiple segmentations of an image to select stable segmentations based on the classifier's confidence score assigned to each segmented image region. Automatic segmented regions are classified using a multichannel feature classification system. For each segmented region, multiple feature spaces are formed. Feature vectors in each of the feature spaces are individually classified. The final decision is obtained by combining class decisions from individual feature spaces using decision rules. We show improved accuracy of segmenting food images with classifier feedback.
Seeberg, Trine M.; Tjønnås, Johannes; Haugnes, Pål; Sandbakk, Øyvind
2017-01-01
The automatic classification of sub-techniques in classical cross-country skiing provides unique possibilities for analyzing the biomechanical aspects of outdoor skiing. This is currently possible due to the miniaturization and flexibility of wearable inertial measurement units (IMUs) that allow researchers to bring the laboratory to the field. In this study, we aimed to optimize the accuracy of the automatic classification of classical cross-country skiing sub-techniques by using two IMUs attached to the skier’s arm and chest together with a machine learning algorithm. The novelty of our approach is the reliable detection of individual cycles using a gyroscope on the skier’s arm, while a neural network machine learning algorithm robustly classifies each cycle to a sub-technique using sensor data from an accelerometer on the chest. In this study, 24 datasets from 10 different participants were separated into the categories training-, validation- and test-data. Overall, we achieved a classification accuracy of 93.9% on the test-data. Furthermore, we illustrate how an accurate classification of sub-techniques can be combined with data from standard sports equipment including position, altitude, speed and heart rate measuring systems. Combining this information has the potential to provide novel insight into physiological and biomechanical aspects valuable to coaches, athletes and researchers. PMID:29283421
Refining Time-Activity Classification of Human Subjects Using the Global Positioning System
Hu, Maogui; Li, Wei; Li, Lianfa; Houston, Douglas; Wu, Jun
2016-01-01
Background Detailed spatial location information is important in accurately estimating personal exposure to air pollution. Global Position System (GPS) has been widely used in tracking personal paths and activities. Previous researchers have developed time-activity classification models based on GPS data, most of them were developed for specific regions. An adaptive model for time-location classification can be widely applied to air pollution studies that use GPS to track individual level time-activity patterns. Methods Time-activity data were collected for seven days using GPS loggers and accelerometers from thirteen adult participants from Southern California under free living conditions. We developed an automated model based on random forests to classify major time-activity patterns (i.e. indoor, outdoor-static, outdoor-walking, and in-vehicle travel). Sensitivity analysis was conducted to examine the contribution of the accelerometer data and the supplemental spatial data (i.e. roadway and tax parcel data) to the accuracy of time-activity classification. Our model was evaluated using both leave-one-fold-out and leave-one-subject-out methods. Results Maximum speeds in averaging time intervals of 7 and 5 minutes, and distance to primary highways with limited access were found to be the three most important variables in the classification model. Leave-one-fold-out cross-validation showed an overall accuracy of 99.71%. Sensitivities varied from 84.62% (outdoor walking) to 99.90% (indoor). Specificities varied from 96.33% (indoor) to 99.98% (outdoor static). The exclusion of accelerometer and ambient light sensor variables caused a slight loss in sensitivity for outdoor walking, but little loss in overall accuracy. However, leave-one-subject-out cross-validation showed considerable loss in sensitivity for outdoor static and outdoor walking conditions. Conclusions The random forests classification model can achieve high accuracy for the four major time-activity categories. The model also performed well with just GPS, road and tax parcel data. However, caution is warranted when generalizing the model developed from a small number of subjects to other populations. PMID:26919723
Behavioral state classification in epileptic brain using intracranial electrophysiology
NASA Astrophysics Data System (ADS)
Kremen, Vaclav; Duque, Juliano J.; Brinkmann, Benjamin H.; Berry, Brent M.; Kucewicz, Michal T.; Khadjevand, Fatemeh; Van Gompel, Jamie; Stead, Matt; St. Louis, Erik K.; Worrell, Gregory A.
2017-04-01
Objective. Automated behavioral state classification can benefit next generation implantable epilepsy devices. In this study we explored the feasibility of automated awake (AW) and slow wave sleep (SWS) classification using wide bandwidth intracranial EEG (iEEG) in patients undergoing evaluation for epilepsy surgery. Approach. Data from seven patients (age 34+/- 12 , 4 women) who underwent intracranial depth electrode implantation for iEEG monitoring were included. Spectral power features (0.1-600 Hz) spanning several frequency bands from a single electrode were used to train and test a support vector machine classifier. Main results. Classification accuracy of 97.8 ± 0.3% (normal tissue) and 89.4 ± 0.8% (epileptic tissue) across seven subjects using multiple spectral power features from a single electrode was achieved. Spectral power features from electrodes placed in normal temporal neocortex were found to be more useful (accuracy 90.8 ± 0.8%) for sleep-wake state classification than electrodes located in normal hippocampus (87.1 ± 1.6%). Spectral power in high frequency band features (Ripple (80-250 Hz), Fast Ripple (250-600 Hz)) showed comparable performance for AW and SWS classification as the best performing Berger bands (Alpha, Beta, low Gamma) with accuracy ⩾90% using a single electrode contact and single spectral feature. Significance. Automated classification of wake and SWS should prove useful for future implantable epilepsy devices with limited computational power, memory, and number of electrodes. Applications include quantifying patient sleep patterns and behavioral state dependent detection, prediction, and electrical stimulation therapies.
Noise tolerant dendritic lattice associative memories
NASA Astrophysics Data System (ADS)
Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric; Tucker, Marc
2011-09-01
Linear classifiers based on computation over the real numbers R (e.g., with operations of addition and multiplication) denoted by (R, +, x), have been represented extensively in the literature of pattern recognition. However, a different approach to pattern classification involves the use of addition, maximum, and minimum operations over the reals in the algebra (R, +, maximum, minimum) These pattern classifiers, based on lattice algebra, have been shown to exhibit superior information storage capacity, fast training and short convergence times, high pattern classification accuracy, and low computational cost. Such attributes are not always found, for example, in classical neural nets based on the linear inner product. In a special type of lattice associative memory (LAM), called a dendritic LAM or DLAM, it is possible to achieve noise-tolerant pattern classification by varying the design of noise or error acceptance bounds. This paper presents theory and algorithmic approaches for the computation of noise-tolerant lattice associative memories (LAMs) under a variety of input constraints. Of particular interest are the classification of nonergodic data in noise regimes with time-varying statistics. DLAMs, which are a specialization of LAMs derived from concepts of biological neural networks, have successfully been applied to pattern classification from hyperspectral remote sensing data, as well as spatial object recognition from digital imagery. The authors' recent research in the development of DLAMs is overviewed, with experimental results that show utility for a wide variety of pattern classification applications. Performance results are presented in terms of measured computational cost, noise tolerance, classification accuracy, and throughput for a variety of input data and noise levels.
Guo, Lilin; Wang, Zhenzhong; Cabrerizo, Mercedes; Adjouadi, Malek
2017-05-01
This study introduces a novel learning algorithm for spiking neurons, called CCDS, which is able to learn and reproduce arbitrary spike patterns in a supervised fashion allowing the processing of spatiotemporal information encoded in the precise timing of spikes. Unlike the Remote Supervised Method (ReSuMe), synapse delays and axonal delays in CCDS are variants which are modulated together with weights during learning. The CCDS rule is both biologically plausible and computationally efficient. The properties of this learning rule are investigated extensively through experimental evaluations in terms of reliability, adaptive learning performance, generality to different neuron models, learning in the presence of noise, effects of its learning parameters and classification performance. Results presented show that the CCDS learning method achieves learning accuracy and learning speed comparable with ReSuMe, but improves classification accuracy when compared to both the Spike Pattern Association Neuron (SPAN) learning rule and the Tempotron learning rule. The merit of CCDS rule is further validated on a practical example involving the automated detection of interictal spikes in EEG records of patients with epilepsy. Results again show that with proper encoding, the CCDS rule achieves good recognition performance.
NASA Astrophysics Data System (ADS)
Hishe, Hadgu; Giday, Kidane; Neka, Mulugeta; Soromessa, Teshome; Van Orshoven, Jos; Muys, Bart
2015-01-01
Comprehensive and less costly forest inventory approaches are required to monitor the spatiotemporal dynamics of key species in forest ecosystems. Subpixel analysis using the earth resources data analysis system imagine subpixel classification procedure was tested to extract Olea europaea subsp. cuspidata and Juniperus procera canopies from Landsat 7 enhanced thematic mapper plus imagery. Control points with various canopy area fractions of the target species were collected to develop signatures for each of the species. With these signatures, the imagine subpixel classification procedure was run for each species independently. The subpixel process enabled the detection of O. europaea subsp. cuspidata and J. procera trees in pure and mixed pixels. Total of 100 pixels each were field verified for both species. An overall accuracy of 85% was achieved for O. europaea subsp. cuspidata and 89% for J. procera. A high overall accuracy level of detecting species at a natural forest was achieved, which encourages using the algorithm for future species monitoring activities. We recommend that the algorithm has to be validated in similar environment to enrich the knowledge on its capability to ensure its wider usage.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery.
Li, Guiying; Lu, Dengsheng; Moran, Emilio; Hetrick, Scott
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms - maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery
LI, GUIYING; LU, DENGSHENG; MORAN, EMILIO; HETRICK, SCOTT
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms – maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes. PMID:22368311
Affective Computing and the Impact of Gender and Age
Rukavina, Stefanie; Gruss, Sascha; Hoffmann, Holger; Tan, Jun-Wen; Walter, Steffen; Traue, Harald C.
2016-01-01
Affective computing aims at the detection of users’ mental states, in particular, emotions and dispositions during human-computer interactions. Detection can be achieved by measuring multimodal signals, namely, speech, facial expressions and/or psychobiology. Over the past years, one major approach was to identify the best features for each signal using different classification methods. Although this is of high priority, other subject-specific variables should not be neglected. In our study, we analyzed the effect of gender, age, personality and gender roles on the extracted psychobiological features (derived from skin conductance level, facial electromyography and heart rate variability) as well as the influence on the classification results. In an experimental human-computer interaction, five different affective states with picture material from the International Affective Picture System and ULM pictures were induced. A total of 127 subjects participated in the study. Among all potentially influencing variables (gender has been reported to be influential), age was the only variable that correlated significantly with psychobiological responses. In summary, the conducted classification processes resulted in 20% classification accuracy differences according to age and gender, especially when comparing the neutral condition with four other affective states. We suggest taking age and gender specifically into account for future studies in affective computing, as these may lead to an improvement of emotion recognition accuracy. PMID:26939129
Application of texture analysis method for mammogram density classification
NASA Astrophysics Data System (ADS)
Nithya, R.; Santhi, B.
2017-07-01
Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.
Goshvarpour, Ateke; Goshvarpour, Atefeh
2018-04-30
Heart rate variability (HRV) analysis has become a widely used tool for monitoring pathological and psychological states in medical applications. In a typical classification problem, information fusion is a process whereby the effective combination of the data can achieve a more accurate system. The purpose of this article was to provide an accurate algorithm for classifying HRV signals in various psychological states. Therefore, a novel feature level fusion approach was proposed. First, using the theory of information, two similarity indicators of the signal were extracted, including correntropy and Cauchy-Schwarz divergence. Applying probabilistic neural network (PNN) and k-nearest neighbor (kNN), the performance of each index in the classification of meditators and non-meditators HRV signals was appraised. Then, three fusion rules, including division, product, and weighted sum rules were used to combine the information of both similarity measures. For the first time, we propose an algorithm to define the weights of each feature based on the statistical p-values. The performance of HRV classification using combined features was compared with the non-combined features. Totally, the accuracy of 100% was obtained for discriminating all states. The results showed the strong ability and proficiency of division and weighted sum rules in the improvement of the classifier accuracies.
Classification of right-hand grasp movement based on EMOTIV Epoc+
NASA Astrophysics Data System (ADS)
Tobing, T. A. M. L.; Prawito, Wijaya, S. K.
2017-07-01
Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for best classification accuracy of right-hand grasp movement based on EEG headset, EMOTIV Epoc+. There are three movement classifications: grasping hand, relax, and opening hand. These classifications take advantage of Event-Related Desynchronization (ERD) phenomenon that makes it possible to differ relaxation, imagery, and movement state from each other. The combinations of elements are the usage of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequency as features, and also classifier Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average values of classification accuracy are ± 83% for training and ± 57% for testing. To have a better understanding of the signal quality recorded by EMOTIV Epoc+, the result of classification accuracy of left or right-hand grasping movement EEG signal (provided by Physionet) also be given, i.e.± 85% for training and ± 70% for testing. The comparison of accuracy value from each combination, experiment condition, and external EEG data are provided for the purpose of value analysis of classification accuracy.
NASA Astrophysics Data System (ADS)
Xiao, Zhitao; Leng, Yanyi; Geng, Lei; Xi, Jiangtao
2018-04-01
In this paper, a new convolution neural network method is proposed for the inspection and classification of galvanized stamping parts. Firstly, all workpieces are divided into normal and defective by image processing, and then the defective workpieces extracted from the region of interest (ROI) area are input to the trained fully convolutional networks (FCN). The network utilizes an end-to-end and pixel-to-pixel training convolution network that is currently the most advanced technology in semantic segmentation, predicts result of each pixel. Secondly, we mark the different pixel values of the workpiece, defect and background for the training image, and use the pixel value and the number of pixels to realize the recognition of the defects of the output picture. Finally, the defect area's threshold depended on the needs of the project is set to achieve the specific classification of the workpiece. The experiment results show that the proposed method can successfully achieve defect detection and classification of galvanized stamping parts under ordinary camera and illumination conditions, and its accuracy can reach 99.6%. Moreover, it overcomes the problem of complex image preprocessing and difficult feature extraction and performs better adaptability.
NASA Astrophysics Data System (ADS)
Hashimoto, Noriaki; Suzuki, Kenji; Liu, Junchi; Hirano, Yasushi; MacMahon, Heber; Kido, Shoji
2018-02-01
Consolidation and ground-glass opacity (GGO) are two major types of opacities associated with diffuse lung diseases. Accurate detection and classification of such opacities are crucially important in the diagnosis of lung diseases, but the process is subjective, and suffers from interobserver variability. Our study purpose was to develop a deep neural network convolution (NNC) system for distinguishing among consolidation, GGO, and normal lung tissue in high-resolution CT (HRCT). We developed ensemble of two deep NNC models, each of which was composed of neural network regression (NNR) with an input layer, a convolution layer, a fully-connected hidden layer, and a fully-connected output layer followed by a thresholding layer. The output layer of each NNC provided a map for the likelihood of being each corresponding lung opacity of interest. The two NNC models in the ensemble were connected in a class-selection layer. We trained our NNC ensemble with pairs of input 2D axial slices and "teaching" probability maps for the corresponding lung opacity, which were obtained by combining three radiologists' annotations. We randomly selected 10 and 40 slices from HRCT scans of 172 patients for each class as a training and test set, respectively. Our NNC ensemble achieved an area under the receiver-operating-characteristic (ROC) curve (AUC) of 0.981 and 0.958 in distinction of consolidation and GGO, respectively, from normal opacity, yielding a classification accuracy of 93.3% among 3 classes. Thus, our deep-NNC-based system for classifying diffuse lung diseases achieved high accuracies for classification of consolidation, GGO, and normal opacity.
Beevi, K Sabeena; Nair, Madhu S; Bindu, G R
2016-08-01
The exact measure of mitotic nuclei is a crucial parameter in breast cancer grading and prognosis. This can be achieved by improving the mitotic detection accuracy by careful design of segmentation and classification techniques. In this paper, segmentation of nuclei from breast histopathology images are carried out by Localized Active Contour Model (LACM) utilizing bio-inspired optimization techniques in the detection stage, in order to handle diffused intensities present along object boundaries. Further, the application of a new optimal machine learning algorithm capable of classifying strong non-linear data such as Random Kitchen Sink (RKS), shows improved classification performance. The proposed method has been tested on Mitosis detection in breast cancer histological images (MITOS) dataset provided for MITOS-ATYPIA CONTEST 2014. The proposed framework achieved 95% recall, 98% precision and 96% F-score.
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation.
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-12-16
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis is one of the research hotspots recently. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there must be some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noises, while traditional cloud classification methods are sensitive to noises and outliers; it is hard for traditional cloud classification methods to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experiment results on FY-2G satellite cloud image show that, the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency.
Segmentation schema for enhancing land cover identification: A case study using Sentinel 2 data
NASA Astrophysics Data System (ADS)
Mongus, Domen; Žalik, Borut
2018-04-01
Land monitoring is performed increasingly using high and medium resolution optical satellites, such as the Sentinel-2. However, optical data is inevitably subjected to the variable operational conditions under which it was acquired. Overlapping of features caused by shadows, soft transitions between shadowed and non-shadowed regions, and temporal variability of the observed land-cover types require radiometric corrections. This study examines a new approach to enhancing the accuracy of land cover identification that resolves this problem. The proposed method constructs an ensemble-type classification model with weak classifiers tuned to the particular operational conditions under which the data was acquired. Iterative segmentation over the learning set is applied for this purpose, where feature space is partitioned according to the likelihood of misclassifications introduced by the classification model. As these are a consequence of overlapping features, such partitioning avoids the need for radiometric corrections of the data, and divides land cover types implicitly into subclasses. As a result, improved performance of all tested classification approaches were measured during the validation that was conducted on Sentinel-2 data. The highest accuracies in terms of F1-scores were achieved using the Naive Bayes Classifier as the weak classifier, while supplementing original spectral signatures with normalised difference vegetation index and texture analysis features, namely, average intensity, contrast, homogeneity, and dissimilarity. In total, an F1-score of nearly 95% was achieved in this way, with F1-scores of each particular land cover type reaching above 90%.
Ludwig, Simone A; Kong, Jun
2017-12-01
Innovative methods and new technologies have significantly improved the quality of our daily life. However, disabled people, for example those that cannot use their arms and legs anymore, often cannot benefit from these developments, since they cannot use their hands to interact with traditional interaction methods (such as mouse or keyboard) to communicate with a computer system. A brain-computer interface (BCI) system allows such a disabled person to control an external device via brain waves. Past research mostly dealt with static interfaces, which limit users to a stationary location. However, since we are living in a world that is highly mobile, this paper evaluates a speller interface on a mobile phone used in a moving condition. The spelling experiments were conducted with 14 able-bodied subjects using visual flashes as the stimulus to spell 47 alphanumeric characters (38 letters and 9 numbers). This data was then used for the classification experiments. In par- ticular, two research directions are pursued. The first investigates the impact of different classification algorithms, and the second direction looks at the channel configuration, i.e., which channels are most beneficial in terms of achieving the highest classification accuracy. The evaluation results indicate that the Bayesian Linear Discriminant Analysis algorithm achieves the best accuracy. Also, the findings of the investigation on the channel configuration, which can potentially reduce the amount of data processing on a mobile device with limited computing capacity, is especially useful in mobile BCIs.
NASA Astrophysics Data System (ADS)
Musa Abbagoni, Baba; Yeung, Hoi
2016-08-01
The identification of flow pattern is a key issue in multiphase flow which is encountered in the petrochemical industry. It is difficult to identify the gas-liquid flow regimes objectively with the gas-liquid two-phase flow. This paper presents the feasibility of a clamp-on instrument for an objective flow regime classification of two-phase flow using an ultrasonic Doppler sensor and an artificial neural network, which records and processes the ultrasonic signals reflected from the two-phase flow. Experimental data is obtained on a horizontal test rig with a total pipe length of 21 m and 5.08 cm internal diameter carrying air-water two-phase flow under slug, elongated bubble, stratified-wavy and, stratified flow regimes. Multilayer perceptron neural networks (MLPNNs) are used to develop the classification model. The classifier requires features as an input which is representative of the signals. Ultrasound signal features are extracted by applying both power spectral density (PSD) and discrete wavelet transform (DWT) methods to the flow signals. A classification scheme of ‘1-of-C coding method for classification’ was adopted to classify features extracted into one of four flow regime categories. To improve the performance of the flow regime classifier network, a second level neural network was incorporated by using the output of a first level networks feature as an input feature. The addition of the two network models provided a combined neural network model which has achieved a higher accuracy than single neural network models. Classification accuracies are evaluated in the form of both the PSD and DWT features. The success rates of the two models are: (1) using PSD features, the classifier missed 3 datasets out of 24 test datasets of the classification and scored 87.5% accuracy; (2) with the DWT features, the network misclassified only one data point and it was able to classify the flow patterns up to 95.8% accuracy. This approach has demonstrated the success of a clamp-on ultrasound sensor for flow regime classification that would be possible in industry practice. It is considerably more promising than other techniques as it uses a non-invasive and non-radioactive sensor.
Classification of EEG signals using a genetic-based machine learning classifier.
Skinner, B T; Nguyen, H T; Liu, D K
2007-01-01
This paper investigates the efficacy of the genetic-based learning classifier system XCS, for the classification of noisy, artefact-inclusive human electroencephalogram (EEG) signals represented using large condition strings (108bits). EEG signals from three participants were recorded while they performed four mental tasks designed to elicit hemispheric responses. Autoregressive (AR) models and Fast Fourier Transform (FFT) methods were used to form feature vectors with which mental tasks can be discriminated. XCS achieved a maximum classification accuracy of 99.3% and a best average of 88.9%. The relative classification performance of XCS was then compared against four non-evolutionary classifier systems originating from different learning techniques. The experimental results will be used as part of our larger research effort investigating the feasibility of using EEG signals as an interface to allow paralysed persons to control a powered wheelchair or other devices.
NASA Astrophysics Data System (ADS)
Hizukuri, Akiyoshi; Nagata, Takeshi
2017-03-01
The purpose of this study is to develop a classification method for a crack on a pavement surface image using machine learning to reduce a maintenance fee. Our database consists of 3500 pavement surface images. This includes 800 crack and 2700 normal pavement surface images. The pavement surface images first are decomposed into several sub-images using a discrete wavelet transform (DWT) decomposition. We then calculate the wavelet sub-band histogram from each several sub-images at each level. The support vector machine (SVM) with computed wavelet sub-band histogram is employed for distinguishing between a crack and normal pavement surface images. The accuracies of the proposed classification method are 85.3% for crack and 84.4% for normal pavement images. The proposed classification method achieved high performance. Therefore, the proposed method would be useful in maintenance inspection.
Hybrid EEG-EOG brain-computer interface system for practical machine control.
Punsawad, Yunyong; Wongsawat, Yodchanan; Parnichkun, Manukid
2010-01-01
Practical issues such as accuracy with various subjects, number of sensors, and time for training are important problems of existing brain-computer interface (BCI) systems. In this paper, we propose a hybrid framework for the BCI system that can make machine control more practical. The electrooculogram (EOG) is employed to control the machine in the left and right directions while the electroencephalogram (EEG) is employed to control the forword, no action, and complete stop motions of the machine. By using only 2-channel biosignals, the average classification accuracy of more than 95% can be achieved.
NASA Astrophysics Data System (ADS)
Sreejith, Sreevarsha; Pereverzyev, Sergiy, Jr.; Kelvin, Lee S.; Marleau, Francine R.; Haltmeier, Markus; Ebner, Judith; Bland-Hawthorn, Joss; Driver, Simon P.; Graham, Alister W.; Holwerda, Benne W.; Hopkins, Andrew M.; Liske, Jochen; Loveday, Jon; Moffett, Amanda J.; Pimbblet, Kevin A.; Taylor, Edward N.; Wang, Lingyu; Wright, Angus H.
2018-03-01
We apply four statistical learning methods to a sample of 7941 galaxies (z < 0.06) from the Galaxy And Mass Assembly survey to test the feasibility of using automated algorithms to classify galaxies. Using 10 features measured for each galaxy (sizes, colours, shape parameters, and stellar mass), we apply the techniques of Support Vector Machines, Classification Trees, Classification Trees with Random Forest (CTRF) and Neural Networks, and returning True Prediction Ratios (TPRs) of 75.8 per cent, 69.0 per cent, 76.2 per cent, and 76.0 per cent, respectively. Those occasions whereby all four algorithms agree with each other yet disagree with the visual classification (`unanimous disagreement') serves as a potential indicator of human error in classification, occurring in ˜ 9 per cent of ellipticals, ˜ 9 per cent of little blue spheroids, ˜ 14 per cent of early-type spirals, ˜ 21 per cent of intermediate-type spirals, and ˜ 4 per cent of late-type spirals and irregulars. We observe that the choice of parameters rather than that of algorithms is more crucial in determining classification accuracy. Due to its simplicity in formulation and implementation, we recommend the CTRF algorithm for classifying future galaxy data sets. Adopting the CTRF algorithm, the TPRs of the five galaxy types are : E, 70.1 per cent; LBS, 75.6 per cent; S0-Sa, 63.6 per cent; Sab-Scd, 56.4 per cent, and Sd-Irr, 88.9 per cent. Further, we train a binary classifier using this CTRF algorithm that divides galaxies into spheroid-dominated (E, LBS, and S0-Sa) and disc-dominated (Sab-Scd and Sd-Irr), achieving an overall accuracy of 89.8 per cent. This translates into an accuracy of 84.9 per cent for spheroid-dominated systems and 92.5 per cent for disc-dominated systems.
Douglas, P K; Harris, Sam; Yuille, Alan; Cohen, Mark S
2011-05-15
Machine learning (ML) has become a popular tool for mining functional neuroimaging data, and there are now hopes of performing such analyses efficiently in real-time. Towards this goal, we compared accuracy of six different ML algorithms applied to neuroimaging data of persons engaged in a bivariate task, asserting their belief or disbelief of a variety of propositional statements. We performed unsupervised dimension reduction and automated feature extraction using independent component (IC) analysis and extracted IC time courses. Optimization of classification hyperparameters across each classifier occurred prior to assessment. Maximum accuracy was achieved at 92% for Random Forest, followed by 91% for AdaBoost, 89% for Naïve Bayes, 87% for a J48 decision tree, 86% for K*, and 84% for support vector machine. For real-time decoding applications, finding a parsimonious subset of diagnostic ICs might be useful. We used a forward search technique to sequentially add ranked ICs to the feature subspace. For the current data set, we determined that approximately six ICs represented a meaningful basis set for classification. We then projected these six IC spatial maps forward onto a later scanning session within subject. We then applied the optimized ML algorithms to these new data instances, and found that classification accuracy results were reproducible. Additionally, we compared our classification method to our previously published general linear model results on this same data set. The highest ranked IC spatial maps show similarity to brain regions associated with contrasts for belief > disbelief, and disbelief < belief. Copyright © 2010 Elsevier Inc. All rights reserved.
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-01-01
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661
Kumar, Surendra; Ghosh, Subhojit; Tetarway, Suhash; Sinha, Rakesh Kumar
2015-07-01
In this study, the magnitude and spatial distribution of frequency spectrum in the resting electroencephalogram (EEG) were examined to address the problem of detecting alcoholism in the cerebral motor cortex. The EEG signals were recorded from chronic alcoholic conditions (n = 20) and the control group (n = 20). Data were taken from motor cortex region and divided into five sub-bands (delta, theta, alpha, beta-1 and beta-2). Three methodologies were adopted for feature extraction: (1) absolute power, (2) relative power and (3) peak power frequency. The dimension of the extracted features is reduced by linear discrimination analysis and classified by support vector machine (SVM) and fuzzy C-mean clustering. The maximum classification accuracy (88 %) with SVM clustering was achieved with the EEG spectral features with absolute power frequency on F4 channel. Among the bands, relatively higher classification accuracy was found over theta band and beta-2 band in most of the channels when computed with the EEG features of relative power. Electrodes wise CZ, C3 and P4 were having more alteration. Considering the good classification accuracy obtained by SVM with relative band power features in most of the EEG channels of motor cortex, it can be suggested that the noninvasive automated online diagnostic system for the chronic alcoholic condition can be developed with the help of EEG signals.
Emami, Zahra; Chau, Tom
2018-06-01
Brain-computer interfaces (BCIs) allow users to operate a device or application by means of cognitive activity. This technology will ultimately be used in real-world environments which include the presence of distractors. The purpose of the study was to determine the effect of visual distractors on BCI performance. Sixteen able-bodied participants underwent neurofeedback training to achieve motor imagery-guided BCI control in an online paradigm using electroencephalography (EEG) to measure neural signals. Participants then completed two sessions of the motor imagery EEG-BCI protocol in the presence of infrequent, small visual distractors. BCI performance was determined based on classification accuracy. The presence of distractors was found to affect motor imagery-specific patterns in mu and beta power. However, the distractors did not significantly affect the BCI classification accuracy; across participants, the mean classification accuracy was 81.5 ± 14% for non-distractor trials, and 78.3 ± 17% for distractor trials. This minimal consequence suggests that the BCI was robust to distractor effects, despite motor imagery-related brain activity being attenuated amid distractors. A BCI system that mitigates distraction-related effects may improve the ease of its use and ultimately facilitate the effective translation of the technology from the lab to the home. Copyright © 2018 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai
2017-01-01
Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed. PMID:28737705
Ma, Chao; Ouyang, Jihong; Chen, Hui-Ling; Zhao, Xue-Hua
2014-01-01
A novel hybrid method named SCFW-KELM, which integrates effective subtractive clustering features weighting and a fast classifier kernel-based extreme learning machine (KELM), has been introduced for the diagnosis of PD. In the proposed method, SCFW is used as a data preprocessing tool, which aims at decreasing the variance in features of the PD dataset, in order to further improve the diagnostic accuracy of the KELM classifier. The impact of the type of kernel functions on the performance of KELM has been investigated in detail. The efficiency and effectiveness of the proposed method have been rigorously evaluated against the PD dataset in terms of classification accuracy, sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), f-measure, and kappa statistics value. Experimental results have demonstrated that the proposed SCFW-KELM significantly outperforms SVM-based, KNN-based, and ELM-based approaches and other methods in the literature and achieved highest classification results reported so far via 10-fold cross validation scheme, with the classification accuracy of 99.49%, the sensitivity of 100%, the specificity of 99.39%, AUC of 99.69%, the f-measure value of 0.9964, and kappa value of 0.9867. Promisingly, the proposed method might serve as a new candidate of powerful methods for the diagnosis of PD with excellent performance.
Ma, Chao; Ouyang, Jihong; Chen, Hui-Ling; Zhao, Xue-Hua
2014-01-01
A novel hybrid method named SCFW-KELM, which integrates effective subtractive clustering features weighting and a fast classifier kernel-based extreme learning machine (KELM), has been introduced for the diagnosis of PD. In the proposed method, SCFW is used as a data preprocessing tool, which aims at decreasing the variance in features of the PD dataset, in order to further improve the diagnostic accuracy of the KELM classifier. The impact of the type of kernel functions on the performance of KELM has been investigated in detail. The efficiency and effectiveness of the proposed method have been rigorously evaluated against the PD dataset in terms of classification accuracy, sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), f-measure, and kappa statistics value. Experimental results have demonstrated that the proposed SCFW-KELM significantly outperforms SVM-based, KNN-based, and ELM-based approaches and other methods in the literature and achieved highest classification results reported so far via 10-fold cross validation scheme, with the classification accuracy of 99.49%, the sensitivity of 100%, the specificity of 99.39%, AUC of 99.69%, the f-measure value of 0.9964, and kappa value of 0.9867. Promisingly, the proposed method might serve as a new candidate of powerful methods for the diagnosis of PD with excellent performance. PMID:25484912
Bandeira Diniz, João Otávio; Bandeira Diniz, Pedro Henrique; Azevedo Valente, Thales Levi; Corrêa Silva, Aristófanes; de Paiva, Anselmo Cardoso; Gattass, Marcelo
2018-03-01
The processing of medical image is an important tool to assist in minimizing the degree of uncertainty of the specialist, while providing specialists with an additional source of detect and diagnosis information. Breast cancer is the most common type of cancer that affects the female population around the world. It is also the most deadly type of cancer among women. It is the second most common type of cancer among all others. The most common examination to diagnose breast cancer early is mammography. In the last decades, computational techniques have been developed with the purpose of automatically detecting structures that maybe associated with tumors in mammography examination. This work presents a computational methodology to automatically detection of mass regions in mammography by using a convolutional neural network. The materials used in this work is the DDSM database. The method proposed consists of two phases: training phase and test phase. The training phase has 2 main steps: (1) create a model to classify breast tissue into dense and non-dense (2) create a model to classify regions of breast into mass and non-mass. The test phase has 7 step: (1) preprocessing; (2) registration; (3) segmentation; (4) first reduction of false positives; (5) preprocessing of regions segmented; (6) density tissue classification (7) second reduction of false positives where regions will be classified into mass and non-mass. The proposed method achieved 95.6% of accuracy in classify non-dense breasts tissue and 97,72% accuracy in classify dense breasts. To detect regions of mass in non-dense breast, the method achieved a sensitivity value of 91.5%, and specificity value of 90.7%, with 91% accuracy. To detect regions in dense breasts, our method achieved 90.4% of sensitivity and 96.4% of specificity, with accuracy of 94.8%. According to the results achieved by CNN, we demonstrate the feasibility of using convolutional neural networks on medical image processing techniques for classification of breast tissue and mass detection. Copyright © 2018 Elsevier B.V. All rights reserved.
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment
ERIC Educational Resources Information Center
Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua
2012-01-01
This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
Keihaninejad, Shiva; Heckemann, Rolf A.; Gousias, Ioannis S.; Hajnal, Joseph V.; Duncan, John S.; Aljabar, Paul; Rueckert, Daniel; Hammers, Alexander
2012-01-01
Brain images contain information suitable for automatically sorting subjects into categories such as healthy controls and patients. We sought to identify morphometric criteria for distinguishing controls (n = 28) from patients with unilateral temporal lobe epilepsy (TLE), 60 with and 20 without hippocampal atrophy (TLE-HA and TLE-N, respectively), and for determining the presumed side of seizure onset. The framework employs multi-atlas segmentation to estimate the volumes of 83 brain structures. A kernel-based separability criterion was then used to identify structures whose volumes discriminate between the groups. Next, we applied support vector machines (SVM) to the selected set for classification on the basis of volumes. We also computed pairwise similarities between all subjects and used spectral analysis to convert these into per-subject features. SVM was again applied to these feature data. After training on a subgroup, all TLE-HA patients were correctly distinguished from controls, achieving an accuracy of 96 ± 2% in both classification schemes. For TLE-N patients, the accuracy was 86 ± 2% based on structural volumes and 91 ± 3% using spectral analysis. Structures discriminating between patients and controls were mainly localized ipsilaterally to the presumed seizure focus. For the TLE-HA group, they were mainly in the temporal lobe; for the TLE-N group they included orbitofrontal regions, as well as the ipsilateral substantia nigra. Correct lateralization of the presumed seizure onset zone was achieved using hippocampi and parahippocampal gyri in all TLE-HA patients using either classification scheme; in the TLE-N patients, lateralization was accurate based on structural volumes in 86 ± 4%, and in 94 ± 4% with the spectral analysis approach. Unilateral TLE has imaging features that can be identified automatically, even when they are invisible to human experts. Such morphometric image features may serve as classification and lateralization criteria. The technique also detects unsuspected distinguishing features like the substantia nigra, warranting further study. PMID:22523539
The effect of finite field size on classification and atmospheric correction
NASA Technical Reports Server (NTRS)
Kaufman, Y. J.; Fraser, R. S.
1981-01-01
The atmospheric effect on the upward radiance of sunlight scattered from the Earth-atmosphere system is strongly influenced by the contrasts between fields and their sizes. For a given atmospheric turbidity, the atmospheric effect on classification of surface features is much stronger for nonuniform surfaces than for uniform surfaces. Therefore, the classification accuracy of agricultural fields and urban areas is dependent not only on the optical characteristics of the atmosphere, but also on the size of the surface do not account for the nonuniformity of the surface have only a slight effect on the classification accuracy; in other cases the classification accuracy descreases. The radiances above finite fields were computed to simulate radiances measured by a satellite. A simulation case including 11 agricultural fields and four natural fields (water, soil, savanah, and forest) was used to test the effect of the size of the background reflectance and the optical thickness of the atmosphere on classification accuracy. It is concluded that new atmospheric correction methods, which take into account the finite size of the fields, have to be developed to improve significantly the classification accuracy.
A contour-based shape descriptor for biomedical image classification and retrieval
NASA Astrophysics Data System (ADS)
You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
[An object-based information extraction technology for dominant tree species group types].
Tian, Tian; Fan, Wen-yi; Lu, Wei; Xiao, Xiang
2015-06-01
Information extraction for dominant tree group types is difficult in remote sensing image classification, howevers, the object-oriented classification method using high spatial resolution remote sensing data is a new method to realize the accurate type information extraction. In this paper, taking the Jiangle Forest Farm in Fujian Province as the research area, based on the Quickbird image data in 2013, the object-oriented method was adopted to identify the farmland, shrub-herbaceous plant, young afforested land, Pinus massoniana, Cunninghamia lanceolata and broad-leave tree types. Three types of classification factors including spectral, texture, and different vegetation indices were used to establish a class hierarchy. According to the different levels, membership functions and the decision tree classification rules were adopted. The results showed that the method based on the object-oriented method by using texture, spectrum and the vegetation indices achieved the classification accuracy of 91.3%, which was increased by 5.7% compared with that by only using the texture and spectrum.
Zhang, Y N
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed.
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed. PMID:29075547
A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method.
Liu, Xiao; Wang, Xiaoli; Su, Qiang; Zhang, Mo; Zhu, Yanhong; Wang, Qiugen; Wang, Qian
2017-01-01
Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for experiments. A maximum classification accuracy of 92.59% was achieved according to a jackknife cross-validation scheme. The results demonstrate that the performance of the proposed system is superior to the performances of previously reported classification techniques.
Yang, Xiaoyan; Chen, Longgao; Li, Yingkui; Xi, Wenjia; Chen, Longqian
2015-07-01
Land use/land cover (LULC) inventory provides an important dataset in regional planning and environmental assessment. To efficiently obtain the LULC inventory, we compared the LULC classifications based on single satellite imagery with a rule-based classification based on multi-seasonal imagery in Lianyungang City, a coastal city in China, using CBERS-02 (the 2nd China-Brazil Environmental Resource Satellites) images. The overall accuracies of the classification based on single imagery are 78.9, 82.8, and 82.0% in winter, early summer, and autumn, respectively. The rule-based classification improves the accuracy to 87.9% (kappa 0.85), suggesting that combining multi-seasonal images can considerably improve the classification accuracy over any single image-based classification. This method could also be used to analyze seasonal changes of LULC types, especially for those associated with tidal changes in coastal areas. The distribution and inventory of LULC types with an overall accuracy of 87.9% and a spatial resolution of 19.5 m can assist regional planning and environmental assessment efficiently in Lianyungang City. This rule-based classification provides a guidance to improve accuracy for coastal areas with distinct LULC temporal spectral features.
Markerless gating for lung cancer radiotherapy based on machine learning techniques
NASA Astrophysics Data System (ADS)
Lin, Tong; Li, Ruijiang; Tang, Xiaoli; Dy, Jennifer G.; Jiang, Steve B.
2009-03-01
In lung cancer radiotherapy, radiation to a mobile target can be delivered by respiratory gating, for which we need to know whether the target is inside or outside a predefined gating window at any time point during the treatment. This can be achieved by tracking one or more fiducial markers implanted inside or near the target, either fluoroscopically or electromagnetically. However, the clinical implementation of marker tracking is limited for lung cancer radiotherapy mainly due to the risk of pneumothorax. Therefore, gating without implanted fiducial markers is a promising clinical direction. We have developed several template-matching methods for fluoroscopic marker-less gating. Recently, we have modeled the gating problem as a binary pattern classification problem, in which principal component analysis (PCA) and support vector machine (SVM) are combined to perform the classification task. Following the same framework, we investigated different combinations of dimensionality reduction techniques (PCA and four nonlinear manifold learning methods) and two machine learning classification methods (artificial neural networks—ANN and SVM). Performance was evaluated on ten fluoroscopic image sequences of nine lung cancer patients. We found that among all combinations of dimensionality reduction techniques and classification methods, PCA combined with either ANN or SVM achieved a better performance than the other nonlinear manifold learning methods. ANN when combined with PCA achieves a better performance than SVM in terms of classification accuracy and recall rate, although the target coverage is similar for the two classification methods. Furthermore, the running time for both ANN and SVM with PCA is within tolerance for real-time applications. Overall, ANN combined with PCA is a better candidate than other combinations we investigated in this work for real-time gated radiotherapy.
Motion data classification on the basis of dynamic time warping with a cloud point distance measure
NASA Astrophysics Data System (ADS)
Switonski, Adam; Josinski, Henryk; Zghidi, Hafedh; Wojciechowski, Konrad
2016-06-01
The paper deals with the problem of classification of model free motion data. The nearest neighbors classifier which is based on comparison performed by Dynamic Time Warping transform with cloud point distance measure is proposed. The classification utilizes both specific gait features reflected by a movements of subsequent skeleton joints and anthropometric data. To validate proposed approach human gait identification challenge problem is taken into consideration. The motion capture database containing data of 30 different humans collected in Human Motion Laboratory of Polish-Japanese Academy of Information Technology is used. The achieved results are satisfactory, the obtained accuracy of human recognition exceeds 90%. What is more, the applied cloud point distance measure does not depend on calibration process of motion capture system which results in reliable validation.
Human Supervision of Time Critical Control Systems. Addendum
2010-02-26
signals such as electroencephalogram (EEG) and electrooculography ( EOG ). Current research has demonstrated these signals ’ ability to respond to changing...relationships often present in EEG/ EOG data; they routinely achieve classification accuracy greater than 80%. However, the discrete output of these...present data there were seven EEG and EOG signals recorded, thus, ICA assumes each were a mixture of seven independent components (Stone, 2002). Some
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classifying patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
Muñoz-Ruiz, Miguel Ángel; Hartikainen, Päivi; Koikkalainen, Juha; Wolz, Robin; Julkunen, Valtteri; Niskanen, Eini; Herukka, Sanna-Kaisa; Kivipelto, Miia; Vanninen, Ritva; Rueckert, Daniel; Liu, Yawu; Lötjönen, Jyrki; Soininen, Hilkka
2012-01-01
Background MRI is an important clinical tool for diagnosing dementia-like diseases such as Frontemporal Dementia (FTD). However there is a need to develop more accurate and standardized MRI analysis methods. Objective To compare FTD with Alzheimer’s Disease (AD) and Mild Cognitive Impairment (MCI) with three automatic MRI analysis methods - Hippocampal Volumetry (HV), Tensor-based Morphometry (TBM) and Voxel-based Morphometry (VBM), in specific regions of interest in order to determine the highest classification accuracy. Methods Thirty-seven patients with FTD, 46 patients with AD, 26 control subjects, 16 patients with progressive MCI (PMCI) and 48 patients with stable MCI (SMCI) were examined with HV, TBM for shape change, and VBM for gray matter density. We calculated the Correct Classification Rate (CCR), sensitivity (SS) and specificity (SP) between the study groups. Results We found unequivocal results differentiating controls from FTD with HV (hippocampus left side) (CCR = 0.83; SS = 0.84; SP = 0.80), with TBM (hippocampus and amygdala (CCR = 0.80/SS = 0.71/SP = 0.94), and with VBM (all the regions studied, especially in lateral ventricle frontal horn, central part and occipital horn) (CCR = 0.87/SS = 0.81/SP = 0.96). VBM achieved the highest accuracy in differentiating AD and FTD (CCR = 0.72/SS = 0.67/SP = 0.76), particularly in lateral ventricle (frontal horn, central part and occipital horn) (CCR = 0.73), whereas TBM in superior frontal gyrus also achieved a high accuracy (CCR = 0.71/SS = 0.68/SP = 0.73). TBM resulted in low accuracy (CCR = 0.62) in the differentiation of AD from FTD using all regions of interest, with similar results for HV (CCR = 0.55). Conclusion Hippocampal atrophy is present not only in AD but also in FTD. Of the methods used, VBM achieved the highest accuracy in its ability to differentiate between FTD and AD. PMID:23285078
Texture operator for snow particle classification into snowflake and graupel
NASA Astrophysics Data System (ADS)
Nurzyńska, Karolina; Kubo, Mamoru; Muramoto, Ken-ichiro
2012-11-01
In order to improve the estimation of precipitation, the coefficients of Z-R relation should be determined for each snow type. Therefore, it is necessary to identify the type of falling snow. Consequently, this research addresses a problem of snow particle classification into snowflake and graupel in an automatic manner (as these types are the most common in the study region). Having correctly classified precipitation events, it is believed that it will be possible to estimate the related parameters accurately. The automatic classification system presented here describes the images with texture operators. Some of them are well-known from the literature: first order features, co-occurrence matrix, grey-tone difference matrix, run length matrix, and local binary pattern, but also a novel approach to design simple local statistic operators is introduced. In this work the following texture operators are defined: mean histogram, min-max histogram, and mean-variance histogram. Moreover, building a feature vector, which is based on the structure created in many from mentioned algorithms is also suggested. For classification, the k-nearest neighbourhood classifier was applied. The results showed that it is possible to achieve correct classification accuracy above 80% by most of the techniques. The best result of 86.06%, was achieved for operator built from a structure achieved in the middle stage of the co-occurrence matrix calculation. Next, it was noticed that describing an image with two texture operators does not improve the classification results considerably. In the best case the correct classification efficiency was 87.89% for a pair of texture operators created from local binary pattern and structure build in a middle stage of grey-tone difference matrix calculation. This also suggests that the information gathered by each texture operator is redundant. Therefore, the principal component analysis was applied in order to remove the unnecessary information and additionally reduce the length of the feature vectors. The improvement of the correct classification efficiency for up to 100% is possible for methods: min-max histogram, texture operator built from structure achieved in a middle stage of co-occurrence matrix calculation, texture operator built from a structure achieved in a middle stage of grey-tone difference matrix creation, and texture operator based on a histogram, when the feature vector stores 99% of initial information.
NASA Technical Reports Server (NTRS)
Cibula, William G.; Nyquist, Maurice O.
1987-01-01
An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classifications useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. In an attempt to refine the level of classification, a geographic information system (GIS) approach was employed. Topographic data and watershed boundaries (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, and these met the project objective. This classification could now become inputs into a GIS system to help provide answers to park management coupled with other ancillary data programs such as fire management.
NASA Astrophysics Data System (ADS)
Gao, Yan; Marpu, Prashanth; Morales Manila, Luis M.
2014-11-01
This paper assesses the suitability of 8-band Worldview-2 (WV2) satellite data and object-based random forest algorithm for the classification of avocado growth stages in Mexico. We tested both pixel-based with minimum distance (MD) and maximum likelihood (MLC) and object-based with Random Forest (RF) algorithm for this task. Training samples and verification data were selected by visual interpreting the WV2 images for seven thematic classes: fully grown, middle stage, and early stage of avocado crops, bare land, two types of natural forests, and water body. To examine the contribution of the four new spectral bands of WV2 sensor, all the tested classifications were carried out with and without the four new spectral bands. Classification accuracy assessment results show that object-based classification with RF algorithm obtained higher overall higher accuracy (93.06%) than pixel-based MD (69.37%) and MLC (64.03%) method. For both pixel-based and object-based methods, the classifications with the four new spectral bands (overall accuracy obtained higher accuracy than those without: overall accuracy of object-based RF classification with vs without: 93.06% vs 83.59%, pixel-based MD: 69.37% vs 67.2%, pixel-based MLC: 64.03% vs 36.05%, suggesting that the four new spectral bands in WV2 sensor contributed to the increase of the classification accuracy.
Snow mapping and land use studies in Switzerland
NASA Technical Reports Server (NTRS)
Haefner, H. (Principal Investigator)
1977-01-01
The author has identified the following significant results. A system was developed for operational snow and land use mapping, based on a supervised classification method using various classification algorithms and representation of the results in maplike form on color film with a photomation system. Land use mapping, under European conditions, was achieved with a stepwise linear discriminant analysis by using additional ratio variables. On fall images, signatures of built-up areas were often not separable from wetlands. Two different methods were tested to correlate the size of settlements and the population with an accuracy for the densely populated Swiss Plateau between +2 or -12%.
Landsat Thematic Mapper studies of land cover spatial variability related to hydrology
NASA Technical Reports Server (NTRS)
Wharton, S.; Ormsby, J.; Salomonson, V.; Mulligan, P.
1984-01-01
Past accomplishments involving remote sensing based land-cover analysis for hydrologic applications are reviewed. Ongoing research in exploiting the increased spatial, radiometric, and spectral capabilities afforded by the TM on Landsats 4 and 5 is considered. Specific studies to compare MSS and TM for urbanizing watersheds, wetlands, and floodplain mapping situations show that only a modest improvement in classification accuracy is achieved via statistical per pixel multispectral classifiers. The limitations of current approaches to multispectral classification are illustrated. The objectives, background, and progress in the development of an alternative analysis approach for defining inputs to urban hydrologic models using TM are discussed.
Kraken: ultrafast metagenomic sequence classification using exact alignments
2014-01-01
Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/. PMID:24580807
myBlackBox: Blackbox Mobile Cloud Systems for Personalized Unusual Event Detection.
Ahn, Junho; Han, Richard
2016-05-23
We demonstrate the feasibility of constructing a novel and practical real-world mobile cloud system, called myBlackBox, that efficiently fuses multimodal smartphone sensor data to identify and log unusual personal events in mobile users' daily lives. The system incorporates a hybrid architectural design that combines unsupervised classification of audio, accelerometer and location data with supervised joint fusion classification to achieve high accuracy, customization, convenience and scalability. We show the feasibility of myBlackBox by implementing and evaluating this end-to-end system that combines Android smartphones with cloud servers, deployed for 15 users over a one-month period.
myBlackBox: Blackbox Mobile Cloud Systems for Personalized Unusual Event Detection
Ahn, Junho; Han, Richard
2016-01-01
We demonstrate the feasibility of constructing a novel and practical real-world mobile cloud system, called myBlackBox, that efficiently fuses multimodal smartphone sensor data to identify and log unusual personal events in mobile users’ daily lives. The system incorporates a hybrid architectural design that combines unsupervised classification of audio, accelerometer and location data with supervised joint fusion classification to achieve high accuracy, customization, convenience and scalability. We show the feasibility of myBlackBox by implementing and evaluating this end-to-end system that combines Android smartphones with cloud servers, deployed for 15 users over a one-month period. PMID:27223292
NASA Astrophysics Data System (ADS)
Starkey, Andrew; Usman Ahmad, Aliyu; Hamdoun, Hassan
2017-10-01
This paper investigates the application of a novel method for classification called Feature Weighted Self Organizing Map (FWSOM) that analyses the topology information of a converged standard Self Organizing Map (SOM) to automatically guide the selection of important inputs during training for improved classification of data with redundant inputs, examined against two traditional approaches namely neural networks and Support Vector Machines (SVM) for the classification of EEG data as presented in previous work. In particular, the novel method looks to identify the features that are important for classification automatically, and in this way the important features can be used to improve the diagnostic ability of any of the above methods. The paper presents the results and shows how the automated identification of the important features successfully identified the important features in the dataset and how this results in an improvement of the classification results for all methods apart from linear discriminatory methods which cannot separate the underlying nonlinear relationship in the data. The FWSOM in addition to achieving higher classification accuracy has given insights into what features are important in the classification of each class (left and right-hand movements), and these are corroborated by already published work in this area.
Geng, Xiangfei; Xu, Junhai; Liu, Baolin; Shi, Yonggang
2018-01-01
Major depressive disorder (MDD) is a mental disorder characterized by at least 2 weeks of low mood, which is present across most situations. Diagnosis of MDD using rest-state functional magnetic resonance imaging (fMRI) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability. To our best knowledge, no studies aim at classification with effective connectivity and functional connectivity measures between MDD patients and healthy controls. In this study, we performed a data-driving classification analysis using the whole brain connectivity measures which included the functional connectivity from two brain templates and effective connectivity measures created by the default mode network (DMN), dorsal attention network (DAN), frontal-parietal network (FPN), and silence network (SN). Effective connectivity measures were extracted using spectral Dynamic Causal Modeling (spDCM) and transformed into a vectorial feature space. Linear Support Vector Machine (linear SVM), non-linear SVM, k-Nearest Neighbor (KNN), and Logistic Regression (LR) were used as the classifiers to identify the differences between MDD patients and healthy controls. Our results showed that the highest accuracy achieved 91.67% (p < 0.0001) when using 19 effective connections and 89.36% when using 6,650 functional connections. The functional connections with high discriminative power were mainly located within or across the whole brain resting-state networks while the discriminative effective connections located in several specific regions, such as posterior cingulate cortex (PCC), ventromedial prefrontal cortex (vmPFC), dorsal cingulate cortex (dACC), and inferior parietal lobes (IPL). To further compare the discriminative power of functional connections and effective connections, a classification analysis only using the functional connections from those four networks was conducted and the highest accuracy achieved 78.33% (p < 0.0001). Our study demonstrated that the effective connectivity measures might play a more important role than functional connectivity in exploring the alterations between patients and health controls and afford a better mechanistic interpretability. Moreover, our results showed a diagnostic potential of the effective connectivity for the diagnosis of MDD patients with high accuracies allowing for earlier prevention or intervention. PMID:29515348
Efficient use of unlabeled data for protein sequence classification: a comparative study.
Kuksa, Pavel; Huang, Pai-Hsi; Pavlovic, Vladimir
2009-04-29
Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved accuracy if this data is supplemented with protein sequences that lack any class tags-the unlabeled data. In this study, we present a principled and biologically motivated computational framework that more effectively exploits the unlabeled data by only using the sequence regions that are more likely to be biologically relevant for better prediction accuracy. As overly-represented sequences in large uncurated databases may bias the estimation of computational models that rely on unlabeled data, we also propose a method to remove this bias and improve performance of the resulting classifiers. Combined with state-of-the-art string kernels, our proposed computational framework achieves very accurate semi-supervised protein remote fold and homology detection on three large unlabeled databases. It outperforms current state-of-the-art methods and exhibits significant reduction in running time. The unlabeled sequences used under the semi-supervised setting resemble the unpolished gemstones; when used as-is, they may carry unnecessary features and hence compromise the classification accuracy but once cut and polished, they improve the accuracy of the classifiers considerably.
Feature Selection Methods for Zero-Shot Learning of Neural Activity.
Caceres, Carlos A; Roos, Matthew J; Rupp, Kyle M; Milsap, Griffin; Crone, Nathan E; Wolmetz, Michael E; Ratto, Christopher R
2017-01-01
Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.
NASA Astrophysics Data System (ADS)
Guo, Yiqing; Jia, Xiuping; Paull, David
2018-06-01
The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, a SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is firstly predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with those obtained without the assistance from previous images. These results demonstrate that the leverage of a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.
Corn and soybean Landsat MSS classification performance as a function of scene characteristics
NASA Technical Reports Server (NTRS)
Batista, G. T.; Hixson, M. M.; Bauer, M. E.
1982-01-01
In order to fully utilize remote sensing to inventory crop production, it is important to identify the factors that affect the accuracy of Landsat classifications. The objective of this study was to investigate the effect of scene characteristics involving crop, soil, and weather variables on the accuracy of Landsat classifications of corn and soybeans. Segments sampling the U.S. Corn Belt were classified using a Gaussian maximum likelihood classifier on multitemporally registered data from two key acquisition periods. Field size had a strong effect on classification accuracy with small fields tending to have low accuracies even when the effect of mixed pixels was eliminated. Other scene characteristics accounting for variability in classification accuracy included proportions of corn and soybeans, crop diversity index, proportion of all field crops, soil drainage, slope, soil order, long-term average soybean yield, maximum yield, relative position of the segment in the Corn Belt, weather, and crop development stage.
Multimodal Classification of Alzheimer’s Disease and Mild Cognitive Impairment
Zhang, Daoqiang; Wang, Yaping; Zhou, Luping; Yuan, Hong; Shen, Dinggang
2011-01-01
Effective and accurate diagnosis of Alzheimer’s disease (AD), as well as its prodromal stage (i.e., mild cognitive impairment (MCI)), has attracted more and more attentions recently. So far, multiple biomarkers have been shown sensitive to the diagnosis of AD and MCI, i.e., structural MR imaging (MRI) for brain atrophy measurement, functional imaging (e.g., FDG-PET) for hypometabolism quantification, and cerebrospinal fluid (CSF) for quantification of specific proteins. However, most existing research focuses on only a single modality of biomarkers for diagnosis of AD and MCI, although recent studies have shown that different biomarkers may provide complementary information for diagnosis of AD and MCI. In this paper, we propose to combine three modalities of biomarkers, i.e., MRI, FDG-PET, and CSF biomarkers, to discriminate between AD (or MCI) and healthy controls, using a kernel combination method. Specifically, ADNI baseline MRI, FDG-PET, and CSF data from 51 AD patients, 99 MCI patients (including 43 MCI converters who had converted to AD within 18 months and 56 MCI non-converters who had not converted to AD within 18 months), and 52 healthy controls are used for development and validation of our proposed multimodal classification method. In particular, for each MR or FDG-PET image, 93 volumetric features are extracted from the 93 regions of interest (ROIs), automatically labeled by an atlas warping algorithm. For CSF biomarkers, their original values are directly used as features. Then, a linear support vector machine (SVM) is adopted to evaluate the classification accuracy, using a 10-fold cross-validation. As a result, for classifying AD from healthy controls, we achieve a classification accuracy of 93.2% (with a sensitivity of 93% and a specificity of 93.3%) when combining all three modalities of biomarkers, and only 86.5% when using even the best individual modality of biomarkers. Similarly, for classifying MCI from healthy controls, we achieve a classification accuracy of 76.4% (with a sensitivity of 81.8% and a specificity of 66%) for our combined method, and only 72% even using the best individual modality of biomarkers. Further analysis on MCI sensitivity of our combined method indicates that 91.5% of MCI converters and 73.4% of MCI non-converters are correctly classified. Moreover, we also evaluate the classification performance when employing a feature selection method to select the most discriminative MR and FDG-PET features. Again, our combined method shows considerably better performance, compared to the case of using an individual modality of biomarkers. PMID:21236349
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
The accuracy of RS classification based on SVM which is developed from statistical learning theory is high under small number of train samples, which results in satisfaction of classification on RS using SVM methods. The traditional RS classification method combines visual interpretation with computer classification. The accuracy of the RS classification, however, is improved a lot based on SVM method, because it saves much labor and time which is used to interpret images and collect training samples. Kernel functions play an important part in the SVM algorithm. It uses improved compound kernel function and therefore has a higher accuracy of classification on RS images. Moreover, compound kernel improves the generalization and learning ability of the kernel.
Nationwide forestry applications program. Analysis of forest classification accuracy
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Mead, R. A.; Oderwald, R. G.; Heinen, J. (Principal Investigator)
1981-01-01
The development of LANDSAT classification accuracy assessment techniques, and of a computerized system for assessing wildlife habitat from land cover maps are considered. A literature review on accuracy assessment techniques and an explanation for the techniques development under both projects are included along with listings of the computer programs. The presentations and discussions at the National Working Conference on LANDSAT Classification Accuracy are summarized. Two symposium papers which were published on the results of this project are appended.
Vessel Classification in Cosmo-Skymed SAR Data Using Hierarchical Feature Selection
NASA Astrophysics Data System (ADS)
Makedonas, A.; Theoharatos, C.; Tsagaris, V.; Anastasopoulos, V.; Costicoglou, S.
2015-04-01
SAR based ship detection and classification are important elements of maritime monitoring applications. Recently, high-resolution SAR data have opened new possibilities to researchers for achieving improved classification results. In this work, a hierarchical vessel classification procedure is presented based on a robust feature extraction and selection scheme that utilizes scale, shape and texture features in a hierarchical way. Initially, different types of feature extraction algorithms are implemented in order to form the utilized feature pool, able to represent the structure, material, orientation and other vessel type characteristics. A two-stage hierarchical feature selection algorithm is utilized next in order to be able to discriminate effectively civilian vessels into three distinct types, in COSMO-SkyMed SAR images: cargos, small ships and tankers. In our analysis, scale and shape features are utilized in order to discriminate smaller types of vessels present in the available SAR data, or shape specific vessels. Then, the most informative texture and intensity features are incorporated in order to be able to better distinguish the civilian types with high accuracy. A feature selection procedure that utilizes heuristic measures based on features' statistical characteristics, followed by an exhaustive research with feature sets formed by the most qualified features is carried out, in order to discriminate the most appropriate combination of features for the final classification. In our analysis, five COSMO-SkyMed SAR data with 2.2m x 2.2m resolution were used to analyse the detailed characteristics of these types of ships. A total of 111 ships with available AIS data were used in the classification process. The experimental results show that this method has good performance in ship classification, with an overall accuracy reaching 83%. Further investigation of additional features and proper feature selection is currently in progress.
Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics
Belo, David; Gamboa, Hugo
2017-01-01
The paper presents results of machine learning approach accuracy applied analysis of cardiac activity. The study evaluates the diagnostics possibilities of the arterial hypertension by means of the short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from the arterial hypertension of II-III degree. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis, decision trees, and naive Bayes classifier. Moreover, in the study, different methods of feature extraction are analyzed: statistical, spectral, wavelet, and multifractal. All in all, 53 features were investigated. Investigation results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of noncorrelated feature set search achieved higher results than data set based on the principal components. PMID:28831239
NASA Astrophysics Data System (ADS)
Schudlo, Larissa C.; Chau, Tom
2015-12-01
Objective. The majority of near-infrared spectroscopy (NIRS) brain-computer interface (BCI) studies have investigated binary classification problems. Limited work has considered differentiation of more than two mental states, or multi-class differentiation of higher-level cognitive tasks using measurements outside of the anterior prefrontal cortex. Improvements in accuracies are needed to deliver effective communication with a multi-class NIRS system. We investigated the feasibility of a ternary NIRS-BCI that supports mental states corresponding to verbal fluency task (VFT) performance, Stroop task performance, and unconstrained rest using prefrontal and parietal measurements. Approach. Prefrontal and parietal NIRS signals were acquired from 11 able-bodied adults during rest and performance of the VFT or Stroop task. Classification was performed offline using bagging with a linear discriminant base classifier trained on a 10 dimensional feature set. Main results. VFT, Stroop task and rest were classified at an average accuracy of 71.7% ± 7.9%. The ternary classification system provided a statistically significant improvement in information transfer rate relative to a binary system controlled by either mental task (0.87 ± 0.35 bits/min versus 0.73 ± 0.24 bits/min). Significance. These results suggest that effective communication can be achieved with a ternary NIRS-BCI that supports VFT, Stroop task and rest via measurements from the frontal and parietal cortices. Further development of such a system is warranted. Accurate ternary classification can enhance communication rates offered by NIRS-BCIs, improving the practicality of this technology.
A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.
Sarrouti, Mourad; Ouatik El Alaoui, Said
2017-05-18
Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which consists of assigning a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of the four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary. In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features for machine-learning algorithms. Finally, the class label is predicted using the trained classifiers. Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significant improved performance when compared to four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as features' provider of support vector machine (SVM) lead to the highest accuracy of 89.40 %. The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.
Breast cancer Ki67 expression preoperative discrimination by DCE-MRI radiomics features
NASA Astrophysics Data System (ADS)
Ma, Wenjuan; Ji, Yu; Qin, Zhuanping; Guo, Xinpeng; Jian, Xiqi; Liu, Peifang
2018-02-01
To investigate whether quantitative radiomics features extracted from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) are associated with Ki67 expression of breast cancer. In this institutional review board approved retrospective study, we collected 377 cases Chinese women who were diagnosed with invasive breast cancer in 2015. This cohort included 53 low-Ki67 expression (Ki67 proliferation index less than 14%) and 324 cases with high-Ki67 expression (Ki67 proliferation index more than 14%). A binary-classification of low- vs. high- Ki67 expression was performed. A set of 52 quantitative radiomics features, including morphological, gray scale statistic, and texture features, were extracted from the segmented lesion area. Three most common machine learning classification methods, including Naive Bayes, k-Nearest Neighbor and support vector machine with Gaussian kernel, were employed for the classification and the least absolute shrink age and selection operator (LASSO) method was used to select most predictive features set for the classifiers. Classification performance was evaluated by the area under receiver operating characteristic curve (AUC), accuracy, sensitivity and specificity. The model that used Naive Bayes classification method achieved the best performance than the other two methods, yielding 0.773 AUC value, 0.757 accuracy, 0.777 sensitivity and 0.769 specificity. Our study showed that quantitative radiomics imaging features of breast tumor extracted from DCE-MRI are associated with breast cancer Ki67 expression. Future larger studies are needed in order to further evaluate the findings.
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
Hyperspectral recognition of processing tomato early blight based on GA and SVM
NASA Astrophysics Data System (ADS)
Yin, Xiaojun; Zhao, SiFeng
2013-03-01
Processing tomato early blight seriously affect the yield and quality of its.Determine the leaves spectrum of different disease severity level of processing tomato early blight.We take the sensitive bands of processing tomato early blight as support vector machine input vector.Through the genetic algorithm(GA) to optimize the parameters of SVM, We could recognize different disease severity level of processing tomato early blight.The result show:the sensitive bands of different disease severity levels of processing tomato early blight is 628-643nm and 689-692nm.The sensitive bands are as the GA and SVM input vector.We get the best penalty parameters is 0.129 and kernel function parameters is 3.479.We make classification training and testing by polynomial nuclear,radial basis function nuclear,Sigmoid nuclear.The best classification model is the radial basis function nuclear of SVM. Training accuracy is 84.615%,Testing accuracy is 80.681%.It is combined GA and SVM to achieve multi-classification of processing tomato early blight.It is provided the technical support of prediction processing tomato early blight occurrence, development and diffusion rule in large areas.
Chen, Zhiru; Hong, Wenxue
2016-02-01
Considering the low accuracy of prediction in the positive samples and poor overall classification effects caused by unbalanced sample data of MicroRNA (miRNA) target, we proposes a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm in this paper, an under-sampling based on the ensemble learning algorithm. The algorithm adopts SVM as learning algorithm and AdaBoost as integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates the abnormal ones in negative samples with robust sample weights smoothing mechanism so as to avoid over-learning. Finally, the prediction of miRNA target integrated classifier is achieved with the combination of multiple weak classifiers through the voting mechanism. The experiment revealed that the SVM-IUSW, compared with other algorithms on unbalanced dataset collection, could not only improve the accuracy of positive targets and the overall effect of classification, but also enhance the generalization ability of miRNA target classifier.
Skin Lesion Analysis towards Melanoma Detection Using Deep Learning Network.
Li, Yuexiang; Shen, Linlin
2018-02-11
Skin lesions are a severe disease globally. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, the accurate recognition of melanoma is extremely challenging due to the following reasons: low contrast between lesions and skin, visual similarity between melanoma and non-melanoma lesions, etc. Hence, reliable automatic detection of skin tumors is very useful to increase the accuracy and efficiency of pathologists. In this paper, we proposed two deep learning methods to address three main tasks emerging in the area of skin lesion image processing, i.e., lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straight-forward CNN is proposed for the dermoscopic feature extraction task. The proposed deep learning frameworks were evaluated on the ISIC 2017 dataset. Experimental results show the promising accuracies of our frameworks, i.e., 0.753 for task 1, 0.848 for task 2 and 0.912 for task 3 were achieved.
A novel application of deep learning for single-lead ECG classification.
Mathews, Sherin M; Kambhamettu, Chandra; Barner, Kenneth E
2018-06-04
Detecting and classifying cardiac arrhythmias is critical to the diagnosis of patients with cardiac abnormalities. In this paper, a novel approach based on deep learning methodology is proposed for the classification of single-lead electrocardiogram (ECG) signals. We demonstrate the application of the Restricted Boltzmann Machine (RBM) and deep belief networks (DBN) for ECG classification following detection of ventricular and supraventricular heartbeats using single-lead ECG. The effectiveness of this proposed algorithm is illustrated using real ECG signals from the widely-used MIT-BIH database. Simulation results demonstrate that with a suitable choice of parameters, RBM and DBN can achieve high average recognition accuracies of ventricular ectopic beats (93.63%) and of supraventricular ectopic beats (95.57%) at a low sampling rate of 114 Hz. Experimental results indicate that classifiers built into this deep learning-based framework achieved state-of-the art performance models at lower sampling rates and simple features when compared to traditional methods. Further, employing features extracted at a sampling rate of 114 Hz when combined with deep learning provided enough discriminatory power for the classification task. This performance is comparable to that of traditional methods and uses a much lower sampling rate and simpler features. Thus, our proposed deep neural network algorithm demonstrates that deep learning-based methods offer accurate ECG classification and could potentially be extended to other physiological signal classifications, such as those in arterial blood pressure (ABP), nerve conduction (EMG), and heart rate variability (HRV) studies. Copyright © 2018. Published by Elsevier Ltd.
Multi-region analysis of longitudinal FDG-PET for the classification of Alzheimer’s disease
Gray, Katherine R.; Wolz, Robin; Heckemann, Rolf A.; Aljabar, Paul; Hammers, Alexander; Rueckert, Daniel
2012-01-01
Imaging biomarkers for Alzheimer’s disease are desirable for improved diagnosis and monitoring, as well as drug discovery. Automated image-based classification of individual patients could provide valuable diagnostic support for clinicians, when considered alongside cognitive assessment scores. We investigate the value of combining cross-sectional and longitudinal multi-region FDG-PET information for classification, using clinical and imaging data from the Alzheimer’s Disease Neuroimaging Initiative. Whole-brain segmentations into 83 anatomically defined regions were automatically generated for baseline and 12-month FDG-PET images. Regional signal intensities were extracted at each timepoint, as well as changes in signal intensity over the follow-up period. Features were provided to a support vector machine classifier. By combining 12-month signal intensities and changes over 12 months, we achieve significantly increased classification performance compared with using any of the three feature sets independently. Based on this combined feature set, we report classification accuracies of 88% between patients with Alzheimer’s disease and elderly healthy controls, and 65% between patients with stable mild cognitive impairment and those who subsequently progressed to Alzheimer’s disease. We demonstrate that information extracted from serial FDG-PET through regional analysis can be used to achieve state-of-the-art classification of diagnostic groups in a realistic multi-centre setting. This finding may be usefully applied in the diagnosis of Alzheimer’s disease, predicting disease course in individuals with mild cognitive impairment, and in the selection of participants for clinical trials. PMID:22236449
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306
Deep-learning-based classification of FDG-PET data for Alzheimer's disease categories
NASA Astrophysics Data System (ADS)
Singh, Shibani; Srivastava, Anant; Mi, Liang; Caselli, Richard J.; Chen, Kewei; Goradia, Dhruman; Reiman, Eric M.; Wang, Yalin
2017-11-01
Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even on presymptomatic Alzheimer's disease (AD) patients. PET scans provide functional information that is unique and unavailable using other types of imaging. However, the computational efficacy of FDG-PET data alone, for the classification of various Alzheimers Diagnostic categories, has not been well studied. This motivates us to correctly discriminate various AD Diagnostic categories using FDG-PET data. Deep learning has improved state-of-the-art classification accuracies in the areas of speech, signal, image, video, text mining and recognition. We propose novel methods that involve probabilistic principal component analysis on max-pooled data and mean-pooled data for dimensionality reduction, and multilayer feed forward neural network which performs binary classification. Our experimental dataset consists of baseline data of subjects including 186 cognitively unimpaired (CU) subjects, 336 mild cognitive impairment (MCI) subjects with 158 Late MCI and 178 Early MCI, and 146 AD patients from Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We measured F1-measure, precision, recall, negative and positive predictive values with a 10-fold cross validation scheme. Our results indicate that our designed classifiers achieve competitive results while max pooling achieves better classification performance compared to mean-pooled features. Our deep model based research may advance FDG-PET analysis by demonstrating their potential as an effective imaging biomarker of AD.
NASA Astrophysics Data System (ADS)
Lee, Haeil; Lee, Hansang; Park, Minseok; Kim, Junmo
2017-03-01
Lung cancer is the most common cause of cancer-related death. To diagnose lung cancers in early stages, numerous studies and approaches have been developed for cancer screening with computed tomography (CT) imaging. In recent years, convolutional neural networks (CNN) have become one of the most common and reliable techniques in computer aided detection (CADe) and diagnosis (CADx) by achieving state-of-the-art-level performances for various tasks. In this study, we propose a CNN classification system for false positive reduction of initially detected lung nodule candidates. First, image patches of lung nodule candidates are extracted from CT scans to train a CNN classifier. To reflect the volumetric contextual information of lung nodules to 2D image patch, we propose a weighted average image patch (WAIP) generation by averaging multiple slice images of lung nodule candidates. Moreover, to emphasize central slices of lung nodules, slice images are locally weighted according to Gaussian distribution and averaged to generate the 2D WAIP. With these extracted patches, 2D CNN is trained to achieve the classification of WAIPs of lung nodule candidates into positive and negative labels. We used LUNA 2016 public challenge database to validate the performance of our approach for false positive reduction in lung CT nodule classification. Experiments show our approach improves the classification accuracy of lung nodules compared to the baseline 2D CNN with patches from single slice image.
Unsupervised classification of variable stars
NASA Astrophysics Data System (ADS)
Valenzuela, Lucas; Pichara, Karim
2018-03-01
During the past 10 years, a considerable amount of effort has been made to develop algorithms for automatic classification of variable stars. That has been primarily achieved by applying machine learning methods to photometric data sets where objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human efforts. Every time data come from new surveys; the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics was used as feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers as well as useŕs and produceŕs accuracy.
Gould, Ian C.; Shepherd, Alana M.; Laurens, Kristin R.; Cairns, Murray J.; Carr, Vaughan J.; Green, Melissa J.
2014-01-01
Heterogeneity in the structural brain abnormalities associated with schizophrenia has made identification of reliable neuroanatomical markers of the disease difficult. The use of more homogenous clinical phenotypes may improve the accuracy of predicting psychotic disorder/s on the basis of observable brain disturbances. Here we investigate the utility of cognitive subtypes of schizophrenia – ‘cognitive deficit’ and ‘cognitively spared’ – in determining whether multivariate patterns of volumetric brain differences can accurately discriminate these clinical subtypes from healthy controls, and from each other. We applied support vector machine classification to grey- and white-matter volume data from 126 schizophrenia patients previously allocated to the cognitive spared subtype, 74 cognitive deficit schizophrenia patients, and 134 healthy controls. Using this method, cognitive subtypes were distinguished from healthy controls with up to 72% accuracy. Cross-validation analyses between subtypes achieved an accuracy of 71%, suggesting that some common neuroanatomical patterns distinguish both subtypes from healthy controls. Notably, cognitive subtypes were best distinguished from one another when the sample was stratified by sex prior to classification analysis: cognitive subtype classification accuracy was relatively low (<60%) without stratification, and increased to 83% for females with sex stratification. Distinct neuroanatomical patterns predicted cognitive subtype status in each sex: sex-specific multivariate patterns did not predict cognitive subtype status in the other sex above chance, and weight map analyses demonstrated negative correlations between the spatial patterns of weights underlying classification for each sex. These results suggest that in typical mixed-sex samples of schizophrenia patients, the volumetric brain differences between cognitive subtypes are relatively minor in contrast to the large common disease-associated changes. Volumetric differences that distinguish between cognitive subtypes on a case-by-case basis appear to occur in a sex-specific manner that is consistent with previous evidence of disrupted relationships between brain structure and cognition in male, but not female, schizophrenia patients. Consideration of sex-specific differences in brain organization is thus likely to assist future attempts to distinguish subgroups of schizophrenia patients on the basis of neuroanatomical features. PMID:25379435
Empirical Analysis and Automated Classification of Security Bug Reports
NASA Technical Reports Server (NTRS)
Tyo, Jacob P.
2016-01-01
With the ever expanding amount of sensitive data being placed into computer systems, the need for effective cybersecurity is of utmost importance. However, there is a shortage of detailed empirical studies of security vulnerabilities from which cybersecurity metrics and best practices could be determined. This thesis has two main research goals: (1) to explore the distribution and characteristics of security vulnerabilities based on the information provided in bug tracking systems and (2) to develop data analytics approaches for automatic classification of bug reports as security or non-security related. This work is based on using three NASA datasets as case studies. The empirical analysis showed that the majority of software vulnerabilities belong only to a small number of types. Addressing these types of vulnerabilities will consequently lead to cost efficient improvement of software security. Since this analysis requires labeling of each bug report in the bug tracking system, we explored using machine learning to automate the classification of each bug report as a security or non-security related (two-class classification), as well as each security related bug report as specific security type (multiclass classification). In addition to using supervised machine learning algorithms, a novel unsupervised machine learning approach is proposed. An ac- curacy of 92%, recall of 96%, precision of 92%, probability of false alarm of 4%, F-Score of 81% and G-Score of 90% were the best results achieved during two-class classification. Furthermore, an accuracy of 80%, recall of 80%, precision of 94%, and F-score of 85% were the best results achieved during multiclass classification.
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction through remote sensing data is important for policy and decision makers as extracted information provide base layers for many application of real world. Classification of remotely sensed data is the one of the most common methods of extracting information however it is still a challenging issue because several factors are affecting the accuracy of the classification. Resolution of the imagery, number and homogeneity of land cover classes, purity of training data and characteristic of adopted classifiers are just some of these challenging factors. Object based image classification has some superiority than pixel based classification for high resolution images since it uses geometry and structure information besides spectral information. Vegetation indices are also commonly used for the classification process since it provides additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increase the classification accuracy from 79,96% to 86,80% as overall accuracy, however NDVI decrease the classification accuracy from 79,96% to 78,90%. Moreover it is proven than object based classification with RapidEye data give promising results for crop type mapping and analysis.
Bolin, Jocelyn Holden; Finch, W Holmes
2014-01-01
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions.
Three-Class Mammogram Classification Based on Descriptive CNN Features
Zhang, Qianni; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT), while in the second method discrete curvelet transform (DCT) is used. In both methods, dense scale invariant feature (DSIFT) for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN). Softmax layer and support vector machine (SVM) layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques. PMID:28191461
Three-Class Mammogram Classification Based on Descriptive CNN Features.
Jadoon, M Mohsin; Zhang, Qianni; Haq, Ihsan Ul; Butt, Sharjeel; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT), while in the second method discrete curvelet transform (DCT) is used. In both methods, dense scale invariant feature (DSIFT) for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN). Softmax layer and support vector machine (SVM) layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.
Tong, Tong; Ledig, Christian; Guerrero, Ricardo; Schuh, Andreas; Koikkalainen, Juha; Tolonen, Antti; Rhodius, Hanneke; Barkhof, Frederik; Tijms, Betty; Lemstra, Afina W; Soininen, Hilkka; Remes, Anne M; Waldemar, Gunhild; Hasselbalch, Steen; Mecocci, Patrizia; Baroni, Marta; Lötjönen, Jyrki; Flier, Wiesje van der; Rueckert, Daniel
2017-01-01
Differentiating between different types of neurodegenerative diseases is not only crucial in clinical practice when treatment decisions have to be made, but also has a significant potential for the enrichment of clinical trials. The purpose of this study is to develop a classification framework for distinguishing the four most common neurodegenerative diseases, including Alzheimer's disease, frontotemporal lobe degeneration, Dementia with Lewy bodies and vascular dementia, as well as patients with subjective memory complaints. Different biomarkers including features from images (volume features, region-wise grading features) and non-imaging features (CSF measures) were extracted for each subject. In clinical practice, the prevalence of different dementia types is imbalanced, posing challenges for learning an effective classification model. Therefore, we propose the use of the RUSBoost algorithm in order to train classifiers and to handle the class imbalance training problem. Furthermore, a multi-class feature selection method based on sparsity is integrated into the proposed framework to improve the classification performance. It also provides a way for investigating the importance of different features and regions. Using a dataset of 500 subjects, the proposed framework achieved a high accuracy of 75.2% with a balanced accuracy of 69.3% for the five-class classification using ten-fold cross validation, which is significantly better than the results using support vector machine or random forest, demonstrating the feasibility of the proposed framework to support clinical decision making.
A patch-based convolutional neural network for remote sensing image classification.
Sharma, Atharva; Liu, Xiuwen; Yang, Xiaojun; Shi, Di
2017-11-01
Availability of accurate land cover information over large areas is essential to the global environment sustainability; digital classification using medium-resolution remote sensing data would provide an effective method to generate the required land cover information. However, low accuracy of existing per-pixel based classification methods for medium-resolution data is a fundamental limiting factor. While convolutional neural networks (CNNs) with deep layers have achieved unprecedented improvements in object recognition applications that rely on fine image structures, they cannot be applied directly to medium-resolution data due to lack of such fine structures. In this paper, considering the spatial relation of a pixel to its neighborhood, we propose a new deep patch-based CNN system tailored for medium-resolution remote sensing data. The system is designed by incorporating distinctive characteristics of medium-resolution data; in particular, the system computes patch-based samples from multidimensional top of atmosphere reflectance data. With a test site from the Florida Everglades area (with a size of 771 square kilometers), the proposed new system has outperformed pixel-based neural network, pixel-based CNN and patch-based neural network by 24.36%, 24.23% and 11.52%, respectively, in overall classification accuracy. By combining the proposed deep CNN and the huge collection of medium-resolution remote sensing data, we believe that much more accurate land cover datasets can be produced over large areas. Copyright © 2017 Elsevier Ltd. All rights reserved.
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-01-01
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis is one of the research hotspots recently. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there must be some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noises, while traditional cloud classification methods are sensitive to noises and outliers; it is hard for traditional cloud classification methods to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experiment results on FY-2G satellite cloud image show that, the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency. PMID:27999261
Li, Yanfei; Tian, Yun
2018-01-01
The development of network technology and the popularization of image capturing devices have led to a rapid increase in the number of digital images available, and it is becoming increasingly difficult to identify a desired image from among the massive number of possible images. Images usually contain rich semantic information, and people usually understand images at a high semantic level. Therefore, achieving the ability to use advanced technology to identify the emotional semantics contained in images to enable emotional semantic image classification remains an urgent issue in various industries. To this end, this study proposes an improved OCC emotion model that integrates personality and mood factors for emotional modelling to describe the emotional semantic information contained in an image. The proposed classification system integrates the k-Nearest Neighbour (KNN) algorithm with the Support Vector Machine (SVM) algorithm. The MapReduce parallel programming model was used to adapt the KNN-SVM algorithm for parallel implementation in the Hadoop cluster environment, thereby achieving emotional semantic understanding for the classification of a massive collection of images. For training and testing, 70,000 scene images were randomly selected from the SUN Database. The experimental results indicate that users with different personalities show overall consistency in their emotional understanding of the same image. For a training sample size of 50,000, the classification accuracies for different emotional categories targeted at users with different personalities were approximately 95%, and the training time was only 1/5 of that required for the corresponding algorithm with a single-node architecture. Furthermore, the speedup of the system also showed a linearly increasing tendency. Thus, the experiments achieved a good classification effect and can lay a foundation for classification in terms of additional types of emotional image semantics, thereby demonstrating the practical significance of the proposed model. PMID:29320579
Cao, Jianfang; Li, Yanfei; Tian, Yun
2018-01-01
The development of network technology and the popularization of image capturing devices have led to a rapid increase in the number of digital images available, and it is becoming increasingly difficult to identify a desired image from among the massive number of possible images. Images usually contain rich semantic information, and people usually understand images at a high semantic level. Therefore, achieving the ability to use advanced technology to identify the emotional semantics contained in images to enable emotional semantic image classification remains an urgent issue in various industries. To this end, this study proposes an improved OCC emotion model that integrates personality and mood factors for emotional modelling to describe the emotional semantic information contained in an image. The proposed classification system integrates the k-Nearest Neighbour (KNN) algorithm with the Support Vector Machine (SVM) algorithm. The MapReduce parallel programming model was used to adapt the KNN-SVM algorithm for parallel implementation in the Hadoop cluster environment, thereby achieving emotional semantic understanding for the classification of a massive collection of images. For training and testing, 70,000 scene images were randomly selected from the SUN Database. The experimental results indicate that users with different personalities show overall consistency in their emotional understanding of the same image. For a training sample size of 50,000, the classification accuracies for different emotional categories targeted at users with different personalities were approximately 95%, and the training time was only 1/5 of that required for the corresponding algorithm with a single-node architecture. Furthermore, the speedup of the system also showed a linearly increasing tendency. Thus, the experiments achieved a good classification effect and can lay a foundation for classification in terms of additional types of emotional image semantics, thereby demonstrating the practical significance of the proposed model.
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf
2018-05-01
Dysmorphic syndromes have different facial malformations. These malformations are significant to an early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the certain features of each syndrome by considering facial malformations and classify Fragile X, Hurler, Prader Willi, Down, Wolf Hirschhorn syndromes and healthy groups automatically. The reference points are marked on the face images and ratios between the points' distances are taken into consideration as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with k-NN, ANN and hierarchical decision tree methods, respectively. Then, the same images are shown to a clinical expert who achieve a recognition rate of 46.7%. We develop an efficient system to recognize different syndrome types automatically in a simple, non-invasive imaging data, which is independent from the patient's age, sex and race at high accuracy. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.
Optimal two-phase sampling design for comparing accuracies of two binary classification rules.
Xu, Huiping; Hui, Siu L; Grannis, Shaun
2014-02-10
In this paper, we consider the design for comparing the performance of two binary classification rules, for example, two record linkage algorithms or two screening tests. Statistical methods are well developed for comparing these accuracy measures when the gold standard is available for every unit in the sample, or in a two-phase study when the gold standard is ascertained only in the second phase in a subsample using a fixed sampling scheme. However, these methods do not attempt to optimize the sampling scheme to minimize the variance of the estimators of interest. In comparing the performance of two classification rules, the parameters of primary interest are the difference in sensitivities, specificities, and positive predictive values. We derived the analytic variance formulas for these parameter estimates and used them to obtain the optimal sampling design. The efficiency of the optimal sampling design is evaluated through an empirical investigation that compares the optimal sampling with simple random sampling and with proportional allocation. Results of the empirical study show that the optimal sampling design is similar for estimating the difference in sensitivities and in specificities, and both achieve a substantial amount of variance reduction with an over-sample of subjects with discordant results and under-sample of subjects with concordant results. A heuristic rule is recommended when there is no prior knowledge of individual sensitivities and specificities, or the prevalence of the true positive findings in the study population. The optimal sampling is applied to a real-world example in record linkage to evaluate the difference in classification accuracy of two matching algorithms. Copyright © 2013 John Wiley & Sons, Ltd.
Modified DCTNet for audio signals classification
NASA Astrophysics Data System (ADS)
Xian, Yin; Pu, Yunchen; Gan, Zhe; Lu, Liang; Thompson, Andrew
2016-10-01
In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of adaptive DCTNet (A-DCTNet) for audio signals feature extraction. The A-DCTNet applies the idea of constant-Q transform, with its center frequencies of filterbanks geometrically spaced. The A-DCTNet is adaptive to different acoustic scales, and it can better capture low frequency acoustic information that is sensitive to human audio perception than features such as Mel-frequency spectral coefficients (MFSC). We use features extracted by the A-DCTNet as input for classifiers. Experimental results show that the A-DCTNet and Recurrent Neural Networks (RNN) achieve state-of-the-art performance in bird song classification rate, and improve artist identification accuracy in music data. They demonstrate A-DCTNet's applicability to signal processing problems.
Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance
NASA Astrophysics Data System (ADS)
Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi
2017-11-01
K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we presented a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing Hamming distance between testing sample and each feature vector in the training set. Taking advantage of this method, we realized a good analog for classical KNN algorithm by setting a distance threshold value t to select k - n e a r e s t neighbors. As a result, QKNN achieves O( n 3) performance which is only relevant to the dimension of feature vectors and high classification accuracy, outperforms Llyod's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
Collagen morphology and texture analysis: from statistics to classification
Mostaço-Guidolin, Leila B.; Ko, Alex C.-T.; Wang, Fei; Xiang, Bo; Hewko, Mark; Tian, Ganghong; Major, Arkady; Shiomi, Masashi; Sowa, Michael G.
2013-01-01
In this study we present an image analysis methodology capable of quantifying morphological changes in tissue collagen fibril organization caused by pathological conditions. Texture analysis based on first-order statistics (FOS) and second-order statistics such as gray level co-occurrence matrix (GLCM) was explored to extract second-harmonic generation (SHG) image features that are associated with the structural and biochemical changes of tissue collagen networks. Based on these extracted quantitative parameters, multi-group classification of SHG images was performed. With combined FOS and GLCM texture values, we achieved reliable classification of SHG collagen images acquired from atherosclerosis arteries with >90% accuracy, sensitivity and specificity. The proposed methodology can be applied to a wide range of conditions involving collagen re-modeling, such as in skin disorders, different types of fibrosis and muscular-skeletal diseases affecting ligaments and cartilage. PMID:23846580
Adetiba, Emmanuel; Olugbara, Oludayo O
2015-01-01
Lung cancer is one of the diseases responsible for a large number of cancer related death cases worldwide. The recommended standard for screening and early detection of lung cancer is the low dose computed tomography. However, many patients diagnosed die within one year, which makes it essential to find alternative approaches for screening and early detection of lung cancer. We present computational methods that can be implemented in a functional multi-genomic system for classification, screening and early detection of lung cancer victims. Samples of top ten biomarker genes previously reported to have the highest frequency of lung cancer mutations and sequences of normal biomarker genes were respectively collected from the COSMIC and NCBI databases to validate the computational methods. Experiments were performed based on the combinations of Z-curve and tetrahedron affine transforms, Histogram of Oriented Gradient (HOG), Multilayer perceptron and Gaussian Radial Basis Function (RBF) neural networks to obtain an appropriate combination of computational methods to achieve improved classification of lung cancer biomarker genes. Results show that a combination of affine transforms of Voss representation, HOG genomic features and Gaussian RBF neural network perceptibly improves classification accuracy, specificity and sensitivity of lung cancer biomarker genes as well as achieving low mean square error.
Multimodal 2D Brain Computer Interface.
Almajidy, Rand K; Boudria, Yacine; Hofmann, Ulrich G; Besio, Walter; Mankodiya, Kunal
2015-08-01
In this work we used multimodal, non-invasive brain signal recording systems, namely Near Infrared Spectroscopy (NIRS), disc electrode electroencephalography (EEG) and tripolar concentric ring electrodes (TCRE) electroencephalography (tEEG). 7 healthy subjects participated in our experiments to control a 2-D Brain Computer Interface (BCI). Four motor imagery task were performed, imagery motion of the left hand, the right hand, both hands and both feet. The signal slope (SS) of the change in oxygenated hemoglobin concentration measured by NIRS was used for feature extraction while the power spectrum density (PSD) of both EEG and tEEG in the frequency band 8-30Hz was used for feature extraction. Linear Discriminant Analysis (LDA) was used to classify different combinations of the aforementioned features. The highest classification accuracy (85.2%) was achieved by using features from all the three brain signals recording modules. The improvement in classification accuracy was highly significant (p = 0.0033) when using the multimodal signals features as compared to pure EEG features.
Fully Convolutional Network Based Shadow Extraction from GF-2 Imagery
NASA Astrophysics Data System (ADS)
Li, Z.; Cai, G.; Ren, H.
2018-04-01
There are many shadows on the high spatial resolution satellite images, especially in the urban areas. Although shadows on imagery severely affect the information extraction of land cover or land use, they provide auxiliary information for building extraction which is hard to achieve a satisfactory accuracy through image classification itself. This paper focused on the method of building shadow extraction by designing a fully convolutional network and training samples collected from GF-2 satellite imagery in the urban region of Changchun city. By means of spatial filtering and calculation of adjacent relationship along the sunlight direction, the small patches from vegetation or bridges have been eliminated from the preliminary extracted shadows. Finally, the building shadows were separated. The extracted building shadow information from the proposed method in this paper was compared with the results from the traditional object-oriented supervised classification algorihtms. It showed that the deep learning network approach can improve the accuracy to a large extent.
Accelerometer-based on-body sensor localization for health and medical monitoring applications
Vahdatpour, Alireza; Amini, Navid; Xu, Wenyao; Sarrafzadeh, Majid
2011-01-01
In this paper, we present a technique to recognize the position of sensors on the human body. Automatic on-body device localization ensures correctness and accuracy of measurements in health and medical monitoring systems. In addition, it provides opportunities to improve the performance and usability of ubiquitous devices. Our technique uses accelerometers to capture motion data to estimate the location of the device on the user’s body, using mixed supervised and unsupervised time series analysis methods. We have evaluated our technique with extensive experiments on 25 subjects. On average, our technique achieves 89% accuracy in estimating the location of devices on the body. In order to study the feasibility of classification of left limbs from right limbs (e.g., left arm vs. right arm), we performed analysis, based of which no meaningful classification was observed. Personalized ultraviolet monitoring and wireless transmission power control comprise two immediate applications of our on-body device localization approach. Such applications, along with their corresponding feasibility studies, are discussed. PMID:22347840
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
Good Practices for Learning to Recognize Actions Using FV and VLAD.
Wu, Jianxin; Zhang, Yu; Lin, Weiyao
2016-12-01
High dimensional representations such as Fisher vectors (FV) and vectors of locally aggregated descriptors (VLAD) have shown state-of-the-art accuracy for action recognition in videos. The high dimensionality, on the other hand, also causes computational difficulties when scaling up to large-scale video data. This paper makes three lines of contributions to learning to recognize actions using high dimensional representations. First, we reviewed several existing techniques that improve upon FV or VLAD in image classification, and performed extensive empirical evaluations to assess their applicability for action recognition. Our analyses of these empirical results show that normality and bimodality are essential to achieve high accuracy. Second, we proposed a new pooling strategy for VLAD and three simple, efficient, and effective transformations for both FV and VLAD. Both proposed methods have shown higher accuracy than the original FV/VLAD method in extensive evaluations. Third, we proposed and evaluated new feature selection and compression methods for the FV and VLAD representations. This strategy uses only 4% of the storage of the original representation, but achieves comparable or even higher accuracy. Based on these contributions, we recommend a set of good practices for action recognition in videos for practitioners in this field.
Detection of Dendritic Spines Using Wavelet Packet Entropy and Fuzzy Support Vector Machine.
Wang, Shuihua; Li, Yang; Shao, Ying; Cattani, Carlo; Zhang, Yudong; Du, Sidan
2017-01-01
The morphology of dendritic spines is highly correlated with the neuron function. Therefore, it is of positive influence for the research of the dendritic spines. However, it is tried to manually label the spine types for statistical analysis. In this work, we proposed an approach based on the combination of wavelet contour analysis for the backbone detection, wavelet packet entropy, and fuzzy support vector machine for the spine classification. The experiments show that this approach is promising. The average detection accuracy of "MushRoom" achieves 97.3%, "Stubby" achieves 94.6%, and "Thin" achieves 97.2%. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Land use/cover classification in the Brazilian Amazon using satellite images.
Lu, Dengsheng; Batistella, Mateus; Li, Guiying; Moran, Emilio; Hetrick, Scott; Freitas, Corina da Costa; Dutra, Luciano Vieira; Sant'anna, Sidnei João Siqueira
2012-09-01
Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and limitations of remote sensing data per se. This paper reviews experiments related to land use/cover classification in the Brazilian Amazon for a decade. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporation of suitable textural images into multispectral bands and use of segmentation-based method are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Of the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, has the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical-based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data.
Land use/cover classification in the Brazilian Amazon using satellite images
Lu, Dengsheng; Batistella, Mateus; Li, Guiying; Moran, Emilio; Hetrick, Scott; Freitas, Corina da Costa; Dutra, Luciano Vieira; Sant’Anna, Sidnei João Siqueira
2013-01-01
Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and limitations of remote sensing data per se. This paper reviews experiments related to land use/cover classification in the Brazilian Amazon for a decade. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporation of suitable textural images into multispectral bands and use of segmentation-based method are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Of the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, has the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical-based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data. PMID:24353353
Multi-phenology WorldView-2 imagery improves remote sensing of savannah tree species
NASA Astrophysics Data System (ADS)
Madonsela, Sabelo; Cho, Moses Azong; Mathieu, Renaud; Mutanga, Onisimo; Ramoelo, Abel; Kaszta, Żaneta; Kerchove, Ruben Van De; Wolff, Eléonore
2017-06-01
Biodiversity mapping in African savannah is important for monitoring changes and ensuring sustainable use of ecosystem resources. Biodiversity mapping can benefit from multi-spectral instruments such as WorldView-2 with very high spatial resolution and a spectral configuration encompassing important spectral regions not previously available for vegetation mapping. This study investigated i) the benefits of the eight-band WorldView-2 (WV-2) spectral configuration for discriminating tree species in Southern African savannah and ii) if multiple-images acquired at key points of the typical phenological development of savannahs (peak productivity, transition to senescence) improve on tree species classifications. We first assessed the discriminatory power of WV-2 bands using interspecies-Spectral Angle Mapper (SAM) via Band Add-On procedure and tested the spectral capability of WorldView-2 against simulated IKONOS for tree species classification. The results from interspecies-SAM procedure identified the yellow and red bands as the most statistically significant bands (p = 0.000251 and p = 0.000039 respectively) in the discriminatory power of WV-2 during the transition from wet to dry season (April). Using Random Forest classifier, the classification scenarios investigated showed that i) the 8-bands of the WV-2 sensor achieved higher classification accuracy for the April date (transition from wet to dry season, senescence) compared to the March date (peak productivity season) ii) the WV-2 spectral configuration systematically outperformed the IKONOS sensor spectral configuration and iii) the multi-temporal approach (March and April combined) improved the discrimination of tress species and produced the highest overall accuracy results at 80.4%. Consistent with the interspecies-SAM procedure, the yellow (605 nm) band also showed a statistically significant contribution in the improved classification accuracy from WV-2. These results highlight the mapping opportunities presented by WV-2 data for monitoring the distribution status of e.g. species often harvested by local communities (e.g. Sclerocharya birrea), encroaching species, or species-specific tree losses induced by elephants.
Classification of visual and linguistic tasks using eye-movement features.
Coco, Moreno I; Keller, Frank
2014-03-07
The role of the task has received special attention in visual-cognition research because it can provide causal explanations of goal-directed eye-movement responses. The dependency between visual attention and task suggests that eye movements can be used to classify the task being performed. A recent study by Greene, Liu, and Wolfe (2012), however, fails to achieve accurate classification of visual tasks based on eye-movement features. In the present study, we hypothesize that tasks can be successfully classified when they differ with respect to the involvement of other cognitive domains, such as language processing. We extract the eye-movement features used by Greene et al. as well as additional features from the data of three different tasks: visual search, object naming, and scene description. First, we demonstrated that eye-movement responses make it possible to characterize the goals of these tasks. Then, we trained three different types of classifiers and predicted the task participants performed with an accuracy well above chance (a maximum of 88% for visual search). An analysis of the relative importance of features for classification accuracy reveals that just one feature, i.e., initiation time, is sufficient for above-chance performance (a maximum of 79% accuracy in object naming). Crucially, this feature is independent of task duration, which differs systematically across the three tasks we investigated. Overall, the best task classification performance was obtained with a set of seven features that included both spatial information (e.g., entropy of attention allocation) and temporal components (e.g., total fixation on objects) of the eye-movement record. This result confirms the task-dependent allocation of visual attention and extends previous work by showing that task classification is possible when tasks differ in the cognitive processes involved (purely visual tasks such as search vs. communicative tasks such as scene description).
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
Monitoring of "urban villages" in Shenzhen, China from high-resolution GF-1 and TerraSAR-X data
NASA Astrophysics Data System (ADS)
Wei, Chunzhu; Blaschke, Thomas; Taubenböck, Hannes
2015-10-01
Urban villages comprise mainly low-rise and congested, often informal settlements surrounded by new constructions and high-rise buildings whereby structures can be very different between neighboring areas. Monitoring urban villages and analyzing their characteristics are crucial for urban development and sustainability research. In this study, we carried out a combined analysis of multispectral GaoFen-1 (GF-1) and high resolution TerraSAR-X radar (TSX) imagery to extract the urban village information. GF-1 and TSX data are combined with the Gramshmidt spectral sharpening method so as to provide new input data for urban village classification. The Grey-Level Co-occurrence Matrix (GLCM) approach was also applied to four directions to provide another four types (all, 0°, 90°, 45° directions) of TSX-based inputs for urban village detection. We analyzed the urban village mapping performance using the Random Forest approach. The results demonstrate that the best overall accuracy and the best producer accuracy of urban villages reached with the GLCM 90° dataset (82.33%, 68.54% respectively). Adding single polarization TSX data as input information to the optical image GF-1 provided an average product accuracy improvement of around 7% in formal built-up area classification. The SAR and optical fusion imagery also provided an effective means to eliminate some layover, shadow effects, and dominant scattering at building locations and green spaces, improving the producer accuracy by 7% in urban area classification. To sum up, the added value of SAR information is demonstrated by the enhanced results achievable over built-up areas, including formal and informal settlements.
Na, Tong; Xie, Jianyang; Zhao, Yitian; Zhao, Yifan; Liu, Yue; Wang, Yongtian; Liu, Jiang
2018-05-09
Automatic methods of analyzing of retinal vascular networks, such as retinal blood vessel detection, vascular network topology estimation, and arteries/veins classification are of great assistance to the ophthalmologist in terms of diagnosis and treatment of a wide spectrum of diseases. We propose a new framework for precisely segmenting retinal vasculatures, constructing retinal vascular network topology, and separating the arteries and veins. A nonlocal total variation inspired Retinex model is employed to remove the image intensity inhomogeneities and relatively poor contrast. For better generalizability and segmentation performance, a superpixel-based line operator is proposed as to distinguish between lines and the edges, thus allowing more tolerance in the position of the respective contours. The concept of dominant sets clustering is adopted to estimate retinal vessel topology and classify the vessel network into arteries and veins. The proposed segmentation method yields competitive results on three public data sets (STARE, DRIVE, and IOSTAR), and it has superior performance when compared with unsupervised segmentation methods, with accuracy of 0.954, 0.957, and 0.964, respectively. The topology estimation approach has been applied to five public databases (DRIVE,STARE, INSPIRE, IOSTAR, and VICAVR) and achieved high accuracy of 0.830, 0.910, 0.915, 0.928, and 0.889, respectively. The accuracies of arteries/veins classification based on the estimated vascular topology on three public databases (INSPIRE, DRIVE and VICAVR) are 0.90.9, 0.910, and 0.907, respectively. The experimental results show that the proposed framework has effectively addressed crossover problem, a bottleneck issue in segmentation and vascular topology reconstruction. The vascular topology information significantly improves the accuracy on arteries/veins classification. © 2018 American Association of Physicists in Medicine.
A new classification scheme of plastic wastes based upon recycling labels.
Özkan, Kemal; Ergin, Semih; Işık, Şahin; Işıklı, Idil
2015-01-01
Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, the photographs of the plastic bottles have been taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, the morphological image operations are implemented. These operations are edge detection, noise removal, hole removing, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of the combinations of erosion and dilation. The effect of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastics are considered due to higher existence ratio of them than the other plastic types in the world. The decision mechanism consists of five different feature extraction methods including as Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher's Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP) and uses a simple experimental setup with a camera and homogenous backlighting. Due to the giving global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Bangs, Corey F.; Kruse, Fred A.; Olsen, Chris R.
2013-05-01
Hyperspectral data were assessed to determine the effect of integrating spectral data and extracted texture feature data on classification accuracy. Four separate spectral ranges (hundreds of spectral bands total) were used from the Visible and Near Infrared (VNIR) and Shortwave Infrared (SWIR) portions of the electromagnetic spectrum. Haralick texture features (contrast, entropy, and correlation) were extracted from the average gray-level image for each of the four spectral ranges studied. A maximum likelihood classifier was trained using a set of ground truth regions of interest (ROIs) and applied separately to the spectral data, texture data, and a fused dataset containing both. Classification accuracy was measured by comparison of results to a separate verification set of test ROIs. Analysis indicates that the spectral range (source of the gray-level image) used to extract the texture feature data has a significant effect on the classification accuracy. This result applies to texture-only classifications as well as the classification of integrated spectral data and texture feature data sets. Overall classification improvement for the integrated data sets was near 1%. Individual improvement for integrated spectral and texture classification of the "Urban" class showed approximately 9% accuracy increase over spectral-only classification. Texture-only classification accuracy was highest for the "Dirt Path" class at approximately 92% for the spectral range from 947 to 1343nm. This research demonstrates the effectiveness of texture feature data for more accurate analysis of hyperspectral data and the importance of selecting the correct spectral range to be used for the gray-level image source to extract these features.
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction is often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from the terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification method. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how different scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10 % for both trees. The average speed-up ratio of single scale classifiers over multi-scale classifier for each tree is higher than 30.
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
PCA based feature reduction to improve the accuracy of decision tree c4.5 classification
NASA Astrophysics Data System (ADS)
Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.
2018-03-01
Splitting attribute is a major process in Decision Tree C4.5 classification. However, this process does not give a significant impact on the establishment of the decision tree in terms of removing irrelevant features. It is a major problem in decision tree classification process called over-fitting resulting from noisy data and irrelevant features. In turns, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and overfitting on classifications Decision Tree C4.5. Feature reduction is one of important issues in classification model which is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework is robust to enhance classification accuracy with 90.70% accuracy rates.
Video-based depression detection using local Curvelet binary patterns in pairwise orthogonal planes.
Pampouchidou, Anastasia; Marias, Kostas; Tsiknakis, Manolis; Simos, Panagiotis; Fan Yang; Lemaitre, Guillaume; Meriaudeau, Fabrice
2016-08-01
Depression is an increasingly prevalent mood disorder. This is the reason why the field of computer-based depression assessment has been gaining the attention of the research community during the past couple of years. The present work proposes two algorithms for depression detection, one Frame-based and the second Video-based, both employing Curvelet transform and Local Binary Patterns. The main advantage of these methods is that they have significantly lower computational requirements, as the extracted features are of very low dimensionality. This is achieved by modifying the previously proposed algorithm which considers Three-Orthogonal-Planes, to only Pairwise-Orthogonal-Planes. Performance of the algorithms was tested on the benchmark dataset provided by the Audio/Visual Emotion Challenge 2014, with the person-specific system achieving 97.6% classification accuracy, and the person-independed one yielding promising preliminary results of 74.5% accuracy. The paper concludes with open issues, proposed solutions, and future plans.
Comprehensive decision tree models in bioinformatics.
Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter
2012-01-01
Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics.
Comprehensive Decision Tree Models in Bioinformatics
Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter
2012-01-01
Purpose Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics. PMID:22479449
The Ex Vivo Eye Irritation Test as an alternative test method for serious eye damage/eye irritation.
Spöler, Felix; Kray, Oya; Kray, Stefan; Panfil, Claudia; Schrage, Norbert F
2015-07-01
Ocular irritation testing is a common requirement for the classification, labelling and packaging of chemicals (substances and mixtures). The in vivo Draize rabbit eye test (OECD Test Guideline 405) is considered to be the regulatory reference method for the classification of chemicals according to their potential to induce eye injury. In the Draize test, chemicals are applied to rabbit eyes in vivo, and changes are monitored over time. If no damage is observed, the chemical is not categorised. Otherwise, the classification depends on the severity and reversibility of the damage. Alternative test methods have to be designed to match the classifications from the in vivo reference method. However, observation of damage reversibility is usually not possible in vitro. Within the present study, a new organotypic method based on rabbit corneas obtained from food production is demonstrated to close this gap. The Ex Vivo Eye Irritation Test (EVEIT) retains the full biochemical activity of the corneal epithelium, epithelial stem cells and endothelium. This permits the in-depth analysis of ocular chemical trauma beyond that achievable by using established in vitro methods. In particular, the EVEIT is the first test to permit the direct monitoring of recovery of all corneal layers after damage. To develop a prediction model for the EVEIT that is comparable to the GHS system, 37 reference chemicals were analysed. The experimental data were used to derive a three-level potency ranking of eye irritation and corrosion that best fits the GHS categorisation. In vivo data available in the literature were used for comparison. When compared with GHS classification predictions, the overall accuracy of the three-level potency ranking was 78%. The classification of chemicals as irritating versus non-irritating resulted in 96% sensitivity, 91% specificity and 95% accuracy. 2015 FRAME.
IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY
Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...
A hybrid approach to select features and classify diseases based on medical data
NASA Astrophysics Data System (ADS)
AbdelLatif, Hisham; Luo, Jiawei
2018-03-01
Feature selection is popular problem in the classification of diseases in clinical medicine. Here, we developing a hybrid methodology to classify diseases, based on three medical datasets, Arrhythmia, Breast cancer, and Hepatitis datasets. This methodology called k-means ANOVA Support Vector Machine (K-ANOVA-SVM) uses K-means cluster with ANOVA statistical to preprocessing data and selection the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we choice three classification algorithms, decision tree Naïve Bayes, Support Vector Machines and applied the medical datasets direct to these algorithms. Our methodology was a much better classification accuracy is given of 98% in Arrhythmia datasets, 92% in Breast cancer datasets and 88% in Hepatitis datasets, Compare to use the medical data directly with decision tree Naïve Bayes, and Support Vector Machines. Also, the ROC curve and precision with (K-ANOVA-SVM) Achieved best results than other algorithms
Wavelet Packet Entropy for Heart Murmurs Classification
Safara, Fatemeh; Doraisamy, Shyamala; Azman, Azreen; Jantan, Azrul; Ranga, Sri
2012-01-01
Heart murmurs are the first signs of cardiac valve disorders. Several studies have been conducted in recent years to automatically differentiate normal heart sounds, from heart sounds with murmurs using various types of audio features. Entropy was successfully used as a feature to distinguish different heart sounds. In this paper, new entropy was introduced to analyze heart sounds and the feasibility of using this entropy in classification of five types of heart sounds and murmurs was shown. The entropy was previously introduced to analyze mammograms. Four common murmurs were considered including aortic regurgitation, mitral regurgitation, aortic stenosis, and mitral stenosis. Wavelet packet transform was employed for heart sound analysis, and the entropy was calculated for deriving feature vectors. Five types of classification were performed to evaluate the discriminatory power of the generated features. The best results were achieved by BayesNet with 96.94% accuracy. The promising results substantiate the effectiveness of the proposed wavelet packet entropy for heart sounds classification. PMID:23227043
An Extended Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Astrophysics Data System (ADS)
Akbari, D.
2017-11-01
In this paper an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction including decision boundary feature extraction (DBFE), discriminate analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and watershed segmentation algorithm. To evaluate the proposed approach, the Pavia University hyperspectral data is tested. Experimental results show that the proposed approach using GA achieves an approximately 8 % overall accuracy higher than the original MSF-based algorithm.
Distance Metric Learning via Iterated Support Vector Machines.
Zuo, Wangmeng; Wang, Faqiang; Zhang, David; Lin, Liang; Huang, Yuchi; Meng, Deyu; Zhang, Lei
2017-07-11
Distance metric learning aims to learn from the given training data a valid distance metric, with which the similarity between data samples can be more effectively evaluated for classification. Metric learning is often formulated as a convex or nonconvex optimization problem, while most existing methods are based on customized optimizers and become inefficient for large scale problems. In this paper, we formulate metric learning as a kernel classification problem with the positive semi-definite constraint, and solve it by iterated training of support vector machines (SVMs). The new formulation is easy to implement and efficient in training with the off-the-shelf SVM solvers. Two novel metric learning models, namely Positive-semidefinite Constrained Metric Learning (PCML) and Nonnegative-coefficient Constrained Metric Learning (NCML), are developed. Both PCML and NCML can guarantee the global optimality of their solutions. Experiments are conducted on general classification, face verification and person re-identification to evaluate our methods. Compared with the state-of-the-art approaches, our methods can achieve comparable classification accuracy and are efficient in training.
Automatic Identification of Critical Follow-Up Recommendation Sentences in Radiology Reports
Yetisgen-Yildiz, Meliha; Gunn, Martin L.; Xia, Fei; Payne, Thomas H.
2011-01-01
Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports. PMID:22195225
Electronic Sleep Stage Classifiers: A Survey and VLSI Design Methodology.
Kassiri, Hossein; Chemparathy, Aditi; Salam, M Tariqus; Boyce, Richard; Adamantidis, Antoine; Genov, Roman
2017-02-01
First, existing sleep stage classifier sensors and algorithms are reviewed and compared in terms of classification accuracy, level of automation, implementation complexity, invasiveness, and targeted application. Next, the implementation of a miniature microsystem for low-latency automatic sleep stage classification in rodents is presented. The classification algorithm uses one EMG (electromyogram) and two EEG (electroencephalogram) signals as inputs in order to detect REM (rapid eye movement) sleep, and is optimized for low complexity and low power consumption. It is implemented in an on-board low-power FPGA connected to a multi-channel neural recording IC, to achieve low-latency (order of 1 ms or less) classification. Off-line experimental results using pre-recorded signals from nine mice show REM detection sensitivity and specificity of 81.69% and 93.86%, respectively, with the maximum latency of 39 [Formula: see text]. The device is designed to be used in a non-disruptive closed-loop REM sleep suppression microsystem, for future studies of the effects of REM sleep deprivation on memory consolidation.
Automatic identification of critical follow-up recommendation sentences in radiology reports.
Yetisgen-Yildiz, Meliha; Gunn, Martin L; Xia, Fei; Payne, Thomas H
2011-01-01
Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports.
Automated analysis and classification of melanocytic tumor on skin whole slide images.
Xu, Hongming; Lu, Cheng; Berendt, Richard; Jha, Naresh; Mandal, Mrinal
2018-06-01
This paper presents a computer-aided technique for automated analysis and classification of melanocytic tumor on skin whole slide biopsy images. The proposed technique consists of four main modules. First, skin epidermis and dermis regions are segmented by a multi-resolution framework. Next, epidermis analysis is performed, where a set of epidermis features reflecting nuclear morphologies and spatial distributions is computed. In parallel with epidermis analysis, dermis analysis is also performed, where dermal cell nuclei are segmented and a set of textural and cytological features are computed. Finally, the skin melanocytic image is classified into different categories such as melanoma, nevus or normal tissue by using a multi-class support vector machine (mSVM) with extracted epidermis and dermis features. Experimental results on 66 skin whole slide images indicate that the proposed technique achieves more than 95% classification accuracy, which suggests that the technique has the potential to be used for assisting pathologists on skin biopsy image analysis and classification. Copyright © 2018 Elsevier Ltd. All rights reserved.
Nonlinear features for classification and pose estimation of machined parts from single views
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1998-10-01
A new nonlinear feature extraction method is presented for classification and pose estimation of objects from single views. The feature extraction method is called the maximum representation and discrimination feature (MRDF) method. The nonlinear MRDF transformations to use are obtained in closed form, and offer significant advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We consider MRDFs on image data, provide a new 2-stage nonlinear MRDF solution, and show it specializes to well-known linear and nonlinear image processing transforms under certain conditions. We show the use of MRDF in estimating the class and pose of images of rendered solid CAD models of machine parts from single views using a feature-space trajectory neural network classifier. We show new results with better classification and pose estimation accuracy than are achieved by standard principal component analysis and Fukunaga-Koontz feature extraction methods.
Instrument classification in polyphonic music based on timbre analysis
NASA Astrophysics Data System (ADS)
Zhang, Tong
2001-07-01
While most previous work on musical instrument recognition is focused on the classification of single notes in monophonic music, a scheme is proposed in this paper for the distinction of instruments in continuous music pieces which may contain one or more kinds of instruments. Highlights of the system include music segmentation into notes, harmonic partial estimation in polyphonic sound, note feature calculation and normalization, note classification using a set of neural networks, and music piece categorization with fuzzy logic principles. Example outputs of the system are `the music piece is 100% guitar (with 90% likelihood)' and `the music piece is 60% violin and 40% piano, thus a violin/piano duet'. The system has been tested with twelve kinds of musical instruments, and very promising experimental results have been obtained. An accuracy of about 80% is achieved, and the number can be raised to 90% if misindexings within the same instrument family are tolerated (e.g. cello, viola and violin). A demonstration system for musical instrument classification and music timbre retrieval is also presented.
Shankar, Vijay; Reo, Nicholas V; Paliy, Oleg
2015-12-09
We previously showed that stool samples of pre-adolescent and adolescent US children diagnosed with diarrhea-predominant IBS (IBS-D) had different compositions of microbiota and metabolites compared to healthy age-matched controls. Here we explored whether observed fecal microbiota and metabolite differences between these two adolescent populations can be used to discriminate between IBS and health. We constructed individual microbiota- and metabolite-based sample classification models based on the partial least squares multivariate analysis and then applied a Bayesian approach to integrate individual models into a single classifier. The resulting combined classification achieved 84 % accuracy of correct sample group assignment and 86 % prediction for IBS-D in cross-validation tests. The performance of the cumulative classification model was further validated by the de novo analysis of stool samples from a small independent IBS-D cohort. High-throughput microbial and metabolite profiling of subject stool samples can be used to facilitate IBS diagnosis.
The Cross-Entropy Based Multi-Filter Ensemble Method for Gene Selection.
Sun, Yingqiang; Lu, Chengbo; Li, Xiaobo
2018-05-17
The gene expression profile has the characteristics of a high dimension, low sample, and continuous type, and it is a great challenge to use gene expression profile data for the classification of tumor samples. This paper proposes a cross-entropy based multi-filter ensemble (CEMFE) method for microarray data classification. Firstly, multiple filters are used to select the microarray data in order to obtain a plurality of the pre-selected feature subsets with a different classification ability. The top N genes with the highest rank of each subset are integrated so as to form a new data set. Secondly, the cross-entropy algorithm is used to remove the redundant data in the data set. Finally, the wrapper method, which is based on forward feature selection, is used to select the best feature subset. The experimental results show that the proposed method is more efficient than other gene selection methods and that it can achieve a higher classification accuracy under fewer characteristic genes.
NASA Astrophysics Data System (ADS)
Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing
2018-02-01
For the problems of missing details and performance of the colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm can achieve a natural colorized effect for a gray-scale image, and it is consistent with the human vision. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy rate of the classification, the corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement based on Laplacian Pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only realizes the colorization of the visual gray-scale image, but also can be applied to the other areas, such as color transfer between color images, colorizing gray fusion images, and infrared images.
Classification Algorithms for Big Data Analysis, a Map Reduce Approach
NASA Astrophysics Data System (ADS)
Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.
2015-03-01
Since many years ago, the scientific community is concerned about how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using a SVM classifier on data sets of different sizes for different cluster configurations demonstrates the potential of the tool, as well as aspects that affect its performance.
Sevel, Landrew S; Boissoneault, Jeff; Letzen, Janelle E; Robinson, Michael E; Staud, Roland
2018-05-30
Chronic fatigue syndrome (CFS) is a disorder associated with fatigue, pain, and structural/functional abnormalities seen during magnetic resonance brain imaging (MRI). Therefore, we evaluated the performance of structural MRI (sMRI) abnormalities in the classification of CFS patients versus healthy controls and compared it to machine learning (ML) classification based upon self-report (SR). Participants included 18 CFS patients and 15 healthy controls (HC). All subjects underwent T1-weighted sMRI and provided visual analogue-scale ratings of fatigue, pain intensity, anxiety, depression, anger, and sleep quality. sMRI data were segmented using FreeSurfer and 61 regions based on functional and structural abnormalities previously reported in patients with CFS. Classification was performed in RapidMiner using a linear support vector machine and bootstrap optimism correction. We compared ML classifiers based on (1) 61 a priori sMRI regional estimates and (2) SR ratings. The sMRI model achieved 79.58% classification accuracy. The SR (accuracy = 95.95%) outperformed both sMRI models. Estimates from multiple brain areas related to cognition, emotion, and memory contributed strongly to group classification. This is the first ML-based group classification of CFS. Our findings suggest that sMRI abnormalities are useful for discriminating CFS patients from HC, but SR ratings remain most effective in classification tasks.
Analysis of framelets for breast cancer diagnosis.
Thivya, K S; Sakthivel, P; Venkata Sai, P M
2016-01-01
Breast cancer is the second threatening tumor among the women. The effective way of reducing breast cancer is its early detection which helps to improve the diagnosing process. Digital mammography plays a significant role in mammogram screening at earlier stage of breast carcinoma. Even though, it is very difficult to find accurate abnormality in prevalent screening by radiologists. But the possibility of precise breast cancer screening is encouraged by predicting the accurate type of abnormality through Computer Aided Diagnosis (CAD) systems. The two most important indicators of breast malignancy are microcalcifications and masses. In this study, framelet transform, a multiresolutional analysis is investigated for the classification of the above mentioned two indicators. The statistical and co-occurrence features are extracted from the framelet decomposed mammograms with different resolution levels and support vector machine is employed for classification with k-fold cross validation. This system achieves 94.82% and 100% accuracy in normal/abnormal classification (stage I) and benign/malignant classification (stage II) of mass classification system and 98.57% and 100% for microcalcification system when using the MIAS database.