NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization could achieve a classification accuracy of 88.16 % in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performs better than SVM and ANN, it is more capable of handling multidimensional feature variables. The RF method combined with object-based analysis approach could highlight the classification accuracy further. The multiresolution segmentation approach on the basis of ESP scale parameter optimization was used for obtaining six scales to execute image segmentation, when the segmentation scale was 49, the classification accuracy reached the highest value of 89.58 %. The classification accuracy of object-based RF classification was 1.42 % higher than that of pixel-based classification (88.16 %), and the classification accuracy was further improved. Therefore, the RF classification method combined with object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and theoretical reference for remotely sensed monitoring land reclamation.
NASA Astrophysics Data System (ADS)
Gao, Yan; Marpu, Prashanth; Morales Manila, Luis M.
2014-11-01
This paper assesses the suitability of 8-band Worldview-2 (WV2) satellite data and object-based random forest algorithm for the classification of avocado growth stages in Mexico. We tested both pixel-based with minimum distance (MD) and maximum likelihood (MLC) and object-based with Random Forest (RF) algorithm for this task. Training samples and verification data were selected by visual interpreting the WV2 images for seven thematic classes: fully grown, middle stage, and early stage of avocado crops, bare land, two types of natural forests, and water body. To examine the contribution of the four new spectral bands of WV2 sensor, all the tested classifications were carried out with and without the four new spectral bands. Classification accuracy assessment results show that object-based classification with RF algorithm obtained higher overall higher accuracy (93.06%) than pixel-based MD (69.37%) and MLC (64.03%) method. For both pixel-based and object-based methods, the classifications with the four new spectral bands (overall accuracy obtained higher accuracy than those without: overall accuracy of object-based RF classification with vs without: 93.06% vs 83.59%, pixel-based MD: 69.37% vs 67.2%, pixel-based MLC: 64.03% vs 36.05%, suggesting that the four new spectral bands in WV2 sensor contributed to the increase of the classification accuracy.
NASA Technical Reports Server (NTRS)
Justice, C.; Townshend, J. (Principal Investigator)
1981-01-01
Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. By on procedure the entire area is clustered and by the other a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation is provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data and the monocluster block approach gave higher accuracies than clustering the entire area. The moncluster block approach was additionally the most economical in terms of computing time.
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively.
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from the terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification method. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how different scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10 % for both trees. The average speed-up ratio of single scale classifiers over multi-scale classifier for each tree is higher than 30.
Porras-Alfaro, Andrea; Liu, Kuan-Liang; Kuske, Cheryl R; Xie, Gary
2014-02-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5' section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets.
Liu, Kuan-Liang; Kuske, Cheryl R.
2014-01-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5′ section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets. PMID:24242255
Application of Sensor Fusion to Improve Uav Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects including the ones that are based on using UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality which results in increasing the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on-board in the UAV missions and performing image fusion can help achieving higher quality images and accordingly higher accuracy classification results.
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
NASA Astrophysics Data System (ADS)
Müller-Putz, Gernot R.; Scherer, Reinhold; Brauneis, Christian; Pfurtscheller, Gert
2005-12-01
Brain-computer interfaces (BCIs) can be realized on the basis of steady-state evoked potentials (SSEPs). These types of brain signals resulting from repetitive stimulation have the same fundamental frequency as the stimulation but also include higher harmonics. This study investigated how the classification accuracy of a 4-class BCI system can be improved by incorporating visually evoked harmonic oscillations. The current study revealed that the use of three SSVEP harmonics yielded a significantly higher classification accuracy than was the case for one or two harmonics. During feedback experiments, the five subjects investigated reached a classification accuracy between 42.5% and 94.4%.
Müller-Putz, Gernot R; Scherer, Reinhold; Brauneis, Christian; Pfurtscheller, Gert
2005-12-01
Brain-computer interfaces (BCIs) can be realized on the basis of steady-state evoked potentials (SSEPs). These types of brain signals resulting from repetitive stimulation have the same fundamental frequency as the stimulation but also include higher harmonics. This study investigated how the classification accuracy of a 4-class BCI system can be improved by incorporating visually evoked harmonic oscillations. The current study revealed that the use of three SSVEP harmonics yielded a significantly higher classification accuracy than was the case for one or two harmonics. During feedback experiments, the five subjects investigated reached a classification accuracy between 42.5% and 94.4%.
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
The accuracy of RS classification based on SVM which is developed from statistical learning theory is high under small number of train samples, which results in satisfaction of classification on RS using SVM methods. The traditional RS classification method combines visual interpretation with computer classification. The accuracy of the RS classification, however, is improved a lot based on SVM method, because it saves much labor and time which is used to interpret images and collect training samples. Kernel functions play an important part in the SVM algorithm. It uses improved compound kernel function and therefore has a higher accuracy of classification on RS images. Moreover, compound kernel improves the generalization and learning ability of the kernel.
NASA Astrophysics Data System (ADS)
Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan
2015-12-01
In this paper, multi-kernel learning(MKL) is used for drug-related webpages classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, text based BOW model is used to generate text representation, and image-based BOW model is used to generate images representation. Last, text and images representation are fused with a few methods. Experimental results demonstrate that the classification accuracy of MKL is higher than those of all other fusion methods in decision level and feature level, and much higher than the accuracy of single-modal classification.
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Fujita, Yusuke; Mitani, Yoshihiro; Hamamoto, Yoshihiko; Segawa, Makoto; Terai, Shuji; Sakaida, Isao
2017-03-01
Ultrasound imaging is a popular and non-invasive tool used in the diagnoses of liver disease. Cirrhosis is a chronic liver disease and it can advance to liver cancer. Early detection and appropriate treatment are crucial to prevent liver cancer. However, ultrasound image analysis is very challenging, because of the low signal-to-noise ratio of ultrasound images. To achieve the higher classification performance, selection of training regions of interest (ROIs) is very important that effect to classification accuracy. The purpose of our study is cirrhosis detection with high accuracy using liver ultrasound images. In our previous works, training ROI selection by MILBoost and multiple-ROI classification based on the product rule had been proposed, to achieve high classification performance. In this article, we propose self-training method to select training ROIs effectively. Evaluation experiments were performed to evaluate effect of self-training, using manually selected ROIs and also automatically selected ROIs. Experimental results show that self-training for manually selected ROIs achieved higher classification performance than other approaches, including our conventional methods. The manually ROI definition and sample selection are important to improve classification accuracy in cirrhosis detection using ultrasound images.
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has been proved to be an effective way in recent years. However, current researches just care about the feature extraction and classifier design, and do not consider the instance selection. Former research by authors showed that the instance selection can lead to improvement on classification accuracy. However, no attention is paid on the relationship between speech sample and feature until now. Therefore, a new diagnosis algorithm of PD is proposed in this paper by simultaneously selecting speech sample and feature based on relevant feature weighting algorithm and multiple kernel method, so as to find their synergy effects, thereby improving classification accuracy. Experimental results showed that this proposed algorithm obtained apparent improvement on classification accuracy. It can obtain mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. Besides, the proposed algorithm detected the synergy effects of speech sample and feature, which is valuable for speech marker extraction.
Blob-level active-passive data fusion for Benthic classification
NASA Astrophysics Data System (ADS)
Park, Joong Yong; Kalluri, Hemanth; Mathur, Abhinav; Ramnath, Vinod; Kim, Minsu; Aitken, Jennifer; Tuell, Grady
2012-06-01
We extend the data fusion pixel level to the more semantically meaningful blob level, using the mean-shift algorithm to form labeled blobs having high similarity in the feature domain, and connectivity in the spatial domain. We have also developed Bhattacharyya Distance (BD) and rule-based classifiers, and have implemented these higher-level data fusion algorithms into the CZMIL Data Processing System. Applying these new algorithms to recent SHOALS and CASI data at Plymouth Harbor, Massachusetts, we achieved improved benthic classification accuracies over those produced with either single sensor, or pixel-level fusion strategies. These results appear to validate the hypothesis that classification accuracy may be generally improved by adopting higher spatial and semantic levels of fusion.
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
Papageorgiou, Eirini; Nieuwenhuys, Angela; Desloovere, Kaat
2017-01-01
Background This study aimed to improve the automatic probabilistic classification of joint motion gait patterns in children with cerebral palsy by using the expert knowledge available via a recently developed Delphi-consensus study. To this end, this study applied both Naïve Bayes and Logistic Regression classification with varying degrees of usage of the expert knowledge (expert-defined and discretized features). A database of 356 patients and 1719 gait trials was used to validate the classification performance of eleven joint motions. Hypotheses Two main hypotheses stated that: (1) Joint motion patterns in children with CP, obtained through a Delphi-consensus study, can be automatically classified following a probabilistic approach, with an accuracy similar to clinical expert classification, and (2) The inclusion of clinical expert knowledge in the selection of relevant gait features and the discretization of continuous features increases the performance of automatic probabilistic joint motion classification. Findings This study provided objective evidence supporting the first hypothesis. Automatic probabilistic gait classification using the expert knowledge available from the Delphi-consensus study resulted in accuracy (91%) similar to that obtained with two expert raters (90%), and higher accuracy than that obtained with non-expert raters (78%). Regarding the second hypothesis, this study demonstrated that the use of more advanced machine learning techniques such as automatic feature selection and discretization instead of expert-defined and discretized features can result in slightly higher joint motion classification performance. However, the increase in performance is limited and does not outweigh the additional computational cost and the higher risk of loss of clinical interpretability, which threatens the clinical acceptance and applicability. PMID:28570616
NASA Technical Reports Server (NTRS)
Spann, G. W.; Faust, N. L.
1974-01-01
It is known from several previous investigations that many categories of land-use can be mapped via computer processing of Earth Resources Technology Satellite data. The results are presented of one such experiment using the USGS/NASA land-use classification system. Douglas County, Georgia, was chosen as the test site for this project. It was chosen primarily because of its recent rapid growth and future growth potential. Results of the investigation indicate an overall land-use mapping accuracy of 67% with higher accuracies in rural areas and lower accuracies in urban areas. It is estimated, however, that 95% of the State of Georgia could be mapped by these techniques with an accuracy of 80% to 90%.
Accurate crop classification using hierarchical genetic fuzzy rule-based systems
NASA Astrophysics Data System (ADS)
Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.
2014-10-01
This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications that the competitors. Moreover, the runtime requirements for producing the thematic map was orders of magnitude lower than the respective for the competitors.
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
Van Cott, Andrew; Hastings, Charles E; Landsiedel, Robert; Kolle, Susanne; Stinchcombe, Stefan
2018-02-01
In vivo acute systemic testing is a regulatory requirement for agrochemical formulations. GHS specifies an alternative computational approach (GHS additivity formula) for calculating the acute toxicity of mixtures. We collected acute systemic toxicity data from formulations that contained one of several acutely-toxic active ingredients. The resulting acute data set includes 210 formulations tested for oral toxicity, 128 formulations tested for inhalation toxicity and 31 formulations tested for dermal toxicity. The GHS additivity formula was applied to each of these formulations and compared with the experimental in vivo result. In the acute oral assay, the GHS additivity formula misclassified 110 formulations using the GHS classification criteria (48% accuracy) and 119 formulations using the USEPA classification criteria (43% accuracy). With acute inhalation, the GHS additivity formula misclassified 50 formulations using the GHS classification criteria (61% accuracy) and 34 formulations using the USEPA classification criteria (73% accuracy). For acute dermal toxicity, the GHS additivity formula misclassified 16 formulations using the GHS classification criteria (48% accuracy) and 20 formulations using the USEPA classification criteria (36% accuracy). This data indicates the acute systemic toxicity of many formulations is not the sum of the ingredients' toxicity (additivity); but rather, ingredients in a formulation can interact to result in lower or higher toxicity than predicted by the GHS additivity formula. Copyright © 2018 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.
2018-02-01
In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of cervical healthy and cancerous tissues. In our study, the HMM methodology is more promising to produce higher accuracy in classification.
NASA Astrophysics Data System (ADS)
Szuflitowska, B.; Orlowski, P.
2017-08-01
Automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathology activity. The EEG sequences were analyzed using Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. The classification error below 10% has been considered a success. The higher accuracy are obtained for new data of unknown classes than testing data. The methodology can be helpful in differentiation epilepsy seizure and disturbances in the EEG signal in Alzheimer's Disease.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features
Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-01-01
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification. PMID:28025525
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-12-22
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification.
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Qin, Yu; Zhan, Yu; Wang, Changming; Zhang, Jiacai; Yao, Li; Guo, Xiaojuan; Wu, Xia; Hu, Bin
2016-08-01
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72 %, which is substantially better than that achieved with any single ERP component feature (55.07 % for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47, 4.06, and 16.90 % higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
Boursier, Jérôme; Bertrais, Sandrine; Oberti, Frédéric; Gallois, Yves; Fouchard-Hubert, Isabelle; Rousselet, Marie-Christine; Zarski, Jean-Pierre; Calès, Paul
2011-11-30
Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10-3) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10-3). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10-3) or Fibrotest (0.84 ± 0.80, p < 10-3). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10-3 (p < 10-3). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10-3 (p < 10-3). The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test.
2011-01-01
Background Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10-3) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10-3). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10-3) or Fibrotest (0.84 ± 0.80, p < 10-3). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10-3 (p < 10-3). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10-3 (p < 10-3). Conclusions The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test. PMID:22129438
Study of wavelet packet energy entropy for emotion classification in speech and glottal signals
NASA Astrophysics Data System (ADS)
He, Ling; Lech, Margaret; Zhang, Jing; Ren, Xiaomei; Deng, Lihua
2013-07-01
The automatic speech emotion recognition has important applications in human-machine communication. Majority of current research in this area is focused on finding optimal feature parameters. In recent studies, several glottal features were examined as potential cues for emotion differentiation. In this study, a new type of feature parameter is proposed, which calculates energy entropy on values within selected Wavelet Packet frequency bands. The modeling and classification tasks are conducted using the classical GMM algorithm. The experiments use two data sets: the Speech Under Simulated Emotion (SUSE) data set annotated with three different emotions (angry, neutral and soft) and Berlin Emotional Speech (BES) database annotated with seven different emotions (angry, bored, disgust, fear, happy, sad and neutral). The average classification accuracy achieved for the SUSE data (74%-76%) is significantly higher than the accuracy achieved for the BES data (51%-54%). In both cases, the accuracy was significantly higher than the respective random guessing levels (33% for SUSE and 14.3% for BES).
Thematic accuracy of the National Land Cover Database (NLCD) 2001 land cover for Alaska
Selkowitz, D.J.; Stehman, S.V.
2011-01-01
The National Land Cover Database (NLCD) 2001 Alaska land cover classification is the first 30-m resolution land cover product available covering the entire state of Alaska. The accuracy assessment of the NLCD 2001 Alaska land cover classification employed a geographically stratified three-stage sampling design to select the reference sample of pixels. Reference land cover class labels were determined via fixed wing aircraft, as the high resolution imagery used for determining the reference land cover classification in the conterminous U.S. was not available for most of Alaska. Overall thematic accuracy for the Alaska NLCD was 76.2% (s.e. 2.8%) at Level II (12 classes evaluated) and 83.9% (s.e. 2.1%) at Level I (6 classes evaluated) when agreement was defined as a match between the map class and either the primary or alternate reference class label. When agreement was defined as a match between the map class and primary reference label only, overall accuracy was 59.4% at Level II and 69.3% at Level I. The majority of classification errors occurred at Level I of the classification hierarchy (i.e., misclassifications were generally to a different Level I class, not to a Level II class within the same Level I class). Classification accuracy was higher for more abundant land cover classes and for pixels located in the interior of homogeneous land cover patches. ?? 2011.
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted into 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilk's Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All commonly occurring six gregarious tree species, viz., white oak, brown oak, chir pine, blue pine, cedar and fir in western Himalaya could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). It was noticed that classification accuracy achieved with Hyperion bands was significantly higher than Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). Study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
Word pair classification during imagined speech using direct brain recordings
NASA Astrophysics Data System (ADS)
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-05-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70-150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58% p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
Word pair classification during imagined speech using direct brain recordings
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-01-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications. PMID:27165452
NASA Technical Reports Server (NTRS)
Fagan, Matthew E.; Defries, Ruth S.; Sesnie, Steven E.; Arroyo-Mora, J. Pablo; Soto, Carlomagno; Singh, Aditya; Townsend, Philip A.; Chazdon, Robin L.
2015-01-01
An efficient means to map tree plantations is needed to detect tropical land use change and evaluate reforestation projects. To analyze recent tree plantation expansion in northeastern Costa Rica, we examined the potential of combining moderate-resolution hyperspectral imagery (2005 HyMap mosaic) with multitemporal, multispectral data (Landsat) to accurately classify (1) general forest types and (2) tree plantations by species composition. Following a linear discriminant analysis to reduce data dimensionality, we compared four Random Forest classification models: hyperspectral data (HD) alone; HD plus interannual spectral metrics; HD plus a multitemporal forest regrowth classification; and all three models combined. The fourth, combined model achieved overall accuracy of 88.5%. Adding multitemporal data significantly improved classification accuracy (p less than 0.0001) of all forest types, although the effect on tree plantation accuracy was modest. The hyperspectral data alone classified six species of tree plantations with 75% to 93% producer's accuracy; adding multitemporal spectral data increased accuracy only for two species with dense canopies. Non-native tree species had higher classification accuracy overall and made up the majority of tree plantations in this landscape. Our results indicate that combining occasionally acquired hyperspectral data with widely available multitemporal satellite imagery enhances mapping and monitoring of reforestation in tropical landscapes.
Zourmand, Alireza; Ting, Hua-Nong; Mirhassani, Seyed Mostafa
2013-03-01
Speech is one of the prevalent communication mediums for humans. Identifying the gender of a child speaker based on his/her speech is crucial in telecommunication and speech therapy. This article investigates the use of fundamental and formant frequencies from sustained vowel phonation to distinguish the gender of Malay children aged between 7 and 12 years. The Euclidean minimum distance and multilayer perceptron were used to classify the gender of 360 Malay children based on different combinations of fundamental and formant frequencies (F0, F1, F2, and F3). The Euclidean minimum distance with normalized frequency data achieved a classification accuracy of 79.44%, which was higher than that of the nonnormalized frequency data. Age-dependent modeling was used to improve the accuracy of gender classification. The Euclidean distance method obtained 84.17% based on the optimal classification accuracy for all age groups. The accuracy was further increased to 99.81% using multilayer perceptron based on mel-frequency cepstral coefficients. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian; Huliraj, N; Revadi, S S
2017-06-08
Auscultation is a medical procedure used for the initial diagnosis and assessment of lung and heart diseases. From this perspective, we propose assessing the performance of the extreme learning machine (ELM) classifiers for the diagnosis of pulmonary pathology using breath sounds. Energy and entropy features were extracted from the breath sound using the wavelet packet transform. The statistical significance of the extracted features was evaluated by one-way analysis of variance (ANOVA). The extracted features were inputted into the ELM classifier. The maximum classification accuracies obtained for the conventional validation (CV) of the energy and entropy features were 97.36% and 98.37%, respectively, whereas the accuracies obtained for the cross validation (CRV) of the energy and entropy features were 96.80% and 97.91%, respectively. In addition, maximum classification accuracies of 98.25% and 99.25% were obtained for the CV and CRV of the ensemble features, respectively. The results indicate that the classification accuracy obtained with the ensemble features was higher than those obtained with the energy and entropy features.
Evaluation of airborne image data for mapping riparian vegetation within the Grand Canyon
Davis, Philip A.; Staid, Matthew I.; Plescia, Jeffrey B.; Johnson, Jeffrey R.
2002-01-01
This study examined various types of remote-sensing data that have been acquired during a 12-month period over a portion of the Colorado River corridor to determine the type of data and conditions for data acquisition that provide the optimum classification results for mapping riparian vegetation. Issues related to vegetation mapping included time of year, number and positions of wavelength bands, and spatial resolution for data acquisition to produce accurate vegetation maps versus cost of data. Image data considered in the study consisted of scanned color-infrared (CIR) film, digital CIR, and digital multispectral data, whose resolutions from 11 cm (photographic film) to 100 cm (multispectral), that were acquired during the Spring, Summer, and Fall seasons in 2000 for five long-term monitoring sites containing riparian vegetation. Results show that digitally acquired data produce higher and more consistent classification accuracies for mapping vegetation units than do film products. The highest accuracies were obtained from nine-band multispectral data; however, a four-band subset of these data, that did not include short-wave infrared bands, produced comparable mapping results. The four-band subset consisted of the wavelength bands 0.52-0.59 µm, 0.59-0.62 µm, 0.67-0.72 µm, and 0.73-0.85 µm. Use of only three of these bands that simulate digital CIR sensors produced accuracies for several vegetation units that were 10% lower than those obtained using the full multispectral data set. Classification tests using band ratios produced lower accuracies than those using band reflectance for scanned film data; a result attributed to the relatively poor radiometric fidelity maintained by the film scanning process, whereas calibrated multispectral data produced similar classification accuracies using band reflectance and band ratios. This suggests that the intrinsic band reflectance of the vegetation is more important than inter-band reflectance differences in attaining high mapping accuracies. These results also indicate that radiometrically calibrated sensors that record a wide range of radiance produce superior results and that such sensors should be used for monitoring purposes. When texture (spatial variance) at near-infrared wavelength is combined with spectral data in classification, accuracy increased most markedly (20-30%) for the highest resolution (11-cm) CIR film data, but decreased in its effect on accuracy in lower-resolution multi-spectral image data; a result observed in previous studies (Franklin and McDermid 1993, Franklin et al. 2000, 2001). While many classification unit accuracies obtained from the 11-cm film CIR band with texture data were in fact higher than those produced using the 100-cm, nine-band multispectral data with texture, the 11-cm film CIR data produced much lower accuracies than the 100-cm multispectral data for the more sparsely populated vegetation units due to saturation of picture elements during the film scanning process in vegetation units with a high proportion of alluvium. Overall classification accuracies obtained from spectral band and texture data range from 36% to 78% for all databases considered, from 57% to 71% for the 11-cm film CIR data, and from 54% to 78% for the 100-cm multispectral data. Classification results obtained from 20-cm film CIR band and texture data, which were produced by applying a Gaussian filter to the 11-cm film CIR data, showed increases in accuracy due to texture that were similar to those observed using the original 11-cm film CIR data. This suggests that data can be collected at the lower resolution and still retain the added power of vegetation texture. Classification accuracies for the riparian vegetation units examined in this study do not appear to be influenced by season of data acquisition, although data acquired under direct sunlight produced higher overall accuracies than data acquired under overcast conditions. The latter observation, in addition to the importance of band reflectance for classification, implies that data should be acquired near summer solstice when sun elevation and reflectance is highest and when shadows cast by steep canyon walls are minimized.
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land Use and Land Cover (LULC) information are significant to observe and evaluate environmental change. LULC classification applying remotely sensed data is a technique popularly employed on a global and local dimension particularly, in urban areas which have diverse land cover types. These are essential components of the urban terrain and ecosystem. In the present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using the high-resolution image. COSMO-SkyMed SAR data was fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for land cover classification using object-based. This paper indicates a comparison between object-based and pixel-based approaches in image fusion. The per-pixel method, support vector machines (SVM) was implemented to the fused image based on Principal Component Analysis (PCA). For the objectbased classification was applied to the fused images to separate land cover classes by using nearest neighbor (NN) classifier. Finally, the accuracy assessment was employed by comparing with the classification of land cover mapping generated from fused image dataset and THAICHOTE image. The object-based data fused COSMO-SkyMed with THAICHOTE images demonstrated the best classification accuracies, well over 85%. As the results, an object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too.
Hao, Pengyu; Wang, Li; Niu, Zheng
2015-01-01
A range of single classifiers have been proposed to classify crop types using time series vegetation indices, and hybrid classifiers are used to improve discriminatory power. Traditional fusion rules use the product of multi-single classifiers, but that strategy cannot integrate the classification output of machine learning classifiers. In this research, the performance of two hybrid strategies, multiple voting (M-voting) and probabilistic fusion (P-fusion), for crop classification using NDVI time series were tested with different training sample sizes at both pixel and object levels, and two representative counties in north Xinjiang were selected as study area. The single classifiers employed in this research included Random Forest (RF), Support Vector Machine (SVM), and See 5 (C 5.0). The results indicated that classification performance improved (increased the mean overall accuracy by 5%~10%, and reduced standard deviation of overall accuracy by around 1%) substantially with the training sample number, and when the training sample size was small (50 or 100 training samples), hybrid classifiers substantially outperformed single classifiers with higher mean overall accuracy (1%~2%). However, when abundant training samples (4,000) were employed, single classifiers could achieve good classification accuracy, and all classifiers obtained similar performances. Additionally, although object-based classification did not improve accuracy, it resulted in greater visual appeal, especially in study areas with a heterogeneous cropping pattern. PMID:26360597
A Unified Classification Framework for FP, DP and CP Data at X-Band in Southern China
NASA Astrophysics Data System (ADS)
Xie, Lei; Zhang, Hong; Li, Hhongzhong; Wang, Chao
2015-04-01
The main objective of this paper is to introduce an unified framework for crop classification in Southern China using data in fully polarimetric (FP), dual-pol (DP) and compact polarimetric (CP) modes. The TerraSAR-X data acquired over the Leizhou Peninsula, South China are used in our experiments. The study site involves four main crops (rice, banana, sugarcane eucalyptus). Through exploring the similarities between data in these three modes, a knowledge-based characteristic space is created and the unified framework is presented. The overall classification accuracies for data in the FP, coherent HH/VV are about 95%, and is about 91% in CP modes, which suggests that the proposed classification scheme is effective and promising. Compared with the Wishart Maximum Likelihood (ML) classifier, the proposed method exhibits higher classification accuracy.
NASA Astrophysics Data System (ADS)
Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.
2017-09-01
Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Rollinvariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, target orientation diversity effect makes PolSAR images understanding and interpretation difficult. Only using the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the followed classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combing both the selected new hidden polarimetric features in rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy with the proposed classification scheme is 94.91 %, while that with the conventional classification scheme is 93.70 %. Moreover, for multi-temporal UAVSAR data, the averaged overall classification accuracy with the proposed classification scheme is up to 97.08 %, which is much higher than the 87.79 % from the conventional classification scheme. Furthermore, for multitemporal PolSAR data, the proposed classification scheme can achieve better robustness. The comparison studies also clearly demonstrate that mining and utilization of hidden polarimetric features and information in the rotation domain can gain the added benefits for PolSAR land cover classification and provide a new vision for PolSAR image interpretation and application.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
A review of supervised object-based land-cover image classification
NASA Astrophysics Data System (ADS)
Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue
2017-08-01
Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.
Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi
2015-01-01
The main purpose of traditional classification algorithms on bioinformatics application is to acquire better classification accuracy. However, these algorithms cannot meet the requirement that minimises the average misclassification cost. In this paper, a new algorithm of cost-sensitive regularised extreme learning machine (CS-RELM) was proposed by using probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy of a group of small sample which higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into CS-RELM algorithm to further reduce the average misclassification cost. By using Colon Tumour dataset and SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, cost-sensitive support vector machine (SVM). The results of experiments show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and made more credible classification decision than others.
Semi-supervised classification tool for DubaiSat-2 multispectral imagery
NASA Astrophysics Data System (ADS)
Al-Mansoori, Saeed
2015-10-01
This paper addresses a semi-supervised classification tool based on a pixel-based approach of the multi-spectral satellite imagery. There are not many studies demonstrating such algorithm for the multispectral images, especially when the image consists of 4 bands (Red, Green, Blue and Near Infrared) as in DubaiSat-2 satellite images. The proposed approach utilizes both unsupervised and supervised classification schemes sequentially to identify four classes in the image, namely, water bodies, vegetation, land (developed and undeveloped areas) and paved areas (i.e. roads). The unsupervised classification concept is applied to identify two classes; water bodies and vegetation, based on a well-known index that uses the distinct wavelengths of visible and near-infrared sunlight that is absorbed and reflected by the plants to identify the classes; this index parameter is called "Normalized Difference Vegetation Index (NDVI)". Afterward, the supervised classification is performed by selecting training homogenous samples for roads and land areas. Here, a precise selection of training samples plays a vital role in the classification accuracy. Post classification is finally performed to enhance the classification accuracy, where the classified image is sieved, clumped and filtered before producing final output. Overall, the supervised classification approach produced higher accuracy than the unsupervised method. This paper shows some current preliminary research results which point out the effectiveness of the proposed technique in a virtual perspective.
Convolutional Neural Network for Histopathological Analysis of Osteosarcoma.
Mishra, Rashika; Daescu, Ovidiu; Leavey, Patrick; Rakheja, Dinesh; Sengupta, Anita
2018-03-01
Pathologists often deal with high complexity and sometimes disagreement over osteosarcoma tumor classification due to cellular heterogeneity in the dataset. Segmentation and classification of histology tissue in H&E stained tumor image datasets is a challenging task because of intra-class variations, inter-class similarity, crowded context, and noisy data. In recent years, deep learning approaches have led to encouraging results in breast cancer and prostate cancer analysis. In this article, we propose convolutional neural network (CNN) as a tool to improve efficiency and accuracy of osteosarcoma tumor classification into tumor classes (viable tumor, necrosis) versus nontumor. The proposed CNN architecture contains eight learned layers: three sets of stacked two convolutional layers interspersed with max pooling layers for feature extraction and two fully connected layers with data augmentation strategies to boost performance. The use of a neural network results in higher accuracy of average 92% for the classification. We compare the proposed architecture with three existing and proven CNN architectures for image classification: AlexNet, LeNet, and VGGNet. We also provide a pipeline to calculate percentage necrosis in a given whole slide image. We conclude that the use of neural networks can assure both high accuracy and efficiency in osteosarcoma classification.
NASA Astrophysics Data System (ADS)
Dou, P.
2017-12-01
Guangzhou has experienced a rapid urbanization period called "small change in three years and big change in five years" since the reform of China, resulting in significant land use/cover changes(LUC). To overcome the disadvantages of single classifier for remote sensing image classification accuracy, a multiple classifier system (MCS) is proposed to improve the quality of remote sensing image classification. The new method combines advantages of different learning algorithms, and achieves higher accuracy (88.12%) than any single classifier did. With the proposed MCS, land use/cover (LUC) on Landsat images from 1987 to 2015 was obtained, and the LUCs were used on three watersheds (Shijing river, Chebei stream, and Shahe stream) to estimate the impact of urbanization on water flood. The results show that with the high accuracy LUC, the uncertainty in flood simulations are reduced effectively (for Shijing river, Chebei stream, and Shahe stream, the uncertainty reduced 15.5%, 17.3% and 19.8% respectively).
Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and objectoriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. ?? 2008 by MDPI.
Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757
Analysis of swallowing sounds using hidden Markov models.
Aboofazeli, Mohammad; Moussavi, Zahra
2008-04-01
In recent years, acoustical analysis of the swallowing mechanism has received considerable attention due to its diagnostic potentials. This paper presents a hidden Markov model (HMM) based method for the swallowing sound segmentation and classification. Swallowing sound signals of 15 healthy and 11 dysphagic subjects were studied. The signals were divided into sequences of 25 ms segments each of which were represented by seven features. The sequences of features were modeled by HMMs. Trained HMMs were used for segmentation of the swallowing sounds into three distinct phases, i.e., initial quiet period, initial discrete sounds (IDS) and bolus transit sounds (BTS). Among the seven features, accuracy of segmentation by the HMM based on multi-scale product of wavelet coefficients was higher than that of the other HMMs and the linear prediction coefficient (LPC)-based HMM showed the weakest performance. In addition, HMMs were used for classification of the swallowing sounds of healthy subjects and dysphagic patients. Classification accuracy of different HMM configurations was investigated. When we increased the number of states of the HMMs from 4 to 8, the classification error gradually decreased. In most cases, classification error for N=9 was higher than that of N=8. Among the seven features used, root mean square (RMS) and waveform fractal dimension (WFD) showed the best performance in the HMM-based classification of swallowing sounds. When the sequences of the features of IDS segment were modeled separately, the accuracy reached up to 85.5%. As a second stage classification, a screening algorithm was used which correctly classified all the subjects but one healthy subject when RMS was used as characteristic feature of the swallowing sounds and the number of states was set to N=8.
NASA Astrophysics Data System (ADS)
Roychowdhury, K.
2016-06-01
Landcover is the easiest detectable indicator of human interventions on land. Urban and peri-urban areas present a complex combination of landcover, which makes classification challenging. This paper assesses the different methods of classifying landcover using dual polarimetric Sentinel-1 data collected during monsoon (July) and winter (December) months of 2015. Four broad landcover classes such as built up areas, water bodies and wetlands, vegetation and open spaces of Kolkata and its surrounding regions were identified. Polarimetric analyses were conducted on Single Look Complex (SLC) data of the region while ground range detected (GRD) data were used for spectral and spatial classification. Unsupervised classification by means of K-Means clustering used backscatter values and was able to identify homogenous landcovers over the study area. The results produced an overall accuracy of less than 50% for both the seasons. Higher classification accuracy (around 70%) was achieved by adding texture variables as inputs along with the backscatter values. However, the accuracy of classification increased significantly with polarimetric analyses. The overall accuracy was around 80% in Wishart H-A-Alpha unsupervised classification. The method was useful in identifying urban areas due to their double-bounce scattering and vegetated areas, which have more random scattering. Normalized Difference Built-up index (NDBI) and Normalized Difference Vegetation Index (NDVI) obtained from Landsat 8 data over the study area were used to verify vegetation and urban classes. The study compares the accuracies of different methods of classifying landcover using medium resolution SAR data in a complex urban area and suggests that polarimetric analyses present the most accurate results for urban and suburban areas.
AVHRR composite period selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Multitemporal satellite image datasets provide valuable information on the phenological characteristics of vegetation, thereby significantly increasing the accuracy of cover type classifications compared to single date classifications. However, the processing of these datasets can become very complex when dealing with multitemporal data combined with multispectral data. Advanced Very High Resolution Radiometer (AVHRR) biweekly composite data are commonly used to classify land cover over large regions. Selecting a subset of these biweekly composite periods may be required to reduce the complexity and cost of land cover mapping. The objective of our research was to evaluate the effect of reducing the number of composite periods and altering the spacing of those composite periods on classification accuracy. Because inter-annual variability can have a major impact on classification results, 5 years of AVHRR data were evaluated. AVHRR biweekly composite images for spectral channels 1-4 (visible, near-infrared and two thermal bands) covering the entire growing season were used to classify 14 cover types over the entire state of Colorado for each of five different years. A supervised classification method was applied to maintain consistent procedures for each case tested. Results indicate that the number of composite periods can be halved-reduced from 14 composite dates to seven composite dates-without significantly reducing overall classification accuracy (80.4% Kappa accuracy for the 14-composite data-set as compared to 80.0% for a seven-composite dataset). At least seven composite periods were required to ensure the classification accuracy was not affected by inter-annual variability due to climate fluctuations. Concentrating more composites near the beginning and end of the growing season, as compared to using evenly spaced time periods, consistently produced slightly higher classification values over the 5 years tested (average Kappa) of 80.3% for the heavy early/late case as compared to 79.0% for the alternate dataset case).
Yu, Xiao; Ding, Enjie; Chen, Chunxu; Liu, Xiaoming; Li, Li
2015-01-01
Because roller element bearings (REBs) failures cause unexpected machinery breakdowns, their fault diagnosis has attracted considerable research attention. Established fault feature extraction methods focus on statistical characteristics of the vibration signal, which is an approach that loses sight of the continuous waveform features. Considering this weakness, this article proposes a novel feature extraction method for frequency bands, named Window Marginal Spectrum Clustering (WMSC) to select salient features from the marginal spectrum of vibration signals by Hilbert–Huang Transform (HHT). In WMSC, a sliding window is used to divide an entire HHT marginal spectrum (HMS) into window spectrums, following which Rand Index (RI) criterion of clustering method is used to evaluate each window. The windows returning higher RI values are selected to construct characteristic frequency bands (CFBs). Next, a hybrid REBs fault diagnosis is constructed, termed by its elements, HHT-WMSC-SVM (support vector machines). The effectiveness of HHT-WMSC-SVM is validated by running series of experiments on REBs defect datasets from the Bearing Data Center of Case Western Reserve University (CWRU). The said test results evidence three major advantages of the novel method. First, the fault classification accuracy of the HHT-WMSC-SVM model is higher than that of HHT-SVM and ST-SVM, which is a method that combines statistical characteristics with SVM. Second, with Gauss white noise added to the original REBs defect dataset, the HHT-WMSC-SVM model maintains high classification accuracy, while the classification accuracy of ST-SVM and HHT-SVM models are significantly reduced. Third, fault classification accuracy by HHT-WMSC-SVM can exceed 95% under a Pmin range of 500–800 and a m range of 50–300 for REBs defect dataset, adding Gauss white noise at Signal Noise Ratio (SNR) = 5. Experimental results indicate that the proposed WMSC method yields a high REBs fault classification accuracy and a good performance in Gauss white noise reduction. PMID:26540059
Yu, Xiao; Ding, Enjie; Chen, Chunxu; Liu, Xiaoming; Li, Li
2015-11-03
Because roller element bearings (REBs) failures cause unexpected machinery breakdowns, their fault diagnosis has attracted considerable research attention. Established fault feature extraction methods focus on statistical characteristics of the vibration signal, which is an approach that loses sight of the continuous waveform features. Considering this weakness, this article proposes a novel feature extraction method for frequency bands, named Window Marginal Spectrum Clustering (WMSC) to select salient features from the marginal spectrum of vibration signals by Hilbert-Huang Transform (HHT). In WMSC, a sliding window is used to divide an entire HHT marginal spectrum (HMS) into window spectrums, following which Rand Index (RI) criterion of clustering method is used to evaluate each window. The windows returning higher RI values are selected to construct characteristic frequency bands (CFBs). Next, a hybrid REBs fault diagnosis is constructed, termed by its elements, HHT-WMSC-SVM (support vector machines). The effectiveness of HHT-WMSC-SVM is validated by running series of experiments on REBs defect datasets from the Bearing Data Center of Case Western Reserve University (CWRU). The said test results evidence three major advantages of the novel method. First, the fault classification accuracy of the HHT-WMSC-SVM model is higher than that of HHT-SVM and ST-SVM, which is a method that combines statistical characteristics with SVM. Second, with Gauss white noise added to the original REBs defect dataset, the HHT-WMSC-SVM model maintains high classification accuracy, while the classification accuracy of ST-SVM and HHT-SVM models are significantly reduced. Third, fault classification accuracy by HHT-WMSC-SVM can exceed 95% under a Pmin range of 500-800 and a m range of 50-300 for REBs defect dataset, adding Gauss white noise at Signal Noise Ratio (SNR) = 5. Experimental results indicate that the proposed WMSC method yields a high REBs fault classification accuracy and a good performance in Gauss white noise reduction.
Subject-Adaptive Real-Time Sleep Stage Classification Based on Conditional Random Field
Luo, Gang; Min, Wanli
2007-01-01
Sleep staging is the pattern recognition task of classifying sleep recordings into sleep stages. This task is one of the most important steps in sleep analysis. It is crucial for the diagnosis and treatment of various sleep disorders, and also relates closely to brain-machine interfaces. We report an automatic, online sleep stager using electroencephalogram (EEG) signal based on a recently-developed statistical pattern recognition method, conditional random field, and novel potential functions that have explicit physical meanings. Using sleep recordings from human subjects, we show that the average classification accuracy of our sleep stager almost approaches the theoretical limit and is about 8% higher than that of existing systems. Moreover, for a new subject snew with limited training data Dnew, we perform subject adaptation to improve classification accuracy. Our idea is to use the knowledge learned from old subjects to obtain from Dnew a regulated estimate of CRF’s parameters. Using sleep recordings from human subjects, we show that even without any Dnew, our sleep stager can achieve an average classification accuracy of 70% on snew. This accuracy increases with the size of Dnew and eventually becomes close to the theoretical limit. PMID:18693884
Zbroch, Tomasz; Knapp, Paweł Grzegorz; Knapp, Piotr Andrzej
2007-09-01
Increasing knowledge concerning carcinogenesis within cervical epithelium has forced us to make continues modifications of cytology classification of the cervical smears. Eventually, new descriptions of the submicroscopic cytomorphological abnormalities have enabled the implementation of Bethesda System which was meant to take place of the former Papanicolaou classification although temporarily both are sometimes used simultaneously. The aim of this study was to compare results of these two classification systems in the aspect of diagnostic accuracy verified by further tests of the diagnostic algorithm for the cervical lesion evaluation. The study was conducted in the group of women selected from general population, the criteria being the place of living and cervical cancer age risk group, in the consecutive periods of mass screening in Podlaski region. The performed diagnostic tests have been based on the commonly used algorithm, as well as identical laboratory and methodological conditions. Performed assessment revealed comparable diagnostic accuracy of both analyzing classifications, verified by histological examination, although with marked higher specificity for dysplastic lesions with decreased number of HSIL results and increased diagnosis of LSILs. Higher number of performed colposcopies and biopsies were an additional consequence of TBS classification. Results based on Bethesda System made it possible to find the sources and reasons of abnormalities with much greater precision, which enabled causing agent treatment. Two evaluated cytology classification systems, although not much different, depicted higher potential of TBS and better, more effective communication between cytology laboratory and gynecologist, making reasonable implementation of The Bethesda System in the daily cytology screening work.
Research on cardiovascular disease prediction based on distance metric learning
NASA Astrophysics Data System (ADS)
Ni, Zhuang; Liu, Kui; Kang, Guixia
2018-04-01
Distance metric learning algorithm has been widely applied to medical diagnosis and exhibited its strengths in classification problems. The k-nearest neighbour (KNN) is an efficient method which treats each feature equally. The large margin nearest neighbour classification (LMNN) improves the accuracy of KNN by learning a global distance metric, which did not consider the locality of data distributions. In this paper, we propose a new distance metric algorithm adopting cosine metric and LMNN named COS-SUBLMNN which takes more care about local feature of data to overcome the shortage of LMNN and improve the classification accuracy. The proposed methodology is verified on CVDs patient vector derived from real-world medical data. The Experimental results show that our method provides higher accuracy than KNN and LMNN did, which demonstrates the effectiveness of the Risk predictive model of CVDs based on COS-SUBLMNN.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network’s initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data. PMID:27304987
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer assisted malaria parasite characterization and classification using machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker controlled watershed transformation and subsequently total ninety six features describing shape-size and texture of erythrocytes are extracted in respect to the parasitemia infected versus non-infected cells. Ninety four features are found to be statistically significant in discriminating six classes. Here a feature selection-cum-classification scheme has been devised by combining F-statistic, statistical learning techniques i.e., Bayesian learning and support vector machine (SVM) in order to provide the higher classification accuracy using best set of discriminating features. Results show that Bayesian approach provides the highest accuracy i.e., 84% for malaria classification by selecting 19 most significant features while SVM provides highest accuracy i.e., 83.5% with 9 most significant features. Finally, the performance of these two classifiers under feature selection framework has been compared toward malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Melville, Bethany; Lucieer, Arko; Aryal, Jagannath
2018-04-01
This paper presents a random forest classification approach for identifying and mapping three types of lowland native grassland communities found in the Tasmanian Midlands region. Due to the high conservation priority assigned to these communities, there has been an increasing need to identify appropriate datasets that can be used to derive accurate and frequently updateable maps of community extent. Therefore, this paper proposes a method employing repeat classification and statistical significance testing as a means of identifying the most appropriate dataset for mapping these communities. Two datasets were acquired and analysed; a Landsat ETM+ scene, and a WorldView-2 scene, both from 2010. Training and validation data were randomly subset using a k-fold (k = 50) approach from a pre-existing field dataset. Poa labillardierei, Themeda triandra and lowland native grassland complex communities were identified in addition to dry woodland and agriculture. For each subset of randomly allocated points, a random forest model was trained based on each dataset, and then used to classify the corresponding imagery. Validation was performed using the reciprocal points from the independent subset that had not been used to train the model. Final training and classification accuracies were reported as per class means for each satellite dataset. Analysis of Variance (ANOVA) was undertaken to determine whether classification accuracy differed between the two datasets, as well as between classifications. Results showed mean class accuracies between 54% and 87%. Class accuracy only differed significantly between datasets for the dry woodland and Themeda grassland classes, with the WorldView-2 dataset showing higher mean classification accuracies. The results of this study indicate that remote sensing is a viable method for the identification of lowland native grassland communities in the Tasmanian Midlands, and that repeat classification and statistical significant testing can be used to identify optimal datasets for vegetation community mapping.
An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices
ERIC Educational Resources Information Center
Wyse, Adam E.; Hao, Shiqi
2012-01-01
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Devos, Olivier; Downey, Gerard; Duponchel, Ludovic
2014-04-01
Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
Multidate mapping of mosquito habitat. [Nebraska, South Dakota
NASA Technical Reports Server (NTRS)
Woodzick, T. L.; Maxwell, E. L.
1977-01-01
LANDSAT data from three overpasses formed the data base for a multidate classification of 15 ground cover categories in the margins of Lewis and Clark Lake, a fresh water impoundment between South Dakota and Nebraska. When scaled to match topographic maps of the area, the ground cover classification maps were used as a general indicator of potential mosquito-breeding habitat by distinguishing productive wetlands areas from nonproductive nonwetlands areas. The 12 channel multidate classification was found to have an accuracy 23% higher than the average of the three single date 4 channel classifications.
Estimation of different data compositions for early-season crop type classification.
Hao, Pengyu; Wu, Mingquan; Niu, Zheng; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are an important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two studies regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer's accuracies (PAs) and user's accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study.
Estimation of different data compositions for early-season crop type classification
Wu, Mingquan; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are an important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two studies regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer’s accuracies (PAs) and user’s accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study. PMID:29868265
NASA Astrophysics Data System (ADS)
Erener, A.
2013-04-01
Automatic extraction of urban features from high resolution satellite images is one of the main applications in remote sensing. It is useful for wide scale applications, namely: urban planning, urban mapping, disaster management, GIS (geographic information systems) updating, and military target detection. One common approach to detecting urban features from high resolution images is to use automatic classification methods. This paper has four main objectives with respect to detecting buildings. The first objective is to compare the performance of the most notable supervised classification algorithms, including the maximum likelihood classifier (MLC) and the support vector machine (SVM). In this experiment the primary consideration is the impact of kernel configuration on the performance of the SVM. The second objective of the study is to explore the suitability of integrating additional bands, namely first principal component (1st PC) and the intensity image, for original data for multi classification approaches. The performance evaluation of classification results is done using two different accuracy assessment methods: pixel based and object based approaches, which reflect the third aim of the study. The objective here is to demonstrate the differences in the evaluation of accuracies of classification methods. Considering consistency, the same set of ground truth data which is produced by labeling the building boundaries in the GIS environment is used for accuracy assessment. Lastly, the fourth aim is to experimentally evaluate variation in the accuracy of classifiers for six different real situations in order to identify the impact of spatial and spectral diversity on results. The method is applied to Quickbird images for various urban complexity levels, extending from simple to complex urban patterns. The simple surface type includes a regular urban area with low density and systematic buildings with brick rooftops. The complex surface type involves almost all kinds of challenges, such as high dense build up areas, regions with bare soil, and small and large buildings with different rooftops, such as concrete, brick, and metal. Using the pixel based accuracy assessment it was shown that the percent building detection (PBD) and quality percent (QP) of the MLC and SVM depend on the complexity and texture variation of the region. Generally, PBD values range between 70% and 90% for the MLC and SVM, respectively. No substantial improvements were observed when the SVM and MLC classifications were developed by the addition of more variables, instead of the use of only four bands. In the evaluation of object based accuracy assessment, it was demonstrated that while MLC and SVM provide higher rates of correct detection, they also provide higher rates of false alarms.
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing
Wen, Tailai; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-01
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors’ responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose’s classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods. PMID:29382146
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing.
Wen, Tailai; Yan, Jia; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-29
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors' responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose's classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods.
NASA Technical Reports Server (NTRS)
Craig, R. G. (Principal Investigator)
1983-01-01
Richmond, Virginia and Denver, Colorado were study sites in an effort to determine the effect of autocorrelation on the accuracy of a parallelopiped classifier of LANDSAT digital data. The autocorrelation was assumed to decay to insignificant levels when sampled at distances of at least ten pixels. Spectral themes developed using blocks of adjacent pixels, and using groups of pixels spaced at least 10 pixels apart were used. Effects of geometric distortions were minimized by using only pixels from the interiors of land cover sections. Accuracy was evaluated for three classes; agriculture, residential and "all other"; both type 1 and type 2 errors were evaluated by means of overall classification accuracy. All classes give comparable results. Accuracy is approximately the same in both techniques; however, the variance in accuracy is significantly higher using the themes developed from autocorrelated data. The vectors of mean spectral response were nearly identical regardless of sampling method used. The estimated variances were much larger when using autocorrelated pixels.
A hybrid three-class brain-computer interface system utilizing SSSEPs and transient ERPs
NASA Astrophysics Data System (ADS)
Breitwieser, Christian; Pokorny, Christoph; Müller-Putz, Gernot R.
2016-12-01
Objective. This paper investigates the fusion of steady-state somatosensory evoked potentials (SSSEPs) and transient event-related potentials (tERPs), evoked through tactile simulation on the left and right-hand fingertips, in a three-class EEG based hybrid brain-computer interface. It was hypothesized, that fusing the input signals leads to higher classification rates than classifying tERP and SSSEP individually. Approach. Fourteen subjects participated in the studies, consisting of a screening paradigm to determine person dependent resonance-like frequencies and a subsequent online paradigm. The whole setup of the BCI system was based on open interfaces, following suggestions for a common implementation platform. During the online experiment, subjects were instructed to focus their attention on the stimulated fingertips as indicated by a visual cue. The recorded data were classified during runtime using a multi-class shrinkage LDA classifier and the outputs were fused together applying a posterior probability based fusion. Data were further analyzed offline, involving a combined classification of SSSEP and tERP features as a second fusion principle. The final results were tested for statistical significance applying a repeated measures ANOVA. Main results. A significant classification increase was achieved when fusing the results with a combined classification compared to performing an individual classification. Furthermore, the SSSEP classifier was significantly better in detecting a non-control state, whereas the tERP classifier was significantly better in detecting control states. Subjects who had a higher relative band power increase during the screening session also achieved significantly higher classification results than subjects with lower relative band power increase. Significance. It could be shown that utilizing SSSEP and tERP for hBCIs increases the classification accuracy and also that tERP and SSSEP are not classifying control- and non-control states with the same level of accuracy.
LDA boost classification: boosting by topics
NASA Astrophysics Data System (ADS)
Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li
2012-12-01
AdaBoost is an efficacious classification algorithm especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems; so it affects classification performance seriously. This article proposed a novel classification algorithm called LDABoost based on boosting ideology which uses Latent Dirichlet Allocation (LDA) to modeling the feature space. Instead of using words or phrase, LDABoost use latent topics as the features. In this way, the feature dimension is significantly reduced. Improved Naïve Bayes (NB) is designed as the weaker classifier which keeps the efficiency advantage of classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration in this article is proposed for improving the accuracy by integrating weak classifiers into strong classifier in a more rational way. Mutual Information is used as metrics of weights allocation. The voting information and the categorization decision made by basis classifiers are fully utilized for generating the strong classifier. Experimental results reveals LDABoost making categorization in a low-dimensional space, it has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than different versions of AdaBoost, TC algorithms based on support vector machine and Neural Networks.
Combining High Spatial Resolution Optical and LIDAR Data for Object-Based Image Classification
NASA Astrophysics Data System (ADS)
Li, R.; Zhang, T.; Geng, R.; Wang, L.
2018-04-01
In order to classify high spatial resolution images more accurately, in this research, a hierarchical rule-based object-based classification framework was developed based on a high-resolution image with airborne Light Detection and Ranging (LiDAR) data. The eCognition software is employed to conduct the whole process. In detail, firstly, the FBSP optimizer (Fuzzy-based Segmentation Parameter) is used to obtain the optimal scale parameters for different land cover types. Then, using the segmented regions as basic units, the classification rules for various land cover types are established according to the spectral, morphological and texture features extracted from the optical images, and the height feature from LiDAR respectively. Thirdly, the object classification results are evaluated by using the confusion matrix, overall accuracy and Kappa coefficients. As a result, a method using the combination of an aerial image and the airborne Lidar data shows higher accuracy.
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech based data in the classification of Parkinson disease (PD) has been shown to provide an effect, non-invasive mode of classification in recent years. Thus, there has been an increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classifications is to reduce noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effect, the ability to invoke instance selection has been seldomly examined. In this study, a PD classification algorithm was proposed and examined that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied for selecting optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to generate trained samples from the collected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. This proposed method was examined using a more recently deposited public datasets and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the highest degree of improved classification accuracy (29.44%) compared with the other algorithm that was examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit a higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method could improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
Estimating workload using EEG spectral power and ERPs in the n-back task
NASA Astrophysics Data System (ADS)
Brouwer, Anne-Marie; Hogervorst, Maarten A.; van Erp, Jan B. F.; Heffelaar, Tobias; Zimmerman, Patrick H.; Oostenveld, Robert
2012-08-01
Previous studies indicate that both electroencephalogram (EEG) spectral power (in particular the alpha and theta band) and event-related potentials (ERPs) (in particular the P300) can be used as a measure of mental work or memory load. We compare their ability to estimate workload level in a well-controlled task. In addition, we combine both types of measures in a single classification model to examine whether this results in higher classification accuracy than either one alone. Participants watched a sequence of visually presented letters and indicated whether or not the current letter was the same as the one (n instances) before. Workload was varied by varying n. We developed different classification models using ERP features, frequency power features or a combination (fusion). Training and testing of the models simulated an online workload estimation situation. All our ERP, power and fusion models provide classification accuracies between 80% and 90% when distinguishing between the highest and the lowest workload condition after 2 min. For 32 out of 35 participants, classification was significantly higher than chance level after 2.5 s (or one letter) as estimated by the fusion model. Differences between the models are rather small, though the fusion model performs better than the other models when only short data segments are available for estimating workload.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
AVNM: A Voting based Novel Mathematical Rule for Image Classification.
Vidyarthi, Ankit; Mittal, Namita
2016-12-01
In machine learning, the accuracy of the system depends upon classification result. Classification accuracy plays an imperative role in various domains. Non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Besides its easiness, simplicity and effectiveness characteristics, the main problem associated with KNN classifier is the selection of a number of nearest neighbors i.e. "k" for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by the prescribed problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for unidentified sample. AVNM is free from any initial selection of predefined variable and neighbor selection as found in KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from UCI database and one manually created dataset. The experimental result shows that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimentation results based on confusion matrix accuracy parameter proves higher accuracy value with AVNM rule. The proposed AVNM rule is based on sample space reduction mechanism for identification of an optimal number of nearest neighbor selections. AVNM results in better classification accuracy and minimum error rate as compared with the state-of-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbor selection and improves classification rate for UCI dataset and manually created dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Activity classification using the GENEA: optimum sampling frequency and number of axes.
Zhang, Shaoyan; Murray, Peter; Zillmer, Ruediger; Eston, Roger G; Catt, Michael; Rowlands, Alex V
2012-11-01
The GENEA shows high accuracy for classification of sedentary, household, walking, and running activities when sampling at 80 Hz on three axes. It is not known whether it is possible to decrease this sampling frequency and/or the number of axes without detriment to classification accuracy. The purpose of this study was to compare the classification rate of activities on the basis of data from a single axis, two axes, and three axes, with sampling rates ranging from 5 to 80 Hz. Sixty participants (age, 49.4 yr (6.5 yr); BMI, 24.6 kg·m (3.4 kg·m)) completed 10-12 semistructured activities in the laboratory and outdoor environment while wearing a GENEA accelerometer on the right wrist. We analyzed data from single axis, dual axes, and three axes at sampling rates of 5, 10, 20, 40, and 80 Hz. Mathematical models based on features extracted from mean, SD, fast Fourier transform, and wavelet decomposition were built, which combined one of the numbers of axes with one of the sampling rates to classify activities into sedentary, household, walking, and running. Classification accuracy was high irrespective of the number of axes for data collected at 80 Hz (96.93% ± 0.97%), 40 Hz (97.4% ± 0.73%), 20 Hz (96.86% ± 1.12%), and 10 Hz (97.01% ± 1.01%) but dropped for data collected at 5 Hz (94.98% ± 1.36%). Sampling frequencies >10 Hz and/or more than one axis of measurement were not associated with greater classification accuracy. Lower sampling rates and measurement of a single axis would result in a lower data load, longer battery life, and higher efficiency of data processing. Further research should investigate whether a lower sampling rate and a single axis affects classification accuracy when considering a wider range of activities.
NASA Technical Reports Server (NTRS)
Spruce, J. P.; Smoot, James; Ellis, Jean; Hilbert, Kent; Swann, Roberta
2012-01-01
This paper discusses the development and implementation of a geospatial data processing method and multi-decadal Landsat time series for computing general coastal U.S. land-use and land-cover (LULC) classifications and change products consisting of seven classes (water, barren, upland herbaceous, non-woody wetland, woody upland, woody wetland, and urban). Use of this approach extends the observational period of the NOAA-generated Coastal Change and Analysis Program (C-CAP) products by almost two decades, assuming the availability of one cloud free Landsat scene from any season for each targeted year. The Mobile Bay region in Alabama was used as a study area to develop, demonstrate, and validate the method that was applied to derive LULC products for nine dates at approximate five year intervals across a 34-year time span, using single dates of data for each classification in which forests were either leaf-on, leaf-off, or mixed senescent conditions. Classifications were computed and refined using decision rules in conjunction with unsupervised classification of Landsat data and C-CAP value-added products. Each classification's overall accuracy was assessed by comparing stratified random locations to available reference data, including higher spatial resolution satellite and aerial imagery, field survey data, and raw Landsat RGBs. Overall classification accuracies ranged from 83 to 91% with overall Kappa statistics ranging from 0.78 to 0.89. The accuracies are comparable to those from similar, generalized LULC products derived from C-CAP data. The Landsat MSS-based LULC product accuracies are similar to those from Landsat TM or ETM+ data. Accurate classifications were computed for all nine dates, yielding effective results regardless of season. This classification method yielded products that were used to compute LULC change products via additive GIS overlay techniques.
Study on bayes discriminant analysis of EEG data.
Shi, Yuan; He, DanDan; Qin, Fang
2014-01-01
In this paper, we have done Bayes Discriminant analysis to EEG data of experiment objects which are recorded impersonally come up with a relatively accurate method used in feature extraction and classification decisions. In accordance with the strength of α wave, the head electrodes are divided into four species. In use of part of 21 electrodes EEG data of 63 people, we have done Bayes Discriminant analysis to EEG data of six objects. Results In use of part of EEG data of 63 people, we have done Bayes Discriminant analysis, the electrode classification accuracy rates is 64.4%. Bayes Discriminant has higher prediction accuracy, EEG features (mainly αwave) extract more accurate. Bayes Discriminant would be better applied to the feature extraction and classification decisions of EEG data.
NASA Astrophysics Data System (ADS)
Chan, Heang-Ping; Helvie, Mark A.; Petrick, Nicholas; Sahiner, Berkman; Adler, Dorit D.; Blane, Caroline E.; Joynt, Lynn K.; Paramagul, Chintana; Roubidoux, Marilyn A.; Wilson, Todd E.; Hadjiiski, Lubomir M.; Goodsitt, Mitchell M.
1999-05-01
A receiver operating characteristic (ROC) experiment was conducted to evaluate the effects of pixel size on the characterization of mammographic microcalcifications. Digital mammograms were obtained by digitizing screen-film mammograms with a laser film scanner. One hundred twelve two-view mammograms with biopsy-proven microcalcifications were digitized at a pixel size of 35 micrometer X 35 micrometer. A region of interest (ROI) containing the microcalcifications was extracted from each image. ROI images with pixel sizes of 70 micrometers, 105 micrometers, and 140 micrometers were derived from the ROI of 35 micrometer pixel size by averaging 2 X 2, 3 X 3, and 4 X 4 neighboring pixels, respectively. The ROI images were printed on film with a laser imager. Seven MQSA-approved radiologists participated as observers. The likelihood of malignancy of the microcalcifications was rated on a 10-point confidence rating scale and analyzed with ROC methodology. The classification accuracy was quantified by the area, Az, under the ROC curve. The statistical significance of the differences in the Az values for different pixel sizes was estimated with the Dorfman-Berbaum-Metz (DBM) method for multi-reader, multi-case ROC data. It was found that five of the seven radiologists demonstrated a higher classification accuracy with the 70 micrometer or 105 micrometer images. The average Az also showed a higher classification accuracy in the range of 70 to 105 micrometer pixel size. However, the differences in A(subscript z/ between different pixel sizes did not achieve statistical significance. The low specificity of image features of microcalcifications an the large interobserver and intraobserver variabilities may have contributed to the relatively weak dependence of classification accuracy on pixel size.
D Land Cover Classification Based on Multispectral LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
Multispectral Lidar System can emit simultaneous laser pulses at the different wavelengths. The reflected multispectral energy is captured through a receiver of the sensor, and the return signal together with the position and orientation information of sensor is recorded. These recorded data are solved with GNSS/IMU data for further post-processing, forming high density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, Optech Titan system is capable of collecting point clouds data from all three channels at 532nm visible (Green), at 1064 nm near infrared (NIR) and at 1550nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach to only use multispectral Lidar point clouds datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories by using the customized features of classification indexes and a combination the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted via comparing random reference samples points from google imagery tiles with the classification results. The classification results show higher overall accuracy for most of the land cover types. Over 90% of overall accuracy is achieved via using multispectral Lidar point clouds for 3D land cover classification.
Mapping Mangrove Density from Rapideye Data in Central America
NASA Astrophysics Data System (ADS)
Son, Nguyen-Thanh; Chen, Chi-Farn; Chen, Cheng-Ru
2017-06-01
Mangrove forests provide a wide range of socioeconomic and ecological services for coastal communities. Extensive aquaculture development of mangrove waters in many developing countries has constantly ignored services of mangrove ecosystems, leading to unintended environmental consequences. Monitoring the current status and distribution of mangrove forests is deemed important for evaluating forest management strategies. This study aims to delineate the density distribution of mangrove forests in the Gulf of Fonseca, Central America with Rapideye data using the support vector machines (SVM). The data collected in 2012 for density classification of mangrove forests were processed based on four different band combination schemes: scheme-1 (bands 1-3, 5 excluding the red-edge band 4), scheme-2 (bands 1-5), scheme-3 (bands 1-3, 5 incorporating with the normalized difference vegetation index, NDVI), and scheme-4 (bands 1-3, 5 incorporating with the normalized difference red-edge index, NDRI). We also hypothesized if the obvious contribution of Rapideye red-edge band could improve the classification results. Three main steps of data processing were employed: (1), data pre-processing, (2) image classification, and (3) accuracy assessment to evaluate the contribution of red-edge band in terms of the accuracy of classification results across these four schemes. The classification maps compared with the ground reference data indicated the slightly higher accuracy level observed for schemes 2 and 4. The overall accuracies and Kappa coefficients were 97% and 0.95 for scheme-2 and 96.9% and 0.95 for scheme-4, respectively.
A hybrid sensing approach for pure and adulterated honey classification.
Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar
2012-10-17
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach able to distinct pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.
NASA Astrophysics Data System (ADS)
Gibril, Mohamed Barakat A.; Idrees, Mohammed Oludare; Yao, Kouame; Shafri, Helmi Zulhaidi Mohd
2018-01-01
The growing use of optimization for geographic object-based image analysis and the possibility to derive a wide range of information about the image in textual form makes machine learning (data mining) a versatile tool for information extraction from multiple data sources. This paper presents application of data mining for land-cover classification by fusing SPOT-6, RADARSAT-2, and derived dataset. First, the images and other derived indices (normalized difference vegetation index, normalized difference water index, and soil adjusted vegetation index) were combined and subjected to segmentation process with optimal segmentation parameters obtained using combination of spatial and Taguchi statistical optimization. The image objects, which carry all the attributes of the input datasets, were extracted and related to the target land-cover classes through data mining algorithms (decision tree) for classification. To evaluate the performance, the result was compared with two nonparametric classifiers: support vector machine (SVM) and random forest (RF). Furthermore, the decision tree classification result was evaluated against six unoptimized trials segmented using arbitrary parameter combinations. The result shows that the optimized process produces better land-use land-cover classification with overall classification accuracy of 91.79%, 87.25%, and 88.69% for SVM and RF, respectively, while the results of the six unoptimized classifications yield overall accuracy between 84.44% and 88.08%. Higher accuracy of the optimized data mining classification approach compared to the unoptimized results indicates that the optimization process has significant impact on the classification quality.
Habeck, C; Gazes, Y; Razlighi, Q; Steffener, J; Brickman, A; Barulli, D; Salthouse, T; Stern, Y
2016-01-15
Analyses of large test batteries administered to individuals ranging from young to old have consistently yielded a set of latent variables representing reference abilities (RAs) that capture the majority of the variance in age-related cognitive change: Episodic Memory, Fluid Reasoning, Perceptual Processing Speed, and Vocabulary. In a previous paper (Stern et al., 2014), we introduced the Reference Ability Neural Network Study, which administers 12 cognitive neuroimaging tasks (3 for each RA) to healthy adults age 20-80 in order to derive unique neural networks underlying these 4 RAs and investigate how these networks may be affected by aging. We used a multivariate approach, linear indicator regression, to derive a unique covariance pattern or Reference Ability Neural Network (RANN) for each of the 4 RAs. The RANNs were derived from the neural task data of 64 younger adults of age 30 and below. We then prospectively applied the RANNs to fMRI data from the remaining sample of 227 adults of age 31 and above in order to classify each subject-task map into one of the 4 possible reference domains. Overall classification accuracy across subjects in the sample age 31 and above was 0.80±0.18. Classification accuracy by RA domain was also good, but variable; memory: 0.72±0.32; reasoning: 0.75±0.35; speed: 0.79±0.31; vocabulary: 0.94±0.16. Classification accuracy was not associated with cross-sectional age, suggesting that these networks, and their specificity to the respective reference domain, might remain intact throughout the age range. Higher mean brain volume was correlated with increased overall classification accuracy; better overall performance on the tasks in the scanner was also associated with classification accuracy. For the RANN network scores, we observed for each RANN that a higher score was associated with a higher corresponding classification accuracy for that reference ability. Despite the absence of behavioral performance information in the derivation of these networks, we also observed some brain-behavioral correlations, notably for the fluid-reasoning network whose network score correlated with performance on the memory and fluid-reasoning tasks. While age did not influence the expression of this RANN, the slope of the association between network score and fluid-reasoning performance was negatively associated with higher ages. These results provide support for the hypothesis that a set of specific, age-invariant neural networks underlies these four RAs, and that these networks maintain their cognitive specificity and level of intensity across age. Activation common to all 12 tasks was identified as another activation pattern resulting from a mean-contrast Partial-Least-Squares technique. This common pattern did show associations with age and some subject demographics for some of the reference domains, lending support to the overall conclusion that aspects of neural processing that are specific to any cognitive reference ability stay constant across age, while aspects that are common to all reference abilities differ across age. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Dondurur, Mehmet
The primary objective of this study was to determine the degree to which modern SAR systems can be used to obtain information about the Earth's vegetative resources. Information obtainable from microwave synthetic aperture radar (SAR) data was compared with that obtainable from LANDSAT-TM and SPOT data. Three hypotheses were tested: (a) Classification of land cover/use from SAR data can be accomplished on a pixel-by-pixel basis with the same overall accuracy as from LANDSAT-TM and SPOT data. (b) Classification accuracy for individual land cover/use classes will differ between sensors. (c) Combining information derived from optical and SAR data into an integrated monitoring system will improve overall and individual land cover/use class accuracies. The study was conducted with three data sets for the Sleeping Bear Dunes test site in the northwestern part of Michigan's lower peninsula, including an October 1982 LANDSAT-TM scene, a June 1989 SPOT scene and C-, L- and P-Band radar data from the Jet Propulsion Laboratory AIRSAR. Reference data were derived from the Michigan Resource Information System (MIRIS) and available color infrared aerial photos. Classification and rectification of data sets were done using ERDAS Image Processing Programs. Classification algorithms included Maximum Likelihood, Mahalanobis Distance, Minimum Spectral Distance, ISODATA, Parallelepiped, and Sequential Cluster Analysis. Classified images were rectified as necessary so that all were at the same scale and oriented north-up. Results were analyzed with contingency tables and percent correctly classified (PCC) and Cohen's Kappa (CK) as accuracy indices using CSLANT and ImagePro programs developed for this study. Accuracy analyses were based upon a 1.4 by 6.5 km area with its long axis east-west. Reference data for this subscene total 55,770 15 by 15 m pixels with sixteen cover types, including seven level III forest classes, three level III urban classes, two level II range classes, two water classes, one wetland class and one agriculture class. An initial analysis was made without correcting the 1978 MIRIS reference data to the different dates of the TM, SPOT and SAR data sets. In this analysis, highest overall classification accuracy (PCC) was 87% with the TM data set, with both SPOT and C-Band SAR at 85%, a difference statistically significant at the 0.05 level. When the reference data were corrected for land cover change between 1978 and 1991, classification accuracy with the C-Band SAR data increased to 87%. Classification accuracy differed from sensor to sensor for individual land cover classes, Combining sensors into hypothetical multi-sensor systems resulted in higher accuracies than for any single sensor. Combining LANDSAT -TM and C-Band SAR yielded an overall classification accuracy (PCC) of 92%. The results of this study indicate that C-Band SAR data provide an acceptable substitute for LANDSAT-TM or SPOT data when land cover information is desired of areas where cloud cover obscures the terrain. Even better results can be obtained by integrating TM and C-Band SAR data into a multi-sensor system.
Sabr, Abutaleb; Moeinaddini, Mazaher; Azarnivand, Hossein; Guinot, Benjamin
2016-12-01
In the recent years, dust storms originating from local abandoned agricultural lands have increasingly impacted Tehran and Karaj air quality. Designing and implementing mitigation plans are necessary to study land use/land cover change (LUCC). Land use/cover classification is particularly relevant in arid areas. This study aimed to map land use/cover by pixel- and object-based image classification methods, analyse landscape fragmentation and determine the effects of two different classification methods on landscape metrics. The same sets of ground data were used for both classification methods. Because accuracy of classification plays a key role in better understanding LUCC, both methods were employed. Land use/cover maps of the southwest area of Tehran city for the years 1985, 2000 and 2014 were obtained from Landsat digital images and classified into three categories: built-up, agricultural and barren lands. The results of our LUCC analysis showed that the most important changes in built-up agricultural land categories were observed in zone B (Shahriar, Robat Karim and Eslamshahr) between 1985 and 2014. The landscape metrics obtained for all categories pictured high landscape fragmentation in the study area. Despite no significant difference was evidenced between the two classification methods, the object-based classification led to an overall higher accuracy than using the pixel-based classification. In particular, the accuracy of the built-up category showed a marked increase. In addition, both methods showed similar trends in fragmentation metrics. One of the reasons is that the object-based classification is able to identify buildings, impervious surface and roads in dense urban areas, which produced more accurate maps.
Dyrba, Martin; Barkhof, Frederik; Fellgiebel, Andreas; Filippi, Massimo; Hausner, Lucrezia; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J
2015-01-01
Alzheimer's disease (AD) patients show early changes in white matter (WM) structural integrity. We studied the use of diffusion tensor imaging (DTI) in assessing WM alterations in the predementia stage of mild cognitive impairment (MCI). We applied a Support Vector Machine (SVM) classifier to DTI and volumetric magnetic resonance imaging data from 35 amyloid-β42 negative MCI subjects (MCI-Aβ42-), 35 positive MCI subjects (MCI-Aβ42+), and 25 healthy controls (HC) retrieved from the European DTI Study on Dementia. The SVM was applied to DTI-derived fractional anisotropy, mean diffusivity (MD), and mode of anisotropy (MO) maps. For comparison, we studied classification based on gray matter (GM) and WM volume. We obtained accuracies of up to 68% for MO and 63% for GM volume when it came to distinguishing between MCI-Aβ42- and MCI-Aβ42+. When it came to separating MCI-Aβ42+ from HC we achieved an accuracy of up to 77% for MD and a significantly lower accuracy of 68% for GM volume. The accuracy of multimodal classification was not higher than the accuracy of the best single modality. Our results suggest that DTI data provide better prediction accuracy than GM volume in predementia AD. Copyright © 2015 by the American Society of Neuroimaging.
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
With the rapid developments of the sensor technology, high spatial resolution imagery and airborne Lidar point clouds can be captured nowadays, which make classification, extraction, evaluation and analysis of a broad range of object features available. High resolution imagery, Lidar dataset and parcel map can be widely used for classification as information carriers. Therefore, refinement of objects classification is made possible for the urban land cover. The paper presents an approach to object based image analysis (OBIA) combing high spatial resolution imagery and airborne Lidar point clouds. The advanced workflow for urban land cover is designed with four components. Firstly, colour-infrared TrueOrtho photo and laser point clouds were pre-processed to derive the parcel map of water bodies and nDSM respectively. Secondly, image objects are created via multi-resolution image segmentation integrating scale parameter, the colour and shape properties with compactness criterion. Image can be subdivided into separate object regions. Thirdly, image objects classification is performed on the basis of segmentation and a rule set of knowledge decision tree. These objects imagery are classified into six classes such as water bodies, low vegetation/grass, tree, low building, high building and road. Finally, in order to assess the validity of the classification results for six classes, accuracy assessment is performed through comparing randomly distributed reference points of TrueOrtho imagery with the classification results, forming the confusion matrix and calculating overall accuracy and Kappa coefficient. The study area focuses on test site Vaihingen/Enz and a patch of test datasets comes from the benchmark of ISPRS WG III/4 test project. The classification results show higher overall accuracy for most types of urban land cover. Overall accuracy is 89.5% and Kappa coefficient equals to 0.865. The OBIA approach provides an effective and convenient way to combine high resolution imagery and Lidar ancillary data for classification of urban land cover.
Kim, Junghoe; Calhoun, Vince D.; Shim, Eunsoo; Lee, Jong-Hwan
2015-01-01
Functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging data are commonly employed to study neuropsychiatric conditions by using pattern classifiers such as the support vector machine (SVM). Meanwhile, a deep neural network (DNN) with multiple hidden layers has shown its ability to systematically extract lower-to-higher level information of image and speech data from lower-to-higher hidden layers, markedly enhancing classification accuracy. The objective of this study was to adopt the DNN for whole-brain resting-state FC pattern classification of schizophrenia (SZ) patients vs. healthy controls (HCs) and identification of aberrant FC patterns associated with SZ. We hypothesized that the lower-to-higher level features learned via the DNN would significantly enhance the classification accuracy, and proposed an adaptive learning algorithm to explicitly control the weight sparsity in each hidden layer via L1-norm regularization. Furthermore, the weights were initialized via stacked autoencoder based pre-training to further improve the classification performance. Classification accuracy was systematically evaluated as a function of (1) the number of hidden layers/nodes, (2) the use of L1-norm regularization, (3) the use of the pre-training, (4) the use of framewise displacement (FD) removal, and (5) the use of anatomical/functional parcellation. Using FC patterns from anatomically parcellated regions without FD removal, an error rate of 14.2% was achieved by employing three hidden layers and 50 hidden nodes with both L1-norm regularization and pre-training, which was substantially lower than the error rate from the SVM (22.3%). Moreover, the trained DNN weights (i.e., the learned features) were found to represent the hierarchical organization of aberrant FC patterns in SZ compared with HC. Specifically, pairs of nodes extracted from the lower hidden layer represented sparse FC patterns implicated in SZ, which was quantified by using kurtosis/modularity measures and features from the higher hidden layer showed holistic/global FC patterns differentiating SZ from HC. Our proposed schemes and reported findings attained by using the DNN classifier and whole-brain FC data suggest that such approaches show improved ability to learn hidden patterns in brain imaging data, which may be useful for developing diagnostic tools for SZ and other neuropsychiatric disorders and identifying associated aberrant FC patterns. PMID:25987366
NASA Astrophysics Data System (ADS)
Samsudin, Sarah Hanim; Shafri, Helmi Z. M.; Hamedianfar, Alireza
2016-04-01
Status observations of roofing material degradation are constantly evolving due to urban feature heterogeneities. Although advanced classification techniques have been introduced to improve within-class impervious surface classifications, these techniques involve complex processing and high computation times. This study integrates field spectroscopy and satellite multispectral remote sensing data to generate degradation status maps of concrete and metal roofing materials. Field spectroscopy data were used as bases for selecting suitable bands for spectral index development because of the limited number of multispectral bands. Mapping methods for roof degradation status were established for metal and concrete roofing materials by developing the normalized difference concrete condition index (NDCCI) and the normalized difference metal condition index (NDMCI). Results indicate that the accuracies achieved using the spectral indices are higher than those obtained using supervised pixel-based classification. The NDCCI generated an accuracy of 84.44%, whereas the support vector machine (SVM) approach yielded an accuracy of 73.06%. The NDMCI obtained an accuracy of 94.17% compared with 62.5% for the SVM approach. These findings support the suitability of the developed spectral index methods for determining roof degradation statuses from satellite observations in heterogeneous urban environments.
Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie
2014-11-01
User-generated medical messages on Internet contain extensive information related to adverse drug reactions (ADRs) and are known as valuable resources for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature set and different classification technique. Firstly, the messages from three communities, allergy community, schizophrenia community and pain management community, were collected, the 3000 messages were annotated. Secondly, the N-gram-based features set and medical domain-specific features set were generated. Thirdly, three classification techniques, SVM, C4.5 and Naïve Bayes, were used to perform classification tasks separately. Finally, we evaluated the performance of different method using different feature set and different classification technique by comparing the metrics including accuracy and F-measure. In terms of accuracy, the accuracy of SVM classifier was higher than 0.8, the accuracy of C4.5 classifier or Naïve Bayes classifier was lower than 0.8; meanwhile, the combination feature sets including n-gram-based feature set and domain-specific feature set consistently outperformed single feature set. In terms of F-measure, the highest F-measure is 0.895 which was achieved by using combination feature sets and a SVM classifier. In all, we can get the best classification performance by using combination feature sets and SVM classifier. By using combination feature sets and SVM classifier, we can get an effective method to identify messages related to ADRs automatically from online user reviews.
NASA Astrophysics Data System (ADS)
Aricò, P.; Aloise, F.; Schettini, F.; Salinari, S.; Mattia, D.; Cincotti, F.
2014-06-01
Objective. Several ERP-based brain-computer interfaces (BCIs) that can be controlled even without eye movements (covert attention) have been recently proposed. However, when compared to similar systems based on overt attention, they displayed significantly lower accuracy. In the current interpretation, this is ascribed to the absence of the contribution of short-latency visual evoked potentials (VEPs) in the tasks performed in the covert attention modality. This study aims to investigate if this decrement (i) is fully explained by the lack of VEP contribution to the classification accuracy; (ii) correlates with lower temporal stability of the single-trial P300 potentials elicited in the covert attention modality. Approach. We evaluated the latency jitter of P300 evoked potentials in three BCI interfaces exploiting either overt or covert attention modalities in 20 healthy subjects. The effect of attention modality on the P300 jitter, and the relative contribution of VEPs and P300 jitter to the classification accuracy have been analyzed. Main results. The P300 jitter is higher when the BCI is controlled in covert attention. Classification accuracy negatively correlates with jitter. Even disregarding short-latency VEPs, overt-attention BCI yields better accuracy than covert. When the latency jitter is compensated offline, the difference between accuracies is not significant. Significance. The lower temporal stability of the P300 evoked potential generated during the tasks performed in covert attention modality should be regarded as the main contributing explanation of lower accuracy of covert-attention ERP-based BCIs.
Schmitter, Daniel; Roche, Alexis; Maréchal, Bénédicte; Ribes, Delphine; Abdulkadir, Ahmed; Bach-Cuadra, Meritxell; Daducci, Alessandro; Granziera, Cristina; Klöppel, Stefan; Maeder, Philippe; Meuli, Reto; Krueger, Gunnar
2014-01-01
Voxel-based morphometry from conventional T1-weighted images has proved effective to quantify Alzheimer's disease (AD) related brain atrophy and to enable fairly accurate automated classification of AD patients, mild cognitive impaired patients (MCI) and elderly controls. Little is known, however, about the classification power of volume-based morphometry, where features of interest consist of a few brain structure volumes (e.g. hippocampi, lobes, ventricles) as opposed to hundreds of thousands of voxel-wise gray matter concentrations. In this work, we experimentally evaluate two distinct volume-based morphometry algorithms (FreeSurfer and an in-house algorithm called MorphoBox) for automatic disease classification on a standardized data set from the Alzheimer's Disease Neuroimaging Initiative. Results indicate that both algorithms achieve classification accuracy comparable to the conventional whole-brain voxel-based morphometry pipeline using SPM for AD vs elderly controls and MCI vs controls, and higher accuracy for classification of AD vs MCI and early vs late AD converters, thereby demonstrating the potential of volume-based morphometry to assist diagnosis of mild cognitive impairment and Alzheimer's disease. PMID:25429357
NASA Astrophysics Data System (ADS)
Gevaert, C. M.; Persello, C.; Sliuzas, R.; Vosselman, G.
2016-06-01
Unmanned Aerial Vehicles (UAVs) are capable of providing very high resolution and up-to-date information to support informal settlement upgrading projects. In order to provide accurate basemaps, urban scene understanding through the identification and classification of buildings and terrain is imperative. However, common characteristics of informal settlements such as small, irregular buildings with heterogeneous roof material and large presence of clutter challenge state-of-the-art algorithms. Especially the dense buildings and steeply sloped terrain cause difficulties in identifying elevated objects. This work investigates how 2D radiometric and textural features, 2.5D topographic features, and 3D geometric features obtained from UAV imagery can be integrated to obtain a high classification accuracy in challenging classification problems for the analysis of informal settlements. It compares the utility of pixel-based and segment-based features obtained from an orthomosaic and DSM with point-based and segment-based features extracted from the point cloud to classify an unplanned settlement in Kigali, Rwanda. Findings show that the integration of 2D and 3D features leads to higher classification accuracies.
Branch classification: A new mechanism for improving branch predictor performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, P.Y.; Hao, E.; Patt, Y.
There is wide agreement that one of the most significant impediments to the performance of current and future pipelined superscalar processors is the presence of conditional branches in the instruction stream. Speculative execution is one solution to the branch problem, but speculative work is discarded if a branch is mispredicted. For it to be effective, speculative work is discarded if a branch is mispredicted. For it to be effective, speculative execution requires a very accurate branch predictor; 95% accuracy is not good enough. This paper proposes branch classification, a methodology for building more accurate branch predictors. Branch classification allows anmore » individual branch instruction to be associated with the branch predictor best suited to predict its direction. Using this approach, a hybrid branch predictor can be constructed such that each component branch predictor predicts those branches for which it is best suited. To demonstrate the usefulness of branch classification, an example classification scheme is given and a new hybrid predictor is built based on this scheme which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.« less
Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin
2015-08-01
Improving accuracy of supervised classification algorithms in biomedical applications is one of active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation method was used to justify the performance of classifiers. Experimental results show that our proposed method not only improve the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.
Variance approximations for assessments of classification accuracy
R. L. Czaplewski
1994-01-01
Variance approximations are derived for the weighted and unweighted kappa statistics, the conditional kappa statistic, and conditional probabilities. These statistics are useful to assess classification accuracy, such as accuracy of remotely sensed classifications in thematic maps when compared to a sample of reference classifications made in the field. Published...
Das, D K; Maiti, A K; Chakraborty, C
2015-03-01
In this paper, we propose a comprehensive image characterization cum classification framework for malaria-infected stage detection using microscopic images of thin blood smears. The methodology mainly includes microscopic imaging of Leishman stained blood slides, noise reduction and illumination correction, erythrocyte segmentation, feature selection followed by machine classification. Amongst three-image segmentation algorithms (namely, rule-based, Chan-Vese-based and marker-controlled watershed methods), marker-controlled watershed technique provides better boundary detection of erythrocytes specially in overlapping situations. Microscopic features at intensity, texture and morphology levels are extracted to discriminate infected and noninfected erythrocytes. In order to achieve subgroup of potential features, feature selection techniques, namely, F-statistic and information gain criteria are considered here for ranking. Finally, five different classifiers, namely, Naive Bayes, multilayer perceptron neural network, logistic regression, classification and regression tree (CART), RBF neural network have been trained and tested by 888 erythrocytes (infected and noninfected) for each features' subset. Performance evaluation of the proposed methodology shows that multilayer perceptron network provides higher accuracy for malaria-infected erythrocytes recognition and infected stage classification. Results show that top 90 features ranked by F-statistic (specificity: 98.64%, sensitivity: 100%, PPV: 99.73% and overall accuracy: 96.84%) and top 60 features ranked by information gain provides better results (specificity: 97.29%, sensitivity: 100%, PPV: 99.46% and overall accuracy: 96.73%) for malaria-infected stage classification. © 2014 The Authors Journal of Microscopy © 2014 Royal Microscopical Society.
Rossini, Paolo M; Buscema, Massimo; Capriotti, Massimiliano; Grossi, Enzo; Rodriguez, Guido; Del Percio, Claudio; Babiloni, Claudio
2008-07-01
It has been shown that a new procedure (implicit function as squashing time, IFAST) based on artificial neural networks (ANNs) is able to compress eyes-closed resting electroencephalographic (EEG) data into spatial invariants of the instant voltage distributions for an automatic classification of mild cognitive impairment (MCI) and Alzheimer's disease (AD) subjects with classification accuracy of individual subjects higher than 92%. Here we tested the hypothesis that this is the case also for the classification of individual normal elderly (Nold) vs. MCI subjects, an important issue for the screening of large populations at high risk of AD. Eyes-closed resting EEG data (10-20 electrode montage) were recorded in 171 Nold and in 115 amnesic MCI subjects. The data inputs for the classification by IFAST were the weights of the connections within a nonlinear auto-associative ANN trained to generate the instant voltage distributions of 60-s artifact-free EEG data. The most relevant features were selected and coincidently the dataset was split into two halves for the final binary classification (training and testing) performed by a supervised ANN. The classification of the individual Nold and MCI subjects reached 95.87% of sensitivity and 91.06% of specificity (93.46% of accuracy). These results indicate that IFAST can reliably distinguish eyes-closed resting EEG in individual Nold and MCI subjects. IFAST may be used for large-scale periodic screening of large populations at risk of AD and personalized care.
An Active Learning Framework for Hyperspectral Image Classification Using Hierarchical Segmentation
NASA Technical Reports Server (NTRS)
Zhang, Zhou; Pasolli, Edoardo; Crawford, Melba M.; Tilton, James C.
2015-01-01
Augmenting spectral data with spatial information for image classification has recently gained significant attention, as classification accuracy can often be improved by extracting spatial information from neighboring pixels. In this paper, we propose a new framework in which active learning (AL) and hierarchical segmentation (HSeg) are combined for spectral-spatial classification of hyperspectral images. The spatial information is extracted from a best segmentation obtained by pruning the HSeg tree using a new supervised strategy. The best segmentation is updated at each iteration of the AL process, thus taking advantage of informative labeled samples provided by the user. The proposed strategy incorporates spatial information in two ways: 1) concatenating the extracted spatial features and the original spectral features into a stacked vector and 2) extending the training set using a self-learning-based semi-supervised learning (SSL) approach. Finally, the two strategies are combined within an AL framework. The proposed framework is validated with two benchmark hyperspectral datasets. Higher classification accuracies are obtained by the proposed framework with respect to five other state-of-the-art spectral-spatial classification approaches. Moreover, the effectiveness of the proposed pruning strategy is also demonstrated relative to the approaches based on a fixed segmentation.
Ramsey, Elijah W.; Nelson, Gene A.; Sapkota, Sijan
1998-01-01
A progressive classification of a marsh and forest system using Landsat Thematic Mapper (TM), color infrared (CIR) photograph, and ERS-1 synthetic aperture radar (SAR) data improved classification accuracy when compared to classification using solely TM reflective band data. The classification resulted in a detailed identification of differences within a nearly monotypic black needlerush marsh. Accuracy percentages of these classes were surprisingly high given the complexities of classification. The detailed classification resulted in a more accurate portrayal of the marsh transgressive sequence than was obtainable with TM data alone. Individual sensor contribution to the improved classification was compared to that using only the six reflective TM bands. Individually, the green reflective CIR and SAR data identified broad categories of water, marsh, and forest. In combination with TM, SAR and the green CIR band each improved overall accuracy by about 3% and 15% respectively. The SAR data improved the TM classification accuracy mostly in the marsh classes. The green CIR data also improved the marsh classification accuracy and accuracies in some water classes. The final combination of all sensor data improved almost all class accuracies from 2% to 70% with an overall improvement of about 20% over TM data alone. Not only was the identification of vegetation types improved, but the spatial detail of the classification approached 10 m in some areas.
Esteve Agelet, Lidia; Armstrong, Paul R; Tallada, Jasper G; Hurburgh, Charles R
2013-12-01
Previous studies showed that Near Infrared Spectroscopy (NIRS) could distinguish between Roundup Ready® (RR) and conventional soybeans at the bulk and single seed sample level, but it was not clear which compounds drove the classification. In this research the varieties used did not show significant differences in major compounds between RR and conventional beans, but moisture content had a big impact on classification accuracies. Four of the five RR samples had slightly higher moistures and had a higher water uptake than their conventional counterparts. This could be linked with differences in their hulls, being either compositional or morphological. Because water absorption occurs in the same region as main compounds in hulls (mainly carbohydrates) and water causes physical changes from swelling, variations in moisture cause a complex interaction resulting in a large impact on discrimination accuracies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
ERIC Educational Resources Information Center
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang
2015-01-01
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
NASA Astrophysics Data System (ADS)
Nitze, Ingmar; Barrett, Brian; Cawkwell, Fiona
2015-02-01
The analysis and classification of land cover is one of the principal applications in terrestrial remote sensing. Due to the seasonal variability of different vegetation types and land surface characteristics, the ability to discriminate land cover types changes over time. Multi-temporal classification can help to improve the classification accuracies, but different constraints, such as financial restrictions or atmospheric conditions, may impede their application. The optimisation of image acquisition timing and frequencies can help to increase the effectiveness of the classification process. For this purpose, the Feature Importance (FI) measure of the state-of-the art machine learning method Random Forest was used to determine the optimal image acquisition periods for a general (Grassland, Forest, Water, Settlement, Peatland) and Grassland specific (Improved Grassland, Semi-Improved Grassland) land cover classification in central Ireland based on a 9-year time-series of MODIS Terra 16 day composite data (MOD13Q1). Feature Importances for each acquisition period of the Enhanced Vegetation Index (EVI) and Normalised Difference Vegetation Index (NDVI) were calculated for both classification scenarios. In the general land cover classification, the months December and January showed the highest, and July and August the lowest separability for both VIs over the entire nine-year period. This temporal separability was reflected in the classification accuracies, where the optimal choice of image dates outperformed the worst image date by 13% using NDVI and 5% using EVI on a mono-temporal analysis. With the addition of the next best image periods to the data input the classification accuracies converged quickly to their limit at around 8-10 images. The binary classification schemes, using two classes only, showed a stronger seasonal dependency with a higher intra-annual, but lower inter-annual variation. Nonetheless anomalous weather conditions, such as the cold winter of 2009/2010 can alter the temporal separability pattern significantly. Due to the extensive use of the NDVI for land cover discrimination, the findings of this study should be transferrable to data from other optical sensors with a higher spatial resolution. However, the high impact of outliers from the general climatic pattern highlights the limitation of spatial transferability to locations with different climatic and land cover conditions. The use of high-temporal, moderate resolution data such as MODIS in conjunction with machine-learning techniques proved to be a good base for the prediction of image acquisition timing for optimal land cover classification results.
NASA Astrophysics Data System (ADS)
Qu, Haicheng; Liang, Xuejian; Liang, Shichao; Liu, Wanjun
2018-01-01
Many methods of hyperspectral image classification have been proposed recently, and the convolutional neural network (CNN) achieves outstanding performance. However, spectral-spatial classification of CNN requires an excessively large model, tremendous computations, and complex network, and CNN is generally unable to use the noisy bands caused by water-vapor absorption. A dimensionality-varied CNN (DV-CNN) is proposed to address these issues. There are four stages in DV-CNN and the dimensionalities of spectral-spatial feature maps vary with the stages. DV-CNN can reduce the computation and simplify the structure of the network. All feature maps are processed by more kernels in higher stages to extract more precise features. DV-CNN also improves the classification accuracy and enhances the robustness to water-vapor absorption bands. The experiments are performed on data sets of Indian Pines and Pavia University scene. The classification performance of DV-CNN is compared with state-of-the-art methods, which contain the variations of CNN, traditional, and other deep learning methods. The experiment of performance analysis about DV-CNN itself is also carried out. The experimental results demonstrate that DV-CNN outperforms state-of-the-art methods for spectral-spatial classification and it is also robust to water-vapor absorption bands. Moreover, reasonable parameters selection is effective to improve classification accuracy.
A Hybrid Sensing Approach for Pure and Adulterated Honey Classification
Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar
2012-01-01
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach able to distinct pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033
Exploration of Force Myography and surface Electromyography in hand gesture classification.
Jiang, Xianta; Merhi, Lukas-Karim; Xiao, Zhen Gang; Menon, Carlo
2017-03-01
Whereas pressure sensors increasingly have received attention as a non-invasive interface for hand gesture recognition, their performance has not been comprehensively evaluated. This work examined the performance of hand gesture classification using Force Myography (FMG) and surface Electromyography (sEMG) technologies by performing 3 sets of 48 hand gestures using a prototyped FMG band and an array of commercial sEMG sensors worn both on the wrist and forearm simultaneously. The results show that the FMG band achieved classification accuracies as good as the high quality, commercially available, sEMG system on both wrist and forearm positions; specifically, by only using 8 Force Sensitive Resisters (FSRs), the FMG band achieved accuracies of 91.2% and 83.5% in classifying the 48 hand gestures in cross-validation and cross-trial evaluations, which were higher than those of sEMG (84.6% and 79.1%). By using all 16 FSRs on the band, our device achieved high accuracies of 96.7% and 89.4% in cross-validation and cross-trial evaluations. Copyright © 2017 IPEM. Published by Elsevier Ltd. All rights reserved.
Identifying Autism from Resting-State fMRI Using Long Short-Term Memory Networks.
Dvornek, Nicha C; Ventola, Pamela; Pelphrey, Kevin A; Duncan, James S
2017-09-01
Functional magnetic resonance imaging (fMRI) has helped characterize the pathophysiology of autism spectrum disorders (ASD) and carries promise for producing objective biomarkers for ASD. Recent work has focused on deriving ASD biomarkers from resting-state functional connectivity measures. However, current efforts that have identified ASD with high accuracy were limited to homogeneous, small datasets, while classification results for heterogeneous, multi-site data have shown much lower accuracy. In this paper, we propose the use of recurrent neural networks with long short-term memory (LSTMs) for classification of individuals with ASD and typical controls directly from the resting-state fMRI time-series. We used the entire large, multi-site Autism Brain Imaging Data Exchange (ABIDE) I dataset for training and testing the LSTM models. Under a cross-validation framework, we achieved classification accuracy of 68.5%, which is 9% higher than previously reported methods that used fMRI data from the whole ABIDE cohort. Finally, we presented interpretation of the trained LSTM weights, which highlight potential functional networks and regions that are known to be implicated in ASD.
Shermeyer, Jacob S.; Haack, Barry N.
2015-01-01
Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.
Identifying Autism from Resting-State fMRI Using Long Short-Term Memory Networks
Dvornek, Nicha C.; Ventola, Pamela; Pelphrey, Kevin A.; Duncan, James S.
2017-01-01
Functional magnetic resonance imaging (fMRI) has helped characterize the pathophysiology of autism spectrum disorders (ASD) and carries promise for producing objective biomarkers for ASD. Recent work has focused on deriving ASD biomarkers from resting-state functional connectivity measures. However, current efforts that have identified ASD with high accuracy were limited to homogeneous, small datasets, while classification results for heterogeneous, multi-site data have shown much lower accuracy. In this paper, we propose the use of recurrent neural networks with long short-term memory (LSTMs) for classification of individuals with ASD and typical controls directly from the resting-state fMRI time-series. We used the entire large, multi-site Autism Brain Imaging Data Exchange (ABIDE) I dataset for training and testing the LSTM models. Under a cross-validation framework, we achieved classification accuracy of 68.5%, which is 9% higher than previously reported methods that used fMRI data from the whole ABIDE cohort. Finally, we presented interpretation of the trained LSTM weights, which highlight potential functional networks and regions that are known to be implicated in ASD. PMID:29104967
[Accuracy improvement of spectral classification of crop using microwave backscatter data].
Jia, Kun; Li, Qiang-Zi; Tian, Yi-Chen; Wu, Bing-Fang; Zhang, Fei-Fei; Meng, Ji-Hua
2011-02-01
In the present study, VV polarization microwave backscatter data used for improving accuracies of spectral classification of crop is investigated. Classification accuracy using different classifiers based on the fusion data of HJ satellite multi-spectral and Envisat ASAR VV backscatter data are compared. The results indicate that fusion data can take full advantage of spectral information of HJ multi-spectral data and the structure sensitivity feature of ASAR VV polarization data. The fusion data enlarges the spectral difference among different classifications and improves crop classification accuracy. The classification accuracy using fusion data can be increased by 5 percent compared to the single HJ data. Furthermore, ASAR VV polarization data is sensitive to non-agrarian area of planted field, and VV polarization data joined classification can effectively distinguish the field border. VV polarization data associating with multi-spectral data used in crop classification enlarges the application of satellite data and has the potential of spread in the domain of agriculture.
NASA Astrophysics Data System (ADS)
Sakuma, Jun; Wright, Rebecca N.
Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. The emphasis of existing studies in privacy-preserving classification has primarily been put on the design of privacy-preserving versions of particular data mining algorithms, However, in classification problems, preprocessing and postprocessing— such as model selection or attribute selection—play a prominent role in achieving higher classification accuracy. In this paper, we show generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machine and attribute selection in naive Bayes classification.
Hu, Shan; Xu, Chao; Guan, Weiqiao; Tang, Yong; Liu, Yana
2014-01-01
Osteosarcoma is the most common malignant bone tumor among children and adolescents. In this study, image texture analysis was made to extract texture features from bone CR images to evaluate the recognition rate of osteosarcoma. To obtain the optimal set of features, Sym4 and Db4 wavelet transforms and gray-level co-occurrence matrices were applied to the image, with statistical methods being used to maximize the feature selection. To evaluate the performance of these methods, a support vector machine algorithm was used. The experimental results demonstrated that the Sym4 wavelet had a higher classification accuracy (93.44%) than the Db4 wavelet with respect to osteosarcoma occurrence in the epiphysis, whereas the Db4 wavelet had a higher classification accuracy (96.25%) for osteosarcoma occurrence in the diaphysis. Results including accuracy, sensitivity, specificity and ROC curves obtained using the wavelets were all higher than those obtained using the features derived from the GLCM method. It is concluded that, a set of texture features can be extracted from the wavelets and used in computer-aided osteosarcoma diagnosis systems. In addition, this study also confirms that multi-resolution analysis is a useful tool for texture feature extraction during bone CR image processing.
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo
2015-05-01
An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?
NASA Astrophysics Data System (ADS)
Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof
2016-10-01
It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempts to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability definition of segmentation evaluation metrics as indicators of land cover classification accuracy was based on the linear correlation between the two. All unsupervised metrics that are not based on number of segments have a very strong correlation with all classification measures and are therefore reliable as indicators of land cover classification accuracy. On the other hand, correlation at supervised metrics is dependent on so many factors that it cannot be trusted as a reliable classification quality indicator. Algorithms for land cover classification studied in this paper are widely used; therefore, presented results are applicable to a wider area.
NASA Astrophysics Data System (ADS)
Liu, F.; Chen, T.; He, J.; Wen, Q.; Yu, F.; Gu, X.; Wang, Z.
2018-04-01
In recent years, the quick upgrading and improvement of SAR sensors provide beneficial complements for the traditional optical remote sensing in the aspects of theory, technology and data. In this paper, Sentinel-1A SAR data and GF-1 optical data were selected for image fusion, and more emphases were put on the dryland crop classification under a complex crop planting structure, regarding corn and cotton as the research objects. Considering the differences among various data fusion methods, the principal component analysis (PCA), Gram-Schmidt (GS), Brovey and wavelet transform (WT) methods were compared with each other, and the GS and Brovey methods were proved to be more applicable in the study area. Then, the classification was conducted based on the object-oriented technique process. And for the GS, Brovey fusion images and GF-1 optical image, the nearest neighbour algorithm was adopted to realize the supervised classification with the same training samples. Based on the sample plots in the study area, the accuracy assessment was conducted subsequently. The values of overall accuracy and kappa coefficient of fusion images were all higher than those of GF-1 optical image, and GS method performed better than Brovey method. In particular, the overall accuracy of GS fusion image was 79.8 %, and the Kappa coefficient was 0.644. Thus, the results showed that GS and Brovey fusion images were superior to optical images for dryland crop classification. This study suggests that the fusion of SAR and optical images is reliable for dryland crop classification under a complex crop planting structure.
Hartling, Lisa; Bond, Kenneth; Santaguida, P Lina; Viswanathan, Meera; Dryden, Donna M
2011-08-01
To develop and test a study design classification tool. We contacted relevant organizations and individuals to identify tools used to classify study designs and ranked these using predefined criteria. The highest ranked tool was a design algorithm developed, but no longer advocated, by the Cochrane Non-Randomized Studies Methods Group; this was modified to include additional study designs and decision points. We developed a reference classification for 30 studies; 6 testers applied the tool to these studies. Interrater reliability (Fleiss' κ) and accuracy against the reference classification were assessed. The tool was further revised and retested. Initial reliability was fair among the testers (κ=0.26) and the reference standard raters κ=0.33). Testing after revisions showed improved reliability (κ=0.45, moderate agreement) with improved, but still low, accuracy. The most common disagreements were whether the study design was experimental (5 of 15 studies), and whether there was a comparison of any kind (4 of 15 studies). Agreement was higher among testers who had completed graduate level training versus those who had not. The moderate reliability and low accuracy may be because of lack of clarity and comprehensiveness of the tool, inadequate reporting of the studies, and variability in tester characteristics. The results may not be generalizable to all published studies, as the test studies were selected because they had posed challenges for previous reviewers with respect to their design classification. Application of such a tool should be accompanied by training, pilot testing, and context-specific decision rules. Copyright © 2011 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Amit, Guy; Ben-Ari, Rami; Hadad, Omer; Monovich, Einat; Granot, Noa; Hashoul, Sharbell
2017-03-01
Diagnostic interpretation of breast MRI studies requires meticulous work and a high level of expertise. Computerized algorithms can assist radiologists by automatically characterizing the detected lesions. Deep learning approaches have shown promising results in natural image classification, but their applicability to medical imaging is limited by the shortage of large annotated training sets. In this work, we address automatic classification of breast MRI lesions using two different deep learning approaches. We propose a novel image representation for dynamic contrast enhanced (DCE) breast MRI lesions, which combines the morphological and kinetics information in a single multi-channel image. We compare two classification approaches for discriminating between benign and malignant lesions: training a designated convolutional neural network and using a pre-trained deep network to extract features for a shallow classifier. The domain-specific trained network provided higher classification accuracy, compared to the pre-trained model, with an area under the ROC curve of 0.91 versus 0.81, and an accuracy of 0.83 versus 0.71. Similar accuracy was achieved in classifying benign lesions, malignant lesions, and normal tissue images. The trained network was able to improve accuracy by using the multi-channel image representation, and was more robust to reductions in the size of the training set. A small-size convolutional neural network can learn to accurately classify findings in medical images using only a few hundred images from a few dozen patients. With sufficient data augmentation, such a network can be trained to outperform a pre-trained out-of-domain classifier. Developing domain-specific deep-learning models for medical imaging can facilitate technological advancements in computer-aided diagnosis.
Valderrama-Landeros, L; Flores-de-Santiago, F; Kovacs, J M; Flores-Verdugo, F
2017-12-14
Optimizing the classification accuracy of a mangrove forest is of utmost importance for conservation practitioners. Mangrove forest mapping using satellite-based remote sensing techniques is by far the most common method of classification currently used given the logistical difficulties of field endeavors in these forested wetlands. However, there is now an abundance of options from which to choose in regards to satellite sensors, which has led to substantially different estimations of mangrove forest location and extent with particular concern for degraded systems. The objective of this study was to assess the accuracy of mangrove forest classification using different remotely sensed data sources (i.e., Landsat-8, SPOT-5, Sentinel-2, and WorldView-2) for a system located along the Pacific coast of Mexico. Specifically, we examined a stressed semiarid mangrove forest which offers a variety of conditions such as dead areas, degraded stands, healthy mangroves, and very dense mangrove island formations. The results indicated that Landsat-8 (30 m per pixel) had the lowest overall accuracy at 64% and that WorldView-2 (1.6 m per pixel) had the highest at 93%. Moreover, the SPOT-5 and the Sentinel-2 classifications (10 m per pixel) were very similar having accuracies of 75 and 78%, respectively. In comparison to WorldView-2, the other sensors overestimated the extent of Laguncularia racemosa and underestimated the extent of Rhizophora mangle. When considering such type of sensors, the higher spatial resolution can be particularly important in mapping small mangrove islands that often occur in degraded mangrove systems.
Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng
2017-05-10
Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
Tabu search and binary particle swarm optimization for feature selection using microarray data.
Chuang, Li-Yeh; Yang, Cheng-Huei; Yang, Cheng-Hong
2009-12-01
Gene expression profiles have great potential as a medical diagnosis tool because they represent the state of a cell at the molecular level. In the classification of cancer type research, available training datasets generally have a fairly small sample size compared to the number of genes involved. This fact poses an unprecedented challenge to some classification methodologies due to training data limitations. Therefore, a good selection method for genes relevant for sample classification is needed to improve the predictive accuracy, and to avoid incomprehensibility due to the large number of genes investigated. In this article, we propose to combine tabu search (TS) and binary particle swarm optimization (BPSO) for feature selection. BPSO acts as a local optimizer each time the TS has been run for a single generation. The K-nearest neighbor method with leave-one-out cross-validation and support vector machine with one-versus-rest serve as evaluators of the TS and BPSO. The proposed method is applied and compared to the 11 classification problems taken from the literature. Experimental results show that our method simplifies features effectively and either obtains higher classification accuracy or uses fewer features compared to other feature selection methods.
NASA Technical Reports Server (NTRS)
Rignot, E.; Chellappa, R.
1993-01-01
We present a maximum a posteriori (MAP) classifier for classifying multifrequency, multilook, single polarization SAR intensity data into regions or ensembles of pixels of homogeneous and similar radar backscatter characteristics. A model for the prior joint distribution of the multifrequency SAR intensity data is combined with a Markov random field for representing the interactions between region labels to obtain an expression for the posterior distribution of the region labels given the multifrequency SAR observations. The maximization of the posterior distribution yields Bayes's optimum region labeling or classification of the SAR data or its MAP estimate. The performance of the MAP classifier is evaluated by using computer-simulated multilook SAR intensity data as a function of the parameters in the classification process. Multilook SAR intensity data are shown to yield higher classification accuracies than one-look SAR complex amplitude data. The MAP classifier is extended to the case in which the radar backscatter from the remotely sensed surface varies within the SAR image because of incidence angle effects. The results obtained illustrate the practicality of the method for combining SAR intensity observations acquired at two different frequencies and for improving classification accuracy of SAR data.
Kim, Junghoe; Calhoun, Vince D; Shim, Eunsoo; Lee, Jong-Hwan
2016-01-01
Functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging data are commonly employed to study neuropsychiatric conditions by using pattern classifiers such as the support vector machine (SVM). Meanwhile, a deep neural network (DNN) with multiple hidden layers has shown its ability to systematically extract lower-to-higher level information of image and speech data from lower-to-higher hidden layers, markedly enhancing classification accuracy. The objective of this study was to adopt the DNN for whole-brain resting-state FC pattern classification of schizophrenia (SZ) patients vs. healthy controls (HCs) and identification of aberrant FC patterns associated with SZ. We hypothesized that the lower-to-higher level features learned via the DNN would significantly enhance the classification accuracy, and proposed an adaptive learning algorithm to explicitly control the weight sparsity in each hidden layer via L1-norm regularization. Furthermore, the weights were initialized via stacked autoencoder based pre-training to further improve the classification performance. Classification accuracy was systematically evaluated as a function of (1) the number of hidden layers/nodes, (2) the use of L1-norm regularization, (3) the use of the pre-training, (4) the use of framewise displacement (FD) removal, and (5) the use of anatomical/functional parcellation. Using FC patterns from anatomically parcellated regions without FD removal, an error rate of 14.2% was achieved by employing three hidden layers and 50 hidden nodes with both L1-norm regularization and pre-training, which was substantially lower than the error rate from the SVM (22.3%). Moreover, the trained DNN weights (i.e., the learned features) were found to represent the hierarchical organization of aberrant FC patterns in SZ compared with HC. Specifically, pairs of nodes extracted from the lower hidden layer represented sparse FC patterns implicated in SZ, which was quantified by using kurtosis/modularity measures and features from the higher hidden layer showed holistic/global FC patterns differentiating SZ from HC. Our proposed schemes and reported findings attained by using the DNN classifier and whole-brain FC data suggest that such approaches show improved ability to learn hidden patterns in brain imaging data, which may be useful for developing diagnostic tools for SZ and other neuropsychiatric disorders and identifying associated aberrant FC patterns. Copyright © 2015 Elsevier Inc. All rights reserved.
Analysis on the Utility of Satellite Imagery for Detection of Agricultural Facility
NASA Astrophysics Data System (ADS)
Kang, J.-M.; Baek, S.-H.; Jung, K.-Y.
2012-07-01
Now that the agricultural facilities are being increase owing to development of technology and diversification of agriculture and the ratio of garden crops that are imported a lot and the crops cultivated in facilities are raised in Korea, the number of vinyl greenhouses is tending upward. So, it is important to grasp the distribution of vinyl greenhouses as much as that of rice fields, dry fields and orchards, but it is difficult to collect the information of wide areas economically and correctly. Remote sensing using satellite imagery is able to obtain data of wide area at the same time, quickly and cost-effectively collect, monitor and analyze information from every object on earth. In this study, in order to analyze the utilization of satellite imagery at detection of agricultural facility, image classification was performed about the agricultural facility, vinyl greenhouse using Formosat-2 satellite imagery. The training set of sea, vegetation, building, bare ground and vinyl greenhouse was set to monitor the agricultural facilities of the object area and the training set for the vinyl greenhouses that are main monitoring object was classified and set again into 3 types according the spectral characteristics. The image classification using 4 kinds of supervise classification methods applied by the same training set were carried out to grasp the image classification method which is effective for monitoring agricultural facilities. And, in order to minimize the misclassification appeared in the classification using the spectral information, the accuracy of classification was intended to be raised by adding texture information. The results of classification were analyzed regarding the accuracy comparing with that of naked-eyed detection. As the results of classification, the method of Mahalanobis distance was shown as more efficient than other methods and the accuracy of classification was higher when adding texture information. Hence the more effective monitoring of agricultural facilities is expected to be available if the characteristics such as texture information including satellite images or spatial pattern are studied in detail.
Landcover Classification Using Deep Fully Convolutional Neural Networks
NASA Astrophysics Data System (ADS)
Wang, J.; Li, X.; Zhou, S.; Tang, J.
2017-12-01
Land cover classification has always been an essential application in remote sensing. Certain image features are needed for land cover classification whether it is based on pixel or object-based methods. Different from other machine learning methods, deep learning model not only extracts useful information from multiple bands/attributes, but also learns spatial characteristics. In recent years, deep learning methods have been developed rapidly and widely applied in image recognition, semantic understanding, and other application domains. However, there are limited studies applying deep learning methods in land cover classification. In this research, we used fully convolutional networks (FCN) as the deep learning model to classify land covers. The National Land Cover Database (NLCD) within the state of Kansas was used as training dataset and Landsat images were classified using the trained FCN model. We also applied an image segmentation method to improve the original results from the FCN model. In addition, the pros and cons between deep learning and several machine learning methods were compared and explored. Our research indicates: (1) FCN is an effective classification model with an overall accuracy of 75%; (2) image segmentation improves the classification results with better match of spatial patterns; (3) FCN has an excellent ability of learning which can attains higher accuracy and better spatial patterns compared with several machine learning methods.
Classification of Strawberry Fruit Shape by Machine Learning
NASA Astrophysics Data System (ADS)
Ishikawa, T.; Hayashi, A.; Nagamatsu, S.; Kyutoku, Y.; Dan, I.; Wada, T.; Oku, K.; Saeki, Y.; Uto, T.; Tanabata, T.; Isobe, S.; Kochi, N.
2018-05-01
Shape is one of the most important traits of agricultural products due to its relationships with the quality, quantity, and value of the products. For strawberries, the nine types of fruit shape were defined and classified by humans based on the sampler patterns of the nine types. In this study, we tested the classification of strawberry shapes by machine learning in order to increase the accuracy of the classification, and we introduce the concept of computerization into this field. Four types of descriptors were extracted from the digital images of strawberries: (1) the Measured Values (MVs) including the length of the contour line, the area, the fruit length and width, and the fruit width/length ratio; (2) the Ellipse Similarity Index (ESI); (3) Elliptic Fourier Descriptors (EFDs), and (4) Chain Code Subtraction (CCS). We used these descriptors for the classification test along with the random forest approach, and eight of the nine shape types were classified with combinations of MVs + CCS + EFDs. CCS is a descriptor that adds human knowledge to the chain codes, and it showed higher robustness in classification than the other descriptors. Our results suggest machine learning's high ability to classify fruit shapes accurately. We will attempt to increase the classification accuracy and apply the machine learning methods to other plant species.
Zhang, Chuncheng; Song, Sutao; Wen, Xiaotong; Yao, Li; Long, Zhiying
2015-04-30
Feature selection plays an important role in improving the classification accuracy of multivariate classification techniques in the context of fMRI-based decoding due to the "few samples and large features" nature of functional magnetic resonance imaging (fMRI) data. Recently, several sparse representation methods have been applied to the voxel selection of fMRI data. Despite the low computational efficiency of the sparse representation methods, they still displayed promise for applications that select features from fMRI data. In this study, we proposed the Laplacian smoothed L0 norm (LSL0) approach for feature selection of fMRI data. Based on the fast sparse decomposition using smoothed L0 norm (SL0) (Mohimani, 2007), the LSL0 method used the Laplacian function to approximate the L0 norm of sources. Results of the simulated and real fMRI data demonstrated the feasibility and robustness of LSL0 for the sparse source estimation and feature selection. Simulated results indicated that LSL0 produced more accurate source estimation than SL0 at high noise levels. The classification accuracy using voxels that were selected by LSL0 was higher than that by SL0 in both simulated and real fMRI experiment. Moreover, both LSL0 and SL0 showed higher classification accuracy and required less time than ICA and t-test for the fMRI decoding. LSL0 outperformed SL0 in sparse source estimation at high noise level and in feature selection. Moreover, LSL0 and SL0 showed better performance than ICA and t-test for feature selection. Copyright © 2015 Elsevier B.V. All rights reserved.
Improving zero-training brain-computer interfaces by mixing model estimators
NASA Astrophysics Data System (ADS)
Verhoeven, T.; Hübner, D.; Tangermann, M.; Müller, K. R.; Dambre, J.; Kindermans, P. J.
2017-06-01
Objective. Brain-computer interfaces (BCI) based on event-related potentials (ERP) incorporate a decoder to classify recorded brain signals and subsequently select a control signal that drives a computer application. Standard supervised BCI decoders require a tedious calibration procedure prior to every session. Several unsupervised classification methods have been proposed that tune the decoder during actual use and as such omit this calibration. Each of these methods has its own strengths and weaknesses. Our aim is to improve overall accuracy of ERP-based BCIs without calibration. Approach. We consider two approaches for unsupervised classification of ERP signals. Learning from label proportions (LLP) was recently shown to be guaranteed to converge to a supervised decoder when enough data is available. In contrast, the formerly proposed expectation maximization (EM) based decoding for ERP-BCI does not have this guarantee. However, while this decoder has high variance due to random initialization of its parameters, it obtains a higher accuracy faster than LLP when the initialization is good. We introduce a method to optimally combine these two unsupervised decoding methods, letting one method’s strengths compensate for the weaknesses of the other and vice versa. The new method is compared to the aforementioned methods in a resimulation of an experiment with a visual speller. Main results. Analysis of the experimental results shows that the new method exceeds the performance of the previous unsupervised classification approaches in terms of ERP classification accuracy and symbol selection accuracy during the spelling experiment. Furthermore, the method shows less dependency on random initialization of model parameters and is consequently more reliable. Significance. Improving the accuracy and subsequent reliability of calibrationless BCIs makes these systems more appealing for frequent use.
On the classification techniques in data mining for microarray data classification
NASA Astrophysics Data System (ADS)
Aydadenta, Husna; Adiwijaya
2018-03-01
Cancer is one of the deadly diseases, according to data from WHO by 2015 there are 8.8 million more deaths caused by cancer, and this will increase every year if not resolved earlier. Microarray data has become one of the most popular cancer-identification studies in the field of health, since microarray data can be used to look at levels of gene expression in certain cell samples that serve to analyze thousands of genes simultaneously. By using data mining technique, we can classify the sample of microarray data thus it can be identified with cancer or not. In this paper we will discuss some research using some data mining techniques using microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and simulation of Random Forest algorithm with technique of reduction dimension using Relief. The result of this paper show performance measure (accuracy) from classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forets).The results in this paper show the accuracy of Random Forest algorithm higher than other classification algorithms (Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost generated from each Data Mining Classification Technique based on microarray data.
Youn, Su Hyun; Sim, Taeyong; Choi, Ahnryul; Song, Jinsung; Shin, Ki Young; Lee, Il Kwon; Heo, Hyun Mu; Lee, Daeweon; Mun, Joung Hwan
2015-06-01
Ultrasonic surgical units (USUs) have the advantage of minimizing tissue damage during surgeries that require tissue dissection by reducing problems such as coagulation and unwanted carbonization, but the disadvantage of requiring manual adjustment of power output according to the target tissue. In order to overcome this limitation, it is necessary to determine the properties of in vivo tissues automatically. We propose a multi-classifier that can accurately classify tissues based on the unique impedance of each tissue. For this purpose, a multi-classifier was built based on single classifiers with high classification rates, and the classification accuracy of the proposed model was compared with that of single classifiers for various electrode types (Type-I: 6 mm invasive; Type-II: 3 mm invasive; Type-III: surface). The sensitivity and positive predictive value (PPV) of the multi-classifier by cross checks were determined. According to the 10-fold cross validation results, the classification accuracy of the proposed model was significantly higher (p<0.05 or <0.01) than that of existing single classifiers for all electrode types. In particular, the classification accuracy of the proposed model was highest when the 3mm invasive electrode (Type-II) was used (sensitivity=97.33-100.00%; PPV=96.71-100.00%). The results of this study are an important contribution to achieving automatic optimal output power adjustment of USUs according to the properties of individual tissues. Copyright © 2015 Elsevier Ltd. All rights reserved.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery.
Li, Guiying; Lu, Dengsheng; Moran, Emilio; Hetrick, Scott
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms - maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery
LI, GUIYING; LU, DENGSHENG; MORAN, EMILIO; HETRICK, SCOTT
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms – maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes. PMID:22368311
Comparison of Classification Methods for P300 Brain-Computer Interface on Disabled Subjects
Manyakov, Nikolay V.; Chumerin, Nikolay; Combaz, Adrien; Van Hulle, Marc M.
2011-01-01
We report on tests with a mind typing paradigm based on a P300 brain-computer interface (BCI) on a group of amyotrophic lateral sclerosis (ALS), middle cerebral artery (MCA) stroke, and subarachnoid hemorrhage (SAH) patients, suffering from motor and speech disabilities. We investigate the achieved typing accuracy given the individual patient's disorder, and how it correlates with the type of classifier used. We considered 7 types of classifiers, linear as well as nonlinear ones, and found that, overall, one type of linear classifier yielded a higher classification accuracy. In addition to the selection of the classifier, we also suggest and discuss a number of recommendations to be considered when building a P300-based typing system for disabled subjects. PMID:21941530
NASA Astrophysics Data System (ADS)
Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.
2011-12-01
Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and Ramsar wetland conservation site, from 30-m Landsat TM scene at the peak of spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes 900-9000 m2); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines) and c) two hierarchical levels of vegetation classification, a generalized 3-class set and more detailed 6-class set. We found that classification benefited from object-based approach which allowed including object shape, texture and context descriptors in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy. The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because individual classes differed in scales at which they were best discriminated from others. Main classification challenges included a) presence of C3 grasses in C4-grass areas, particularly following harvesting of C4 reeds and b) mixtures of emergent, floating and submerged aquatic plants at sub-object and sub-pixel scales. We conclude that OBIA with advanced statistical classifiers offers useful instruments for landscape vegetation analyses, and that spatial scale considerations are critical in mapping PFTs, while multi-scale comparisons can be used to guide class selection. Future work will further apply fuzzy classification and field-collected spectral data for PFT analysis and compare results with MODIS PFT products.
Breast density characterization using texton distributions.
Petroudi, Styliani; Brady, Michael
2011-01-01
Breast density has been shown to be one of the most significant risks for developing breast cancer, with women with dense breasts at four to six times higher risk. The Breast Imaging Reporting and Data System (BI-RADS) has a four class classification scheme that describes the different breast densities. However, there is great inter and intra observer variability among clinicians in reporting a mammogram's density class. This work presents a novel texture classification method and its application for the development of a completely automated breast density classification system. The new method represents the mammogram using textons, which can be thought of as the building blocks of texture under the operational definition of Leung and Malik as clustered filter responses. The new proposed method characterizes the mammographic appearance of the different density patterns by evaluating the texton spatial dependence matrix (TDSM) in the breast region's corresponding texton map. The TSDM is a texture model that captures both statistical and structural texture characteristics. The normalized TSDM matrices are evaluated for mammograms from the different density classes and corresponding texture models are established. Classification is achieved using a chi-square distance measure. The fully automated TSDM breast density classification method is quantitatively evaluated on mammograms from all density classes from the Oxford Mammogram Database. The incorporation of texton spatial dependencies allows for classification accuracy reaching over 82%. The breast density classification accuracy is better using texton TSDM compared to simple texton histograms.
Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A
2017-09-15
Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations p≫n , these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. Code available at http://insilico.utulsa.edu/software/privateEC . brett-mckinney@utulsa.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Classification of right-hand grasp movement based on EMOTIV Epoc+
NASA Astrophysics Data System (ADS)
Tobing, T. A. M. L.; Prawito, Wijaya, S. K.
2017-07-01
Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for best classification accuracy of right-hand grasp movement based on EEG headset, EMOTIV Epoc+. There are three movement classifications: grasping hand, relax, and opening hand. These classifications take advantage of Event-Related Desynchronization (ERD) phenomenon that makes it possible to differ relaxation, imagery, and movement state from each other. The combinations of elements are the usage of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequency as features, and also classifier Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average values of classification accuracy are ± 83% for training and ± 57% for testing. To have a better understanding of the signal quality recorded by EMOTIV Epoc+, the result of classification accuracy of left or right-hand grasping movement EEG signal (provided by Physionet) also be given, i.e.± 85% for training and ± 70% for testing. The comparison of accuracy value from each combination, experiment condition, and external EEG data are provided for the purpose of value analysis of classification accuracy.
Hierarchical Higher Order Crf for the Classification of Airborne LIDAR Point Clouds in Urban Areas
NASA Astrophysics Data System (ADS)
Niemeyer, J.; Rottensteiner, F.; Soergel, U.; Heipke, C.
2016-06-01
We propose a novel hierarchical approach for the classification of airborne 3D lidar points. Spatial and semantic context is incorporated via a two-layer Conditional Random Field (CRF). The first layer operates on a point level and utilises higher order cliques. Segments are generated from the labelling obtained in this way. They are the entities of the second layer, which incorporates larger scale context. The classification result of the segments is introduced as an energy term for the next iteration of the point-based layer. This framework iterates and mutually propagates context to improve the classification results. Potentially wrong decisions can be revised at later stages. The output is a labelled point cloud as well as segments roughly corresponding to object instances. Moreover, we present two new contextual features for the segment classification: the distance and the orientation of a segment with respect to the closest road. It is shown that the classification benefits from these features. In our experiments the hierarchical framework improve the overall accuracies by 2.3% on a point-based level and by 3.0% on a segment-based level, respectively, compared to a purely point-based classification.
Hyperspectral feature mapping classification based on mathematical morphology
NASA Astrophysics Data System (ADS)
Liu, Chang; Li, Junwei; Wang, Guangping; Wu, Jingli
2016-03-01
This paper proposed a hyperspectral feature mapping classification algorithm based on mathematical morphology. Without the priori information such as spectral library etc., the spectral and spatial information can be used to realize the hyperspectral feature mapping classification. The mathematical morphological erosion and dilation operations are performed respectively to extract endmembers. The spectral feature mapping algorithm is used to carry on hyperspectral image classification. The hyperspectral image collected by AVIRIS is applied to evaluate the proposed algorithm. The proposed algorithm is compared with minimum Euclidean distance mapping algorithm, minimum Mahalanobis distance mapping algorithm, SAM algorithm and binary encoding mapping algorithm. From the results of the experiments, it is illuminated that the proposed algorithm's performance is better than that of the other algorithms under the same condition and has higher classification accuracy.
NASA Astrophysics Data System (ADS)
Hammann, Mark Gregory
The fusion of electro-optical (EO) multi-spectral satellite imagery with Synthetic Aperture Radar (SAR) data was explored with the working hypothesis that the addition of multi-band SAR will increase the land-cover (LC) classification accuracy compared to EO alone. Three satellite sources for SAR imagery were used: X-band from TerraSAR-X, C-band from RADARSAT-2, and L-band from PALSAR. Images from the RapidEye satellites were the source of the EO imagery. Imagery from the GeoEye-1 and WorldView-2 satellites aided the selection of ground truth. Three study areas were chosen: Wad Medani, Sudan; Campinas, Brazil; and Fresno- Kings Counties, USA. EO imagery were radiometrically calibrated, atmospherically compensated, orthorectifed, co-registered, and clipped to a common area of interest (AOI). SAR imagery were radiometrically calibrated, and geometrically corrected for terrain and incidence angle by converting to ground range and Sigma Naught (?0). The original SAR HH data were included in the fused image stack after despeckling with a 3x3 Enhanced Lee filter. The variance and Gray-Level-Co-occurrence Matrix (GLCM) texture measures of contrast, entropy, and correlation were derived from the non-despeckled SAR HH bands. Data fusion was done with layer stacking and all data were resampled to a common spatial resolution. The Support Vector Machine (SVM) decision rule was used for the supervised classifications. Similar LC classes were identified and tested for each study area. For Wad Medani, nine classes were tested: low and medium intensity urban, sparse forest, water, barren ground, and four agriculture classes (fallow, bare agricultural ground, green crops, and orchards). For Campinas, Brazil, five generic classes were tested: urban, agriculture, forest, water, and barren ground. For the Fresno-Kings Counties location 11 classes were studied: three generic classes (urban, water, barren land), and eight specific crops. In all cases the addition of SAR to EO resulted in higher overall classification accuracies. In many cases using more than a single SAR band also improved the classification accuracy. There was no single best SAR band for all cases; for specific study areas or LC classes, different SAR bands were better. For Wad Medani, the overall accuracy increased nearly 25% over EO by using all three SAR bands and GLCM texture. For Campinas, the improvement over EO was 4.3%; the large areas of vegetation were classified by EO with good accuracy. At Fresno-Kings Counties, EO+SAR fusion improved the overall classification accuracy by 7%. For times or regions where EO is not available due to extended cloud cover, classification with SAR is often the only option; note that SAR alone typically results in lower classification accuracies than when using EO or EO-SAR fusion. Fusion of EO and SAR was especially important to improve the separability of orchards from other crops, and separating urban areas with buildings from bare soil; those classes are difficult to accurately separate with EO. The outcome of this dissertation contributes to the understanding of the benefits of combining data from EO imagery with different SAR bands and SAR derived texture data to identify different LC classes. In times of increased public and private budget constraints and industry consolidation, this dissertation provides insight as to which band packages could be most useful for increased accuracy in LC classification.
Comparative analysis of Worldview-2 and Landsat 8 for coastal saltmarsh mapping accuracy assessment
NASA Astrophysics Data System (ADS)
Rasel, Sikdar M. M.; Chang, Hsing-Chung; Diti, Israt Jahan; Ralph, Tim; Saintilan, Neil
2016-05-01
Coastal saltmarsh and their constituent components and processes are of an interest scientifically due to their ecological function and services. However, heterogeneity and seasonal dynamic of the coastal wetland system makes it challenging to map saltmarshes with remotely sensed data. This study selected four important saltmarsh species Pragmitis australis, Sporobolus virginicus, Ficiona nodosa and Schoeloplectus sp. as well as a Mangrove and Pine tree species, Avecinia and Casuarina sp respectively. High Spatial Resolution Worldview-2 data and Coarse Spatial resolution Landsat 8 imagery were selected in this study. Among the selected vegetation types some patches ware fragmented and close to the spatial resolution of Worldview-2 data while and some patch were larger than the 30 meter resolution of Landsat 8 data. This study aims to test the effectiveness of different classifier for the imagery with various spatial and spectral resolutions. Three different classification algorithm, Maximum Likelihood Classifier (MLC), Support Vector Machine (SVM) and Artificial Neural Network (ANN) were tested and compared with their mapping accuracy of the results derived from both satellite imagery. For Worldview-2 data SVM was giving the higher overall accuracy (92.12%, kappa =0.90) followed by ANN (90.82%, Kappa 0.89) and MLC (90.55%, kappa = 0.88). For Landsat 8 data, MLC (82.04%) showed the highest classification accuracy comparing to SVM (77.31%) and ANN (75.23%). The producer accuracy of the classification results were also presented in the paper.
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.
Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin
2017-04-01
As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depend on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the Chronic Kidney Disease, two essential types of feature selection methods namely, wrapper and filter approaches were chosen to reduce the dimension of Chronic Kidney Disease dataset. In wrapper approach, classifier subset evaluator with greedy stepwise search engine and wrapper subset evaluator with the Best First search engine were used. In filter approach, correlation feature selection subset evaluator with greedy stepwise search engine and filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier by using filtered subset evaluator with the Best First search engine feature selection method has higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to other selected methods.
Targeting an efficient target-to-target interval for P300 speller brain–computer interfaces
Sellers, Eric W.; Wang, Xingyu
2013-01-01
Longer target-to-target intervals (TTI) produce greater P300 event-related potential amplitude, which can increase brain–computer interface (BCI) classification accuracy and decrease the number of flashes needed for accurate character classification. However, longer TTIs requires more time for each trial, which will decrease the information transfer rate of BCI. In this paper, a P300 BCI using a 7 × 12 matrix explored new flash patterns (16-, 18- and 21-flash pattern) with different TTIs to assess the effects of TTI on P300 BCI performance. The new flash patterns were designed to minimize TTI, decrease repetition blindness, and examine the temporal relationship between each flash of a given stimulus by placing a minimum of one (16-flash pattern), two (18-flash pattern), or three (21-flash pattern) non-target flashes between each target flashes. Online results showed that the 16-flash pattern yielded the lowest classification accuracy among the three patterns. The results also showed that the 18-flash pattern provides a significantly higher information transfer rate (ITR) than the 21-flash pattern; both patterns provide high ITR and high accuracy for all subjects. PMID:22350331
Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes
Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.
2012-01-01
Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
Shamim, Mohammad Tabrez Anwar; Anwaruddin, Mohammad; Nagarajaram, H A
2007-12-15
Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy by any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and Crammer and Singer method, yield similar predictions. Dataset and stand-alone program are available upon request.
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment
ERIC Educational Resources Information Center
Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua
2012-01-01
This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
The effect of finite field size on classification and atmospheric correction
NASA Technical Reports Server (NTRS)
Kaufman, Y. J.; Fraser, R. S.
1981-01-01
The atmospheric effect on the upward radiance of sunlight scattered from the Earth-atmosphere system is strongly influenced by the contrasts between fields and their sizes. For a given atmospheric turbidity, the atmospheric effect on classification of surface features is much stronger for nonuniform surfaces than for uniform surfaces. Therefore, the classification accuracy of agricultural fields and urban areas is dependent not only on the optical characteristics of the atmosphere, but also on the size of the surface do not account for the nonuniformity of the surface have only a slight effect on the classification accuracy; in other cases the classification accuracy descreases. The radiances above finite fields were computed to simulate radiances measured by a satellite. A simulation case including 11 agricultural fields and four natural fields (water, soil, savanah, and forest) was used to test the effect of the size of the background reflectance and the optical thickness of the atmosphere on classification accuracy. It is concluded that new atmospheric correction methods, which take into account the finite size of the fields, have to be developed to improve significantly the classification accuracy.
Yang, Xiaoyan; Chen, Longgao; Li, Yingkui; Xi, Wenjia; Chen, Longqian
2015-07-01
Land use/land cover (LULC) inventory provides an important dataset in regional planning and environmental assessment. To efficiently obtain the LULC inventory, we compared the LULC classifications based on single satellite imagery with a rule-based classification based on multi-seasonal imagery in Lianyungang City, a coastal city in China, using CBERS-02 (the 2nd China-Brazil Environmental Resource Satellites) images. The overall accuracies of the classification based on single imagery are 78.9, 82.8, and 82.0% in winter, early summer, and autumn, respectively. The rule-based classification improves the accuracy to 87.9% (kappa 0.85), suggesting that combining multi-seasonal images can considerably improve the classification accuracy over any single image-based classification. This method could also be used to analyze seasonal changes of LULC types, especially for those associated with tidal changes in coastal areas. The distribution and inventory of LULC types with an overall accuracy of 87.9% and a spatial resolution of 19.5 m can assist regional planning and environmental assessment efficiently in Lianyungang City. This rule-based classification provides a guidance to improve accuracy for coastal areas with distinct LULC temporal spectral features.
Statistical sensor fusion of ECG data using automotive-grade sensors
NASA Astrophysics Data System (ADS)
Koenig, A.; Rehg, T.; Rasshofer, R.
2015-11-01
Driver states such as fatigue, stress, aggression, distraction or even medical emergencies continue to be yield to severe mistakes in driving and promote accidents. A pathway towards improving driver state assessment can be found in psycho-physiological measures to directly quantify the driver's state from physiological recordings. Although heart rate is a well-established physiological variable that reflects cognitive stress, obtaining heart rate contactless and reliably is a challenging task in an automotive environment. Our aim was to investigate, how sensory fusion of two automotive grade sensors would influence the accuracy of automatic classification of cognitive stress levels. We induced cognitive stress in subjects and estimated levels from their heart rate signals, acquired from automotive ready ECG sensors. Using signal quality indices and Kalman filters, we were able to decrease Root Mean Squared Error (RMSE) of heart rate recordings by 10 beats per minute. We then trained a neural network to classify the cognitive workload state of subjects from heart rate and compared classification performance for ground truth, the individual sensors and the fused heart rate signal. We obtained an increase of 5 % higher correct classification by fusing signals as compared to individual sensors, staying only 4 % below the maximally possible classification accuracy from ground truth. These results are a first step towards real world applications of psycho-physiological measurements in vehicle settings. Future implementations of driver state modeling will be able to draw from a larger pool of data sources, such as additional physiological values or vehicle related data, which can be expected to drive classification to significantly higher values.
The Application of Remote Sensing Techniques to Urban Data Acquisition
NASA Technical Reports Server (NTRS)
Horton, F. E.
1971-01-01
The application of remote sensing techniques useful in acquiring data concerning housing quality is discussed. Conclusions reached from the investigation were: (1) Use of individuals with a higher degree of training in photointerpretation should significantly increase the percentage of successful classifications. (2) Small area classification of urban housing quality can definitely be accomplished via high resolution aerial photography. Such surveys, at the levels of accuracy demonstrated, can be of major utility in quick look surveys. (3) Survey costs should be significantly reduced.
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classifying patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
Seeland, Marco; Rzanny, Michael; Alaqraa, Nedal; Wäldchen, Jana; Mäder, Patrick
2017-01-01
Steady improvements of image description methods induced a growing interest in image-based plant species classification, a task vital to the study of biodiversity and ecological sensitivity. Various techniques have been proposed for general object classification over the past years and several of them have already been studied for plant species classification. However, results of these studies are selective in the evaluated steps of a classification pipeline, in the utilized datasets for evaluation, and in the compared baseline methods. No study is available that evaluates the main competing methods for building an image representation on the same datasets allowing for generalized findings regarding flower-based plant species classification. The aim of this paper is to comparatively evaluate methods, method combinations, and their parameters towards classification accuracy. The investigated methods span from detection, extraction, fusion, pooling, to encoding of local features for quantifying shape and color information of flower images. We selected the flower image datasets Oxford Flower 17 and Oxford Flower 102 as well as our own Jena Flower 30 dataset for our experiments. Findings show large differences among the various studied techniques and that their wisely chosen orchestration allows for high accuracies in species classification. We further found that true local feature detectors in combination with advanced encoding methods yield higher classification results at lower computational costs compared to commonly used dense sampling and spatial pooling methods. Color was found to be an indispensable feature for high classification results, especially while preserving spatial correspondence to gray-level features. In result, our study provides a comprehensive overview of competing techniques and the implications of their main parameters for flower-based plant species classification. PMID:28234999
Di-codon Usage for Gene Classification
NASA Astrophysics Data System (ADS)
Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
NASA Technical Reports Server (NTRS)
Cibula, William G.; Nyquist, Maurice O.
1987-01-01
An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classifications useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. In an attempt to refine the level of classification, a geographic information system (GIS) approach was employed. Topographic data and watershed boundaries (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, and these met the project objective. This classification could now become inputs into a GIS system to help provide answers to park management coupled with other ancillary data programs such as fire management.
Gas Classification Using Deep Convolutional Neural Networks.
Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin
2018-01-08
In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP).
Gas Classification Using Deep Convolutional Neural Networks
Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin
2018-01-01
In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP). PMID:29316723
Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R
2007-01-01
Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. PMID:17605789
Foliar and woody materials discriminated using terrestrial LiDAR in a mixed natural forest
NASA Astrophysics Data System (ADS)
Zhu, Xi; Skidmore, Andrew K.; Darvishzadeh, Roshanak; Niemann, K. Olaf; Liu, Jing; Shi, Yifang; Wang, Tiejun
2018-02-01
Separation of foliar and woody materials using remotely sensed data is crucial for the accurate estimation of leaf area index (LAI) and woody biomass across forest stands. In this paper, we present a new method to accurately separate foliar and woody materials using terrestrial LiDAR point clouds obtained from ten test sites in a mixed forest in Bavarian Forest National Park, Germany. Firstly, we applied and compared an adaptive radius near-neighbor search algorithm with a fixed radius near-neighbor search method in order to obtain both radiometric and geometric features derived from terrestrial LiDAR point clouds. Secondly, we used a random forest machine learning algorithm to classify foliar and woody materials and examined the impact of understory and slope on the classification accuracy. An average overall accuracy of 84.4% (Kappa = 0.75) was achieved across all experimental plots. The adaptive radius near-neighbor search method outperformed the fixed radius near-neighbor search method. The classification accuracy was significantly higher when the combination of both radiometric and geometric features was utilized. The analysis showed that increasing slope and understory coverage had a significant negative effect on the overall classification accuracy. Our results suggest that the utilization of the adaptive radius near-neighbor search method coupling both radiometric and geometric features has the potential to accurately discriminate foliar and woody materials from terrestrial LiDAR data in a mixed natural forest.
NASA Astrophysics Data System (ADS)
Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.
2014-10-01
In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imageries. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to higher Hilbert dimension space. These kernel functions include linear, polynomials and Radial Based Function (RBF). The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of H/A/α decomposition method are used in classification. The experimental tests show an SVM classifier with RBF kernel for three dates of data increases the Overall Accuracy (OA) to up to 3% in comparison to using linear kernel function, and up to 1% in comparison to a 3rd degree polynomial kernel function.
Cooperative Learning for Distributed In-Network Traffic Classification
NASA Astrophysics Data System (ADS)
Joseph, S. B.; Loo, H. R.; Ismail, I.; Andromeda, T.; Marsono, M. N.
2017-04-01
Inspired by the concept of autonomic distributed/decentralized network management schemes, we consider the issue of information exchange among distributed network nodes to network performance and promote scalability for in-network monitoring. In this paper, we propose a cooperative learning algorithm for propagation and synchronization of network information among autonomic distributed network nodes for online traffic classification. The results show that network nodes with sharing capability perform better with a higher average accuracy of 89.21% (sharing data) and 88.37% (sharing clusters) compared to 88.06% for nodes without cooperative learning capability. The overall performance indicates that cooperative learning is promising for distributed in-network traffic classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessedmore » using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with a SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large number of images, as is the case in a multicenter trial with different scanners.« less
Detection of stress factors in crop and weed species using hyperspectral remote sensing reflectance
NASA Astrophysics Data System (ADS)
Henry, William Brien
The primary objective of this work was to determine if stress factors such as moisture stress or herbicide injury stress limit the ability to distinguish between weeds and crops using remotely sensed data. Additional objectives included using hyperspectral reflectance data to measure moisture content within a species, and to measure crop injury in response to drift rates of non-selective herbicides. Moisture stress did not reduce the ability to discriminate between species. Regardless of analysis technique, the trend was that as moisture stress increased, so too did the ability to distinguish between species. Signature amplitudes (SA) of the top 5 bands, discrete wavelet transforms (DWT), and multiple indices were promising analysis techniques. Discriminant models created from one year's data set and validated on additional data sets provided, on average, approximately 80% accurate classification among weeds and crop. This suggests that these models are relatively robust and could potentially be used across environmental conditions in field scenarios. Distinguishing between leaves grown at high-moisture stress and no-stress was met with limited success, primarily because there was substantial variation among samples within the treatments. Leaf water potential (LWP) was measured, and these were classified into three categories using indices. Classification accuracies were as high as 68%. The 10 bands most highly correlated to LWP were selected; however, there were no obvious trends or patterns in these top 10 bands with respect to time, species or moisture level, suggesting that LWP is an elusive parameter to quantify spectrally. In order to address herbicide injury stress and its impact on species discrimination, discriminant models were created from combinations of multiple indices. The model created from the second experimental run's data set and validated on the first experimental run's data provided an average of 97% correct classification of soybean and an overall average classification accuracy of 65% for all species. This suggests that these models are relatively robust and could potentially be used across a wide range of herbicide applications in field scenarios. From the pooled data set, a single discriminant model was created with multiple indices that discriminated soybean from weeds 88%, on average, regardless of herbicide, rate or species. Several analysis techniques including multiple indices, signature amplitude with spectral bands as features, and wavelet analysis were employed to distinguish between herbicide-treated and nontreated plants. Classification accuracy using signature amplitude (SA) analysis of paraquat injury on soybean was better than 75% for both 1/2 and 1/8X rates at 1, 4, and 7 DAA. Classification accuracy of paraquat injury on corn was better than 72% for the 1/2X rate at 1, 4, and 7 DAA. These data suggest that hyperspectral reflectance may be used to distinguish between healthy plants and injured plants to which herbicides have been applied; however, the classification accuracies remained at 75% or higher only when the higher rates of herbicide were applied. (Abstract shortened by UMI.)
Corn and soybean Landsat MSS classification performance as a function of scene characteristics
NASA Technical Reports Server (NTRS)
Batista, G. T.; Hixson, M. M.; Bauer, M. E.
1982-01-01
In order to fully utilize remote sensing to inventory crop production, it is important to identify the factors that affect the accuracy of Landsat classifications. The objective of this study was to investigate the effect of scene characteristics involving crop, soil, and weather variables on the accuracy of Landsat classifications of corn and soybeans. Segments sampling the U.S. Corn Belt were classified using a Gaussian maximum likelihood classifier on multitemporally registered data from two key acquisition periods. Field size had a strong effect on classification accuracy with small fields tending to have low accuracies even when the effect of mixed pixels was eliminated. Other scene characteristics accounting for variability in classification accuracy included proportions of corn and soybeans, crop diversity index, proportion of all field crops, soil drainage, slope, soil order, long-term average soybean yield, maximum yield, relative position of the segment in the Corn Belt, weather, and crop development stage.
2008-09-01
2 X Components: 1 Y Components: 1 Product MBR Geographic Coordinates Number of Coordinates: 4 Coordinate: 1 Latitude...bottom (other than live coral) bldgs., docks, etc.) 4. linear reef- B. SHORELINE -INTERTIDAL modifiers 5. pinnacle reef- c. submerged vegetation- sand
Nationwide forestry applications program. Analysis of forest classification accuracy
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Mead, R. A.; Oderwald, R. G.; Heinen, J. (Principal Investigator)
1981-01-01
The development of LANDSAT classification accuracy assessment techniques, and of a computerized system for assessing wildlife habitat from land cover maps are considered. A literature review on accuracy assessment techniques and an explanation for the techniques development under both projects are included along with listings of the computer programs. The presentations and discussions at the National Working Conference on LANDSAT Classification Accuracy are summarized. Two symposium papers which were published on the results of this project are appended.
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Yamanashi, A.; Sato, K.
2013-08-01
This paper presents an unsupervised scene classification method for actualizing semantic recognition of indoor scenes. Background and foreground features are respectively extracted using Gist and color scale-invariant feature transform (SIFT) as feature representations based on context. We used hue, saturation, and value SIFT (HSV-SIFT) because of its simple algorithm with low calculation costs. Our method creates bags of features for voting visual words created from both feature descriptors to a two-dimensional histogram. Moreover, our method generates labels as candidates of categories for time-series images while maintaining stability and plasticity together. Automatic labeling of category maps can be realized using labels created using adaptive resonance theory (ART) as teaching signals for counter propagation networks (CPNs). We evaluated our method for semantic scene classification using KTH's image database for robot localization (KTH-IDOL), which is popularly used for robot localization and navigation. The mean classification accuracies of Gist, gray SIFT, one class support vector machines (OC-SVM), position-invariant robust features (PIRF), and our method are, respectively, 39.7, 58.0, 56.0, 63.6, and 79.4%. The result of our method is 15.8% higher than that of PIRF. Moreover, we applied our method for fine classification using our original mobile robot. We obtained mean classification accuracy of 83.2% for six zones.
Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment
Mukherjee, Rashmi; Manohar, Dhiraj Dhane; Das, Dev Kumar; Achar, Arun; Mitra, Analava; Chakraborty, Chandan
2014-01-01
The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images grabbed by normal digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the “S” component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793). PMID:25114925
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much efforts have been devoted towards building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule-base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.
SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306
2018-01-01
Hyperspectral image classification with a limited number of training samples without loss of accuracy is desirable, as collecting such data is often expensive and time-consuming. However, classifiers trained with limited samples usually end up with a large generalization error. To overcome the said problem, we propose a fuzziness-based active learning framework (FALF), in which we implement the idea of selecting optimal training samples to enhance generalization performance for two different kinds of classifiers, discriminative and generative (e.g. SVM and KNN). The optimal samples are selected by first estimating the boundary of each class and then calculating the fuzziness-based distance between each sample and the estimated class boundaries. Those samples that are at smaller distances from the boundaries and have higher fuzziness are chosen as target candidates for the training set. Through detailed experimentation on three publically available datasets, we showed that when trained with the proposed sample selection framework, both classifiers achieved higher classification accuracy and lower processing time with the small amount of training data as opposed to the case where the training samples were selected randomly. Our experiments demonstrate the effectiveness of our proposed method, which equates favorably with the state-of-the-art methods. PMID:29304512
Classification of urban features using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Ganesh Babu, Bharath
Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) - were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of x2 feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. X2 mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors are offer accuracy improvements only on colorful datasets.
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics was used as feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers as well as useŕs and produceŕs accuracy.
EEG channels reduction using PCA to increase XGBoost's accuracy for stroke detection
NASA Astrophysics Data System (ADS)
Fitriah, N.; Wijaya, S. K.; Fanany, M. I.; Badri, C.; Rezal, M.
2017-07-01
In Indonesia, based on the result of Basic Health Research 2013, the number of stroke patients had increased from 8.3 ‰ (2007) to 12.1 ‰ (2013). These days, some researchers are using electroencephalography (EEG) result as another option to detect the stroke disease besides CT Scan image as the gold standard. A previous study on the data of stroke and healthy patients in National Brain Center Hospital (RS PON) used Brain Symmetry Index (BSI), Delta-Alpha Ratio (DAR), and Delta-Theta-Alpha-Beta Ratio (DTABR) as the features for classification by an Extreme Learning Machine (ELM). The study got 85% accuracy with sensitivity above 86 % for acute ischemic stroke detection. Using EEG data means dealing with many data dimensions, and it can reduce the accuracy of classifier (the curse of dimensionality). Principal Component Analysis (PCA) could reduce dimensionality and computation cost without decreasing classification accuracy. XGBoost, as the scalable tree boosting classifier, can solve real-world scale problems (Higgs Boson and Allstate dataset) with using a minimal amount of resources. This paper reuses the same data from RS PON and features from previous research, preprocessed with PCA and classified with XGBoost, to increase the accuracy with fewer electrodes. The specific fewer electrodes improved the accuracy of stroke detection. Our future work will examine the other algorithm besides PCA to get higher accuracy with less number of channels.
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction through remote sensing data is important for policy and decision makers as extracted information provide base layers for many application of real world. Classification of remotely sensed data is the one of the most common methods of extracting information however it is still a challenging issue because several factors are affecting the accuracy of the classification. Resolution of the imagery, number and homogeneity of land cover classes, purity of training data and characteristic of adopted classifiers are just some of these challenging factors. Object based image classification has some superiority than pixel based classification for high resolution images since it uses geometry and structure information besides spectral information. Vegetation indices are also commonly used for the classification process since it provides additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increase the classification accuracy from 79,96% to 86,80% as overall accuracy, however NDVI decrease the classification accuracy from 79,96% to 78,90%. Moreover it is proven than object based classification with RapidEye data give promising results for crop type mapping and analysis.
Zhan, Liang; Liu, Yashu; Wang, Yalin; Zhou, Jiayu; Jahanshad, Neda; Ye, Jieping; Thompson, Paul M.
2015-01-01
Alzheimer's disease (AD) is a progressive brain disease. Accurate detection of AD and its prodromal stage, mild cognitive impairment (MCI), are crucial. There is also a growing interest in identifying brain imaging biomarkers that help to automatically differentiate stages of Alzheimer's disease. Here, we focused on brain structural networks computed from diffusion MRI and proposed a new feature extraction and classification framework based on higher order singular value decomposition and sparse logistic regression. In tests on publicly available data from the Alzheimer's Disease Neuroimaging Initiative, our proposed framework showed promise in detecting brain network differences that help in classifying different stages of Alzheimer's disease. PMID:26257601
Bolin, Jocelyn Holden; Finch, W Holmes
2014-01-01
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions.
NASA Astrophysics Data System (ADS)
Law, Yan Nei; Lieng, Monica Keiko; Li, Jingmei; Khoo, David Aik-Aun
2014-03-01
Breast cancer is the most common cancer and second leading cause of cancer death among women in the US. The relative survival rate is lower among women with a more advanced stage at diagnosis. Early detection through screening is vital. Mammography is the most widely used and only proven screening method for reliably and effectively detecting abnormal breast tissues. In particular, mammographic density is one of the strongest breast cancer risk factors, after age and gender, and can be used to assess the future risk of disease before individuals become symptomatic. A reliable method for automatic density assessment would be beneficial and could assist radiologists in the evaluation of mammograms. To address this problem, we propose a density classification method which uses statistical features from different parts of the breast. Our method is composed of three parts: breast region identification, feature extraction and building ensemble classifiers for density assessment. It explores the potential of the features extracted from second and higher order statistical information for mammographic density classification. We further investigate the registration of bilateral pairs and time-series of mammograms. The experimental results on 322 mammograms demonstrate that (1) a classifier using features from dense regions has higher discriminative power than a classifier using only features from the whole breast region; (2) these high-order features can be effectively combined to boost the classification accuracy; (3) a classifier using these statistical features from dense regions achieves 75% accuracy, which is a significant improvement from 70% accuracy obtained by the existing approaches.
Dai, Shengfa; Wei, Qingguo
2017-01-01
Common spatial pattern algorithm is widely used to estimate spatial filters in motor imagery based brain-computer interfaces. However, use of a large number of channels will make common spatial pattern tend to over-fitting and the classification of electroencephalographic signals time-consuming. To overcome these problems, it is necessary to choose an optimal subset of the whole channels to save computational time and improve the classification accuracy. In this paper, a novel method named backtracking search optimization algorithm is proposed to automatically select the optimal channel set for common spatial pattern. Each individual in the population is a N-dimensional vector, with each component representing one channel. A population of binary codes generate randomly in the beginning, and then channels are selected according to the evolution of these codes. The number and positions of 1's in the code denote the number and positions of chosen channels. The objective function of backtracking search optimization algorithm is defined as the combination of classification error rate and relative number of channels. Experimental results suggest that higher classification accuracy can be achieved with much fewer channels compared to standard common spatial pattern with whole channels.
Park, Jinhee; Javier, Rios Jesus; Moon, Taesup; Kim, Youngwook
2016-11-24
Accurate classification of human aquatic activities using radar has a variety of potential applications such as rescue operations and border patrols. Nevertheless, the classification of activities on water using radar has not been extensively studied, unlike the case on dry ground, due to its unique challenge. Namely, not only is the radar cross section of a human on water small, but the micro-Doppler signatures are much noisier due to water drops and waves. In this paper, we first investigate whether discriminative signatures could be obtained for activities on water through a simulation study. Then, we show how we can effectively achieve high classification accuracy by applying deep convolutional neural networks (DCNN) directly to the spectrogram of real measurement data. From the five-fold cross-validation on our dataset, which consists of five aquatic activities, we report that the conventional feature-based scheme only achieves an accuracy of 45.1%. In contrast, the DCNN trained using only the collected data attains 66.7%, and the transfer learned DCNN, which takes a DCNN pre-trained on a RGB image dataset and fine-tunes the parameters using the collected data, achieves a much higher 80.3%, which is a significant performance boost.
Large Scale Crop Mapping in Ukraine Using Google Earth Engine
NASA Astrophysics Data System (ADS)
Shelestov, A.; Lavreniuk, M. S.; Kussul, N.
2016-12-01
There are no globally available high resolution satellite-derived crop specific maps at present. Only coarse-resolution imagery (> 250 m spatial resolution) has been utilized to derive global cropland extent. In 2016 we are going to carry out a country level demonstration of Sentinel-2 use for crop classification in Ukraine within the ESA Sen2-Agri project. But optical imagery can be contaminated by cloud cover that makes it difficult to acquire imagery in an optimal time range to discriminate certain crops. Due to the Copernicus program since 2015, a lot of Sentinel-1 SAR data at high spatial resolution is available for free for Ukraine. It allows us to use the time series of SAR data for crop classification. Our experiment for one administrative region in 2015 showed much higher crop classification accuracy with SAR data than with optical only time series [1, 2]. Therefore, in 2016 within the Google Earth Engine Research Award we use SAR data together with optical ones for large area crop mapping (entire territory of Ukraine) using cloud computing capabilities available at Google Earth Engine (GEE). This study compares different classification methods for crop mapping for the whole territory of Ukraine using data and algorithms from GEE. Classification performance assessed using overall classification accuracy, Kappa coefficients, and user's and producer's accuracies. Also, crop areas from derived classification maps compared to the official statistics [3]. S. Skakun et al., "Efficiency assessment of multitemporal C-band Radarsat-2 intensity and Landsat-8 surface reflectance satellite imagery for crop classification in Ukraine," IEEE Journal of Selected Topics in Applied Earth Observ. and Rem. Sens., 2015, DOI: 10.1109/JSTARS.2015.2454297. N. Kussul, S. Skakun, A. Shelestov, O. Kussul, "The use of satellite SAR imagery to crop classification in Ukraine within JECAM project," IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp.1497-1500, 13-18 July 2014, Quebec City, Canada. F.J. Gallego, N. Kussul, S. Skakun, O. Kravchenko, A. Shelestov, O. Kussul, "Efficiency assessment of using satellite data for crop area estimation in Ukraine," International Journal of Applied Earth Observation and Geoinformation vol. 29, pp. 22-30, 2014.
Zheng, Haiyong; Wang, Ruchen; Yu, Zhibin; Wang, Nan; Gu, Zhaorui; Zheng, Bing
2017-12-28
Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of marine food chain. As the fundamental components of marine ecosystems, plankton is very sensitive to environment changes, and the study of plankton abundance and distribution is crucial, in order to understand environment changes and protect marine ecosystems. This study was carried out to develop an extensive applicable plankton classification system with high accuracy for the increasing number of various imaging devices. Literature shows that most plankton image classification systems were limited to only one specific imaging device and a relatively narrow taxonomic scope. The real practical system for automatic plankton classification is even non-existent and this study is partly to fill this gap. Inspired by the analysis of literature and development of technology, we focused on the requirements of practical application and proposed an automatic system for plankton image classification combining multiple view features via multiple kernel learning (MKL). For one thing, in order to describe the biomorphic characteristics of plankton more completely and comprehensively, we combined general features with robust features, especially by adding features like Inner-Distance Shape Context for morphological representation. For another, we divided all the features into different types from multiple views and feed them to multiple classifiers instead of only one by combining different kernel matrices computed from different types of features optimally via multiple kernel learning. Moreover, we also applied feature selection method to choose the optimal feature subsets from redundant features for satisfying different datasets from different imaging devices. We implemented our proposed classification system on three different datasets across more than 20 categories from phytoplankton to zooplankton. The experimental results validated that our system outperforms state-of-the-art plankton image classification systems in terms of accuracy and robustness. This study demonstrated automatic plankton image classification system combining multiple view features using multiple kernel learning. The results indicated that multiple view features combined by NLMKL using three kernel functions (linear, polynomial and Gaussian kernel functions) can describe and use information of features better so that achieve a higher classification accuracy.
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
Huo, Guanying
2017-01-01
As a typical deep-learning model, Convolutional Neural Networks (CNNs) can be exploited to automatically extract features from images using the hierarchical structure inspired by mammalian visual system. For image classification tasks, traditional CNN models employ the softmax function for classification. However, owing to the limited capacity of the softmax function, there are some shortcomings of traditional CNN models in image classification. To deal with this problem, a new method combining Biomimetic Pattern Recognition (BPR) with CNNs is proposed for image classification. BPR performs class recognition by a union of geometrical cover sets in a high-dimensional feature space and therefore can overcome some disadvantages of traditional pattern recognition. The proposed method is evaluated on three famous image classification benchmarks, that is, MNIST, AR, and CIFAR-10. The classification accuracies of the proposed method for the three datasets are 99.01%, 98.40%, and 87.11%, respectively, which are much higher in comparison with the other four methods in most cases. PMID:28316614
Mejia Tobar, Alejandra; Hyoudou, Rikiya; Kita, Kahori; Nakamura, Tatsuhiro; Kambara, Hiroyuki; Ogata, Yousuke; Hanakawa, Takashi; Koike, Yasuharu; Yoshimura, Natsue
2017-01-01
The classification of ankle movements from non-invasive brain recordings can be applied to a brain-computer interface (BCI) to control exoskeletons, prosthesis, and functional electrical stimulators for the benefit of patients with walking impairments. In this research, ankle flexion and extension tasks at two force levels in both legs, were classified from cortical current sources estimated by a hierarchical variational Bayesian method, using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings. The hierarchical prior for the current source estimation from EEG was obtained from activated brain areas and their intensities from an fMRI group (second-level) analysis. The fMRI group analysis was performed on regions of interest defined over the primary motor cortex, the supplementary motor area, and the somatosensory area, which are well-known to contribute to movement control. A sparse logistic regression method was applied for a nine-class classification (eight active tasks and a resting control task) obtaining a mean accuracy of 65.64% for time series of current sources, estimated from the EEG and the fMRI signals using a variational Bayesian method, and a mean accuracy of 22.19% for the classification of the pre-processed of EEG sensor signals, with a chance level of 11.11%. The higher classification accuracy of current sources, when compared to EEG classification accuracy, was attributed to the high number of sources and the different signal patterns obtained in the same vertex for different motor tasks. Since the inverse filter estimation for current sources can be done offline with the present method, the present method is applicable to real-time BCIs. Finally, due to the highly enhanced spatial distribution of current sources over the brain cortex, this method has the potential to identify activation patterns to design BCIs for the control of an affected limb in patients with stroke, or BCIs from motor imagery in patients with spinal cord injury.
Yang, Hao; Zhang, Junran; Jiang, Xiaomei; Liu, Fei
2018-04-01
In recent years, with the rapid development of machine learning techniques,the deep learning algorithm has been widely used in one-dimensional physiological signal processing. In this paper we used electroencephalography (EEG) signals based on deep belief network (DBN) model in open source frameworks of deep learning to identify emotional state (positive, negative and neutrals), then the results of DBN were compared with support vector machine (SVM). The EEG signals were collected from the subjects who were under different emotional stimuli, and DBN and SVM were adopted to identify the EEG signals with changes of different characteristics and different frequency bands. We found that the average accuracy of differential entropy (DE) feature by DBN is 89.12%±6.54%, which has a better performance than previous research based on the same data set. At the same time, the classification effects of DBN are better than the results from traditional SVM (the average classification accuracy of 84.2%±9.24%) and its accuracy and stability have a better trend. In three experiments with different time points, single subject can achieve the consistent results of classification by using DBN (the mean standard deviation is1.44%), and the experimental results show that the system has steady performance and good repeatability. According to our research, the characteristic of DE has a better classification result than other characteristics. Furthermore, the Beta band and the Gamma band in the emotional recognition model have higher classification accuracy. To sum up, the performances of classifiers have a promotion by using the deep learning algorithm, which has a reference for establishing a more accurate system of emotional recognition. Meanwhile, we can trace through the results of recognition to find out the brain regions and frequency band that are related to the emotions, which can help us to understand the emotional mechanism better. This study has a high academic value and practical significance, so further investigation still needs to be done.
2014-01-01
Background Although body fat percent (BF%) may be used for screening metabolic risk factors, its accuracy compared to BMI and waist circumference is unknown in a Mexican population. We compared the classification accuracy of BF%, BMI and WC for the detection of metabolic risk factors in a sample of Mexican adults; optimized cutoffs as well as sensitivity and specificity at commonly used BF% and BMI international cutoffs were estimated. We also estimated conditional BF% means at BMI international cutoffs. Methods We performed a cross-sectional analysis of data on body composition, anthropometry and metabolic risk factors(high glucose, high triglycerides, low HDL cholesterol and hypertension) from 5,100 Mexican men and women. The association between BMI, WC and BF%was evaluated with linear regression models. The BF%, BMI and WC optimal cutoffs for the detection of metabolic risk factors were selected at the point where sensitivity was closest to specificity. Areas under the ROC Curve (AUC) were compared among classifiers using a non-parametric method. Results After adjustment for WC, a 1% increase in BMI was associated with a BF% rise of 0.05 percentage points (p.p.) in men (P < 0.05) and 0.25 p.p. in women (P < 0.001). At BMI = 25.0 predicted BF% was 27.6 ± 0.16 (mean ± SE) in men and 41.2 ± 0.07 in women. Estimated BF% cutoffs for detection of metabolic risk factors were close to 30.0 in men and close to 44.0 in women. In men WC had higher AUC than BF% for the classification of all conditions whereas BMI had higher AUC than BF% for the classification of high triglycerides and hypertension. In womenBMI and WC had higher AUC than BF% for the classification of all metabolic risk factors. Conclusions BMI and WC were more accurate than BF% for classifying the studied metabolic disorders. International BF% cutoffs had very low specificity and thus produced a high rate of false positives in both sexes. PMID:24721260
Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik
2018-05-01
Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with echo time / repetition time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER + tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance to account for heterogeneity. Our findings suggest that SVM, using features from a combination of diffusion models, improves prediction accuracy for differentiation of benign versus malignant breast tumors, and may further assist in subtyping of breast cancer. 3 Technical Efficacy: Stage 3 J. Magn. Reson. Imaging 2018;47:1205-1216. © 2017 International Society for Magnetic Resonance in Medicine.
Automated color classification of urine dipstick image in urine examination
NASA Astrophysics Data System (ADS)
Rahmat, R. F.; Royananda; Muchtar, M. A.; Taqiuddin, R.; Adnan, S.; Anugrahwaty, R.; Budiarto, R.
2018-03-01
Urine examination using urine dipstick has long been used to determine the health status of a person. The economical and convenient use of urine dipstick is one of the reasons urine dipstick is still used to check people health status. The real-life implementation of urine dipstick is done manually, in general, that is by comparing it with the reference color visually. This resulted perception differences in the color reading of the examination results. In this research, authors used a scanner to obtain the urine dipstick color image. The use of scanner can be one of the solutions in reading the result of urine dipstick because the light produced is consistent. A method is required to overcome the problems of urine dipstick color matching and the test reference color that have been conducted manually. The method proposed by authors is Euclidean Distance, Otsu along with RGB color feature extraction method to match the colors on the urine dipstick with the standard reference color of urine examination. The result shows that the proposed approach was able to classify the colors on a urine dipstick with an accuracy of 95.45%. The accuracy of color classification on urine dipstick against the standard reference color is influenced by the level of scanner resolution used, the higher the scanner resolution level, the higher the accuracy.
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
NASA Technical Reports Server (NTRS)
Kim, H.; Swain, P. H.
1991-01-01
A method of classifying multisource data in remote sensing is presented. The proposed method considers each data source as an information source providing a body of evidence, represents statistical evidence by interval-valued probabilities, and uses Dempster's rule to integrate information based on multiple data source. The method is applied to the problems of ground-cover classification of multispectral data combined with digital terrain data such as elevation, slope, and aspect. Then this method is applied to simulated 201-band High Resolution Imaging Spectrometer (HIRIS) data by dividing the dimensionally huge data source into smaller and more manageable pieces based on the global statistical correlation information. It produces higher classification accuracy than the Maximum Likelihood (ML) classification method when the Hughes phenomenon is apparent.
NASA Astrophysics Data System (ADS)
Luo, Juhua; Duan, Hongtao; Ma, Ronghua; Jin, Xiuliang; Li, Fei; Hu, Weiping; Shi, Kun; Huang, Wenjiang
2017-05-01
Spatial information of the dominant species of submerged aquatic vegetation (SAV) is essential for restoration projects in eutrophic lakes, especially eutrophic Taihu Lake, China. Mapping the distribution of SAV species is very challenging and difficult using only multispectral satellite remote sensing. In this study, we proposed an approach to map the distribution of seven dominant species of SAV in Taihu Lake. Our approach involved information on the life histories of the seven SAV species and eight distribution maps of SAV from February to October. The life history information of the dominant SAV species was summarized from the literature and field surveys. Eight distribution maps of the SAV were extracted from eight 30 m HJ-CCD images from February to October in 2013 based on the classification tree models, and the overall classification accuracies for the SAV were greater than 80%. Finally, the spatial distribution of the SAV species in Taihu in 2013 was mapped using multilayer erasing approach. Based on validation, the overall classification accuracy for the seven species was 68.4%, and kappa was 0.6306, which suggests that larger differences in life histories between species can produce higher identification accuracies. The classification results show that Potamogeton malaianus was the most widely distributed species in Taihu Lake, followed by Myriophyllum spicatum, Potamogeton maackianus, Potamogeton crispus, Elodea nuttallii, Ceratophyllum demersum and Vallisneria spiralis. The information is useful for planning shallow-water habitat restoration projects.
A robust data scaling algorithm to improve classification accuracies in biomedical data.
Cao, Xi Hang; Stojkovic, Ivan; Obradovic, Zoran
2016-09-09
Machine learning models have been adapted in biomedical research and practice for knowledge discovery and decision support. While mainstream biomedical informatics research focuses on developing more accurate models, the importance of data preprocessing draws less attention. We propose the Generalized Logistic (GL) algorithm that scales data uniformly to an appropriate interval by learning a generalized logistic function to fit the empirical cumulative distribution function of the data. The GL algorithm is simple yet effective; it is intrinsically robust to outliers, so it is particularly suitable for diagnostic/classification models in clinical/medical applications where the number of samples is usually small; it scales the data in a nonlinear fashion, which leads to potential improvement in accuracy. To evaluate the effectiveness of the proposed algorithm, we conducted experiments on 16 binary classification tasks with different variable types and cover a wide range of applications. The resultant performance in terms of area under the receiver operation characteristic curve (AUROC) and percentage of correct classification showed that models learned using data scaled by the GL algorithm outperform the ones using data scaled by the Min-max and the Z-score algorithm, which are the most commonly used data scaling algorithms. The proposed GL algorithm is simple and effective. It is robust to outliers, so no additional denoising or outlier detection step is needed in data preprocessing. Empirical results also show models learned from data scaled by the GL algorithm have higher accuracy compared to the commonly used data scaling algorithms.
Training set extension for SVM ensemble in P300-speller with familiar face paradigm.
Li, Qi; Shi, Kaiyang; Gao, Ning; Li, Jian; Bai, Ou
2018-03-27
P300-spellers are brain-computer interface (BCI)-based character input systems. Support vector machine (SVM) ensembles are trained with large-scale training sets and used as classifiers in these systems. However, the required large-scale training data necessitate a prolonged collection time for each subject, which results in data collected toward the end of the period being contaminated by the subject's fatigue. This study aimed to develop a method for acquiring more training data based on a collected small training set. A new method was developed in which two corresponding training datasets in two sequences are superposed and averaged to extend the training set. The proposed method was tested offline on a P300-speller with the familiar face paradigm. The SVM ensemble with extended training set achieved 85% classification accuracy for the averaged results of four sequences, and 100% for 11 sequences in the P300-speller. In contrast, the conventional SVM ensemble with non-extended training set achieved only 65% accuracy for four sequences, and 92% for 11 sequences. The SVM ensemble with extended training set achieves higher classification accuracies than the conventional SVM ensemble, which verifies that the proposed method effectively improves the classification performance of BCI P300-spellers, thus enhancing their practicality.
Al-Bayati, Mohammad; Grueneisen, Johannes; Lütje, Susanne; Sawicki, Lino M; Suntharalingam, Saravanabavaan; Tschirdewahn, Stephan; Forsting, Michael; Rübben, Herbert; Herrmann, Ken; Umutlu, Lale; Wetter, Axel
2018-01-01
To evaluate diagnostic accuracy of integrated 68Gallium labelled prostate-specific membrane antigen (68Ga-PSMA)-11 positron emission tomography (PET)/MRI in patients with primary prostate cancer (PCa) as compared to multi-parametric MRI. A total of 22 patients with recently diagnosed primary PCa underwent clinically indicated 68Ga-PSMA-11 PET/CT for initial staging followed by integrated 68Ga-PSMA-11 PET/MRI. Images of multi-parametric magnetic resonance imaging (mpMRI), PET and PET/MRI were evaluated separately by applying Prostate Imaging Reporting and Data System (PIRADSv2) for mpMRI and a 5-point Likert scale for PET and PET/MRI. Results were compared with pathology reports of biopsy or resection. Statistical analyses including receiver operating characteristics analysis were performed to compare the diagnostic performance of mpMRI, PET and PET/MRI. PET and integrated PET/MRI demonstrated a higher diagnostic accuracy than mpMRI (area under the curve: mpMRI: 0.679, PET and PET/MRI: 0.951). The proportion of equivocal results (PIRADS 3 and Likert 3) was considerably higher in mpMRI than in PET and PET/MRI. In a notable proportion of equivocal PIRADS results, PET led to a correct shift towards higher suspicion of malignancy and enabled correct lesion classification. Integrated 68Ga-PSMA-11 PET/MRI demonstrates higher diagnostic accuracy than mpMRI and is particularly valuable in tumours with equivocal results from PIRADS classification. © 2018 S. Karger AG, Basel.
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, were compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was in the average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not outperform significantly the conventional feature ranking methods. Among conventional methods, some of them slightly performed better than others, although the choice of a suitable technique is dependent on the computational complexity and accuracy requirements of the user.
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
NASA Astrophysics Data System (ADS)
Bangs, Corey F.; Kruse, Fred A.; Olsen, Chris R.
2013-05-01
Hyperspectral data were assessed to determine the effect of integrating spectral data and extracted texture feature data on classification accuracy. Four separate spectral ranges (hundreds of spectral bands total) were used from the Visible and Near Infrared (VNIR) and Shortwave Infrared (SWIR) portions of the electromagnetic spectrum. Haralick texture features (contrast, entropy, and correlation) were extracted from the average gray-level image for each of the four spectral ranges studied. A maximum likelihood classifier was trained using a set of ground truth regions of interest (ROIs) and applied separately to the spectral data, texture data, and a fused dataset containing both. Classification accuracy was measured by comparison of results to a separate verification set of test ROIs. Analysis indicates that the spectral range (source of the gray-level image) used to extract the texture feature data has a significant effect on the classification accuracy. This result applies to texture-only classifications as well as the classification of integrated spectral data and texture feature data sets. Overall classification improvement for the integrated data sets was near 1%. Individual improvement for integrated spectral and texture classification of the "Urban" class showed approximately 9% accuracy increase over spectral-only classification. Texture-only classification accuracy was highest for the "Dirt Path" class at approximately 92% for the spectral range from 947 to 1343nm. This research demonstrates the effectiveness of texture feature data for more accurate analysis of hyperspectral data and the importance of selecting the correct spectral range to be used for the gray-level image source to extract these features.
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction is often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
PCA based feature reduction to improve the accuracy of decision tree c4.5 classification
NASA Astrophysics Data System (ADS)
Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.
2018-03-01
Splitting attribute is a major process in Decision Tree C4.5 classification. However, this process does not give a significant impact on the establishment of the decision tree in terms of removing irrelevant features. It is a major problem in decision tree classification process called over-fitting resulting from noisy data and irrelevant features. In turns, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and overfitting on classifications Decision Tree C4.5. Feature reduction is one of important issues in classification model which is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework is robust to enhance classification accuracy with 90.70% accuracy rates.
Improving crop classification through attention to the timing of airborne radar acquisitions
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Protz, R.
1984-01-01
Radar remote sensors may provide valuable input to crop classification procedures because of (1) their independence of weather conditions and solar illumination, and (2) their ability to respond to differences in crop type. Manual classification of multidate synthetic aperture radar (SAR) imagery resulted in an overall accuracy of 83 percent for corn, forest, grain, and 'other' cover types. Forests and corn fields were identified with accuracies approaching or exceeding 90 percent. Grain fields and 'other' fields were often confused with each other, resulting in classification accuracies of 51 and 66 percent, respectively. The 83 percent correct classification represents a 10 percent improvement when compared to similar SAR data for the same area collected at alternate time periods in 1978. These results demonstrate that improvements in crop classification accuracy can be achieved with SAR data by synchronizing data collection times with crop growth stages in order to maximize differences in the geometric and dielectric properties of the cover types of interest.
Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi
2016-05-01
Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using an field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k -means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k -means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.
NASA Astrophysics Data System (ADS)
Knoefel, Patrick; Loew, Fabian; Conrad, Christopher
2015-04-01
Crop maps based on classification of remotely sensed data are of increased attendance in agricultural management. This induces a more detailed knowledge about the reliability of such spatial information. However, classification of agricultural land use is often limited by high spectral similarities of the studied crop types. More, spatially and temporally varying agro-ecological conditions can introduce confusion in crop mapping. Classification errors in crop maps in turn may have influence on model outputs, like agricultural production monitoring. One major goal of the PhenoS project ("Phenological structuring to determine optimal acquisition dates for Sentinel-2 data for field crop classification"), is the detection of optimal phenological time windows for land cover classification purposes. Since many crop species are spectrally highly similar, accurate classification requires the right selection of satellite images for a certain classification task. In the course of one growing season, phenological phases exist where crops are separable with higher accuracies. For this purpose, coupling of multi-temporal spectral characteristics and phenological events is promising. The focus of this study is set on the separation of spectrally similar cereal crops like winter wheat, barley, and rye of two test sites in Germany called "Harz/Central German Lowland" and "Demmin". However, this study uses object based random forest (RF) classification to investigate the impact of image acquisition frequency and timing on crop classification uncertainty by permuting all possible combinations of available RapidEye time series recorded on the test sites between 2010 and 2014. The permutations were applied to different segmentation parameters. Then, classification uncertainty was assessed and analysed, based on the probabilistic soft-output from the RF algorithm at the per-field basis. From this soft output, entropy was calculated as a spatial measure of classification uncertainty. The results indicate that uncertainty estimates provide a valuable addition to traditional accuracy assessments and helps the user to allocate error in crop maps.
IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY
Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...
Tahmasian, Masoud; Jamalabadi, Hamidreza; Abedini, Mina; Ghadami, Mohammad R; Sepehry, Amir A; Knight, David C; Khazaie, Habibolah
2017-05-22
Sleep disturbance is common in chronic post-traumatic stress disorder (PTSD). However, prior work has demonstrated that there are inconsistencies between subjective and objective assessments of sleep disturbance in PTSD. Therefore, we investigated whether subjective or objective sleep assessment has greater clinical utility to differentiate PTSD patients from healthy subjects. Further, we evaluated whether the combination of subjective and objective methods improves the accuracy of classification into patient versus healthy groups, which has important diagnostic implications. We recruited 32 chronic war-induced PTSD patients and 32 age- and gender-matched healthy subjects to participate in this study. Subjective (i.e. from three self-reported sleep questionnaires) and objective sleep-related data (i.e. from actigraphy scores) were collected from each participant. Subjective, objective, and combined (subjective and objective) sleep data were then analyzed using support vector machine classification. The classification accuracy, sensitivity, and specificity for subjective variables were 89.2%, 89.3%, and 89%, respectively. The classification accuracy, sensitivity, and specificity for objective variables were 65%, 62.3%, and 67.8%, respectively. The classification accuracy, sensitivity, and specificity for the aggregate variables (combination of subjective and objective variables) were 91.6%, 93.0%, and 90.3%, respectively. Our findings indicate that classification accuracy using subjective measurements is superior to objective measurements and the combination of both assessments appears to improve the classification accuracy for differentiating PTSD patients from healthy individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Prahm, Cosima; Eckstein, Korbinian; Ortiz-Catalan, Max; Dorffner, Georg; Kaniusas, Eugenijus; Aszmann, Oskar C
2016-08-31
Controlling a myoelectric prosthesis for upper limbs is increasingly challenging for the user as more electrodes and joints become available. Motion classification based on pattern recognition with a multi-electrode array allows multiple joints to be controlled simultaneously. Previous pattern recognition studies are difficult to compare, because individual research groups use their own data sets. To resolve this shortcoming and to facilitate comparisons, open access data sets were analysed using components of BioPatRec and Netlab pattern recognition models. Performances of the artificial neural networks, linear models, and training program components were compared. Evaluation took place within the BioPatRec environment, a Matlab-based open source platform that provides feature extraction, processing and motion classification algorithms for prosthetic control. The algorithms were applied to myoelectric signals for individual and simultaneous classification of movements, with the aim of finding the best performing algorithm and network model. Evaluation criteria included classification accuracy and training time. Results in both the linear and the artificial neural network models demonstrated that Netlab's implementation using scaled conjugate training algorithm reached significantly higher accuracies than BioPatRec. It is concluded that the best movement classification performance would be achieved through integrating Netlab training algorithms in the BioPatRec environment so that future prosthesis training can be shortened and control made more reliable. Netlab was therefore included into the newest release of BioPatRec (v4.0).
Fan, Yong; Batmanghelich, Nematollah; Clark, Chris M.; Davatzikos, Christos
2010-01-01
Spatial patterns of brain atrophy in mild cognitive impairment (MCI) and Alzheimer’s disease (AD) were measured via methods of computational neuroanatomy. These patterns were spatially complex and involved many brain regions. In addition to the hippocampus and the medial temporal lobe gray matter, a number of other regions displayed significant atrophy, including orbitofrontal and medial-prefrontal grey matter, cingulate (mainly posterior), insula, uncus, and temporal lobe white matter. Approximately 2/3 of the MCI group presented patterns of atrophy that overlapped with AD, whereas the remaining 1/3 overlapped with cognitively normal individuals, thereby indicating that some, but not all, MCI patients have significant and extensive brain atrophy in this cohort of MCI patients. Importantly, the group with AD-like patterns presented much higher rate of MMSE decline in follow-up visits; conversely, pattern classification provided relatively high classification accuracy (87%) of the individuals that presented relatively higher MMSE decline within a year from baseline. High-dimensional pattern classification, a nonlinear multivariate analysis, provided measures of structural abnormality that can potentially be useful for individual patient classification, as well as for predicting progression and examining multivariate relationships in group analyses. PMID:18053747
NASA Astrophysics Data System (ADS)
Musa Abbagoni, Baba; Yeung, Hoi
2016-08-01
The identification of flow pattern is a key issue in multiphase flow which is encountered in the petrochemical industry. It is difficult to identify the gas-liquid flow regimes objectively with the gas-liquid two-phase flow. This paper presents the feasibility of a clamp-on instrument for an objective flow regime classification of two-phase flow using an ultrasonic Doppler sensor and an artificial neural network, which records and processes the ultrasonic signals reflected from the two-phase flow. Experimental data is obtained on a horizontal test rig with a total pipe length of 21 m and 5.08 cm internal diameter carrying air-water two-phase flow under slug, elongated bubble, stratified-wavy and, stratified flow regimes. Multilayer perceptron neural networks (MLPNNs) are used to develop the classification model. The classifier requires features as an input which is representative of the signals. Ultrasound signal features are extracted by applying both power spectral density (PSD) and discrete wavelet transform (DWT) methods to the flow signals. A classification scheme of ‘1-of-C coding method for classification’ was adopted to classify features extracted into one of four flow regime categories. To improve the performance of the flow regime classifier network, a second level neural network was incorporated by using the output of a first level networks feature as an input feature. The addition of the two network models provided a combined neural network model which has achieved a higher accuracy than single neural network models. Classification accuracies are evaluated in the form of both the PSD and DWT features. The success rates of the two models are: (1) using PSD features, the classifier missed 3 datasets out of 24 test datasets of the classification and scored 87.5% accuracy; (2) with the DWT features, the network misclassified only one data point and it was able to classify the flow patterns up to 95.8% accuracy. This approach has demonstrated the success of a clamp-on ultrasound sensor for flow regime classification that would be possible in industry practice. It is considerably more promising than other techniques as it uses a non-invasive and non-radioactive sensor.
Global Optimization Ensemble Model for Classification Methods
Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab
2014-01-01
Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382
NASA Astrophysics Data System (ADS)
Jing, Ya-Bing; Liu, Chang-Wen; Bi, Feng-Rong; Bi, Xiao-Yang; Wang, Xia; Shao, Kang
2017-07-01
Numerous vibration-based techniques are rarely used in diesel engines fault diagnosis in a direct way, due to the surface vibration signals of diesel engines with the complex non-stationary and nonlinear time-varying features. To investigate the fault diagnosis of diesel engines, fractal correlation dimension, wavelet energy and entropy as features reflecting the diesel engine fault fractal and energy characteristics are extracted from the decomposed signals through analyzing vibration acceleration signals derived from the cylinder head in seven different states of valve train. An intelligent fault detector FastICA-SVM is applied for diesel engine fault diagnosis and classification. The results demonstrate that FastICA-SVM achieves higher classification accuracy and makes better generalization performance in small samples recognition. Besides, the fractal correlation dimension and wavelet energy and entropy as the special features of diesel engine vibration signal are considered as input vectors of classifier FastICA-SVM and could produce the excellent classification results. The proposed methodology improves the accuracy of feature extraction and the fault diagnosis of diesel engines.
Supervised Learning Applied to Air Traffic Trajectory Classification
NASA Technical Reports Server (NTRS)
Bosson, Christabelle S.; Nikoleris, Tasos
2018-01-01
Given the recent increase of interest in introducing new vehicle types and missions into the National Airspace System, a transition towards a more autonomous air traffic control system is required in order to enable and handle increased density and complexity. This paper presents an exploratory effort of the needed autonomous capabilities by exploring supervised learning techniques in the context of aircraft trajectories. In particular, it focuses on the application of machine learning algorithms and neural network models to a runway recognition trajectory-classification study. It investigates the applicability and effectiveness of various classifiers using datasets containing trajectory records for a month of air traffic. A feature importance and sensitivity analysis are conducted to challenge the chosen time-based datasets and the ten selected features. The study demonstrates that classification accuracy levels of 90% and above can be reached in less than 40 seconds of training for most machine learning classifiers when one track data point, described by the ten selected features at a particular time step, per trajectory is used as input. It also shows that neural network models can achieve similar accuracy levels but at higher training time costs.
Gaze-independent ERP-BCIs: augmenting performance through location-congruent bimodal stimuli
Thurlings, Marieke E.; Brouwer, Anne-Marie; Van Erp, Jan B. F.; Werkhoven, Peter
2014-01-01
Gaze-independent event-related potential (ERP) based brain-computer interfaces (BCIs) yield relatively low BCI performance and traditionally employ unimodal stimuli. Bimodal ERP-BCIs may increase BCI performance due to multisensory integration or summation in the brain. An additional advantage of bimodal BCIs may be that the user can choose which modality or modalities to attend to. We studied bimodal, visual-tactile, gaze-independent BCIs and investigated whether or not ERP components’ tAUCs and subsequent classification accuracies are increased for (1) bimodal vs. unimodal stimuli; (2) location-congruent vs. location-incongruent bimodal stimuli; and (3) attending to both modalities vs. to either one modality. We observed an enhanced bimodal (compared to unimodal) P300 tAUC, which appeared to be positively affected by location-congruency (p = 0.056) and resulted in higher classification accuracies. Attending either to one or to both modalities of the bimodal location-congruent stimuli resulted in differences between ERP components, but not in classification performance. We conclude that location-congruent bimodal stimuli improve ERP-BCIs, and offer the user the possibility to switch the attended modality without losing performance. PMID:25249947
Three-Category Classification of Magnetic Resonance Hearing Loss Images Based on Deep Autoencoder.
Jia, Wenjuan; Yang, Ming; Wang, Shui-Hua
2017-09-11
Hearing loss, a partial or total inability to hear, is known as hearing impairment. Untreated hearing loss can have a bad effect on normal social communication, and it can cause psychological problems in patients. Therefore, we design a three-category classification system to detect the specific category of hearing loss, which is beneficial to be treated in time for patients. Before the training and test stages, we use the technology of data augmentation to produce a balanced dataset. Then we use deep autoencoder neural network to classify the magnetic resonance brain images. In the stage of deep autoencoder, we use stacked sparse autoencoder to generate visual features, and softmax layer to classify the different brain images into three categories of hearing loss. Our method can obtain good experimental results. The overall accuracy of our method is 99.5%, and the time consuming is 0.078 s per brain image. Our proposed method based on stacked sparse autoencoder works well in classification of hearing loss images. The overall accuracy of our method is 4% higher than the best of state-of-the-art approaches.
Sosenko, Jay M; Skyler, Jay S; Mahon, Jeffrey; Krischer, Jeffrey P; Greenbaum, Carla J; Rafkin, Lisa E; Beam, Craig A; Boulware, David C; Matheson, Della; Cuthbertson, David; Herold, Kevan C; Eisenbarth, George; Palmer, Jerry P
2014-04-01
OBJECTIVE We studied the utility of the Diabetes Prevention Trial-Type 1 Risk Score (DPTRS) for improving the accuracy of type 1 diabetes (T1D) risk classification in TrialNet Natural History Study (TNNHS) participants. RESEARCH DESIGN AND METHODS The cumulative incidence of T1D was compared between normoglycemic individuals with DPTRS values >7.00 and dysglycemic individuals in the TNNHS (n = 991). It was also compared between individuals with DPTRS values <7.00 or >7.00 among those with dysglycemia and those with multiple autoantibodies in the TNNHS. DPTRS values >7.00 were compared with dysglycemia for characterizing risk in Diabetes Prevention Trial-Type 1 (DPT-1) (n = 670) and TNNHS participants. The reliability of DPTRS values >7.00 was compared with dysglycemia in the TNNHS. RESULTS The cumulative incidence of T1D for normoglycemic TNNHS participants with DPTRS values >7.00 was comparable to those with dysglycemia. Among those with dysglycemia, the cumulative incidence was much higher (P < 0.001) for those with DPTRS values >7.00 than for those with values <7.00 (3-year risks: 0.16 for <7.00 and 0.46 for >7.00). Dysglycemic individuals in DPT-1 were at much higher risk for T1D than those with dysglycemia in the TNNHS (P < 0.001); there was no significant difference in risk between the studies among those with DPTRS values >7.00. The proportion in the TNNHS reverting from dysglycemia to normoglycemia at the next visit was higher than the proportion reverting from DPTRS values >7.00 to values <7.00 (36 vs. 23%). CONCLUSIONS DPTRS thresholds can improve T1D risk classification accuracy by identifying high-risk normoglycemic and low-risk dysglycemic individuals. The 7.00 DPTRS threshold characterizes risk more consistently between populations and has greater reliability than dysglycemia.
NASA Astrophysics Data System (ADS)
Chauhan, H.; Krishna Mohan, B.
2014-11-01
The present study was undertaken with the objective to check effectiveness of spectral similarity measures to develop precise crop spectra from the collected hyperspectral field spectra. In Multispectral and Hyperspectral remote sensing, classification of pixels is obtained by statistical comparison (by means of spectral similarity) of known field or library spectra to unknown image spectra. Though these algorithms are readily used, little emphasis has been placed on use of various spectral similarity measures to select precise crop spectra from the set of field spectra. Conventionally crop spectra are developed after rejecting outliers based only on broad-spectrum analysis. Here a successful attempt has been made to develop precise crop spectra based on spectral similarity. As unevaluated data usage leads to uncertainty in the image classification, it is very crucial to evaluate the data. Hence, notwithstanding the conventional method, the data precision has been performed effectively to serve the purpose of the present research work. The effectiveness of developed precise field spectra was evaluated by spectral discrimination measures and found higher discrimination values compared to spectra developed conventionally. Overall classification accuracy for the image classified by field spectra selected conventionally is 51.89% and 75.47% for the image classified by field spectra selected precisely based on spectral similarity. KHAT values are 0.37, 0.62 and Z values are 2.77, 9.59 for image classified using conventional and precise field spectra respectively. Reasonable higher classification accuracy, KHAT and Z values shows the possibility of a new approach for field spectra selection based on spectral similarity measure.
NASA Astrophysics Data System (ADS)
Li, Long; Solana, Carmen; Canters, Frank; Kervyn, Matthieu
2017-10-01
Mapping lava flows using satellite images is an important application of remote sensing in volcanology. Several volcanoes have been mapped through remote sensing using a wide range of data, from optical to thermal infrared and radar images, using techniques such as manual mapping, supervised/unsupervised classification, and elevation subtraction. So far, spectral-based mapping applications mainly focus on the use of traditional pixel-based classifiers, without much investigation into the added value of object-based approaches and into advantages of using machine learning algorithms. In this study, Nyamuragira, characterized by a series of > 20 overlapping lava flows erupted over the last century, was used as a case study. The random forest classifier was tested to map lava flows based on pixels and objects. Image classification was conducted for the 20 individual flows and for 8 groups of flows of similar age using a Landsat 8 image and a DEM of the volcano, both at 30-meter spatial resolution. Results show that object-based classification produces maps with continuous and homogeneous lava surfaces, in agreement with the physical characteristics of lava flows, while lava flows mapped through the pixel-based classification are heterogeneous and fragmented including much "salt and pepper noise". In terms of accuracy, both pixel-based and object-based classification performs well but the former results in higher accuracies than the latter except for mapping lava flow age groups without using topographic features. It is concluded that despite spectral similarity, lava flows of contrasting age can be well discriminated and mapped by means of image classification. The classification approach demonstrated in this study only requires easily accessible image data and can be applied to other volcanoes as well if there is sufficient information to calibrate the mapping.
Bredesen, Ida Marie; Bjøro, Karen; Gunningberg, Lena; Hofoss, Dag
2016-05-01
Pressure ulcers (PUs) are a problem in health care. Staff competency is paramount to PU prevention. Education is essential to increase skills in pressure ulcer classification and risk assessment. Currently, no pressure ulcer learning programs are available in Norwegian. Develop and test an e-learning program for assessment of pressure ulcer risk and pressure ulcer classification. Forty-four nurses working in acute care hospital wards or nursing homes participated and were assigned randomly into two groups: an e-learning program group (intervention) and a traditional classroom lecture group (control). Data was collected immediately before and after training, and again after three months. The study was conducted at one nursing home and two hospitals between May and December 2012. Accuracy of risk assessment (five patient cases) and pressure ulcer classification (40 photos [normal skin, pressure ulcer categories I-IV] split in two sets) were measured by comparing nurse evaluations in each of the two groups to a pre-established standard based on ratings by experts in pressure ulcer classification and risk assessment. Inter-rater reliability was measured by exact percent agreement and multi-rater Fleiss kappa. A Mann-Whitney U test was used for continuous sum score variables. An e-learning program did not improve Braden subscale scoring. For pressure ulcer classification, however, the intervention group scored significantly higher than the control group on several of the categories in post-test immediately after training. However, after three months there were no significant differences in classification skills between the groups. An e-learning program appears to have a greater effect on the accuracy of pressure ulcer classification than classroom teaching in the short term. For proficiency in Braden scoring, no significant effect of educational methods on learning results was detected. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Geelen, Christopher D.; Wijnhoven, Rob G. J.; Dubbelman, Gijs; de With, Peter H. N.
2015-03-01
This research considers gender classification in surveillance environments, typically involving low-resolution images and a large amount of viewpoint variations and occlusions. Gender classification is inherently difficult due to the large intra-class variation and interclass correlation. We have developed a gender classification system, which is successfully evaluated on two novel datasets, which realistically consider the above conditions, typical for surveillance. The system reaches a mean accuracy of up to 90% and approaches our human baseline of 92.6%, proving a high-quality gender classification system. We also present an in-depth discussion of the fundamental differences between SVM and RF classifiers. We conclude that balancing the degree of randomization in any classifier is required for the highest classification accuracy. For our problem, an RF-SVM hybrid classifier exploiting the combination of HSV and LBP features results in the highest classification accuracy of 89.9 0.2%, while classification computation time is negligible compared to the detection time of pedestrians.
A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification
Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong
2016-01-01
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs). PMID:26985826
A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification.
Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong
2016-01-01
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).
NASA Astrophysics Data System (ADS)
Tamimi, E.; Ebadi, H.; Kiani, A.
2017-09-01
Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using other features will lead to improve accuracy. By adding these features, the presence probability of dependent features will be increased, which leads to accuracy reduction. In addition, some parameters should be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. Optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges such as producing salt-paper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are relatively high automation level, independency of image scene and type, post processing reduction for building edge reconstruction and accuracy improvement. The proposed method was evaluated by pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, in the proposed method, Kappa coefficient was improved by 6% rather than RF classification. Time processing of the proposed method was relatively low because of unit of image analysis (image object). These showed the superiority of the proposed method in terms of time and accuracy.
Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech☆
Cao, Houwei; Verma, Ragini; Nenkova, Ani
2014-01-01
We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional approaches. We train ranking SVMs for individual emotions, treating the data from each speaker as a separate query, and combine the predictions from all rankers to perform multi-class prediction. The ranking method provides two natural benefits. It captures speaker specific information even in speaker-independent training/testing conditions. It also incorporates the intuition that each utterance can express a mix of possible emotion and that considering the degree to which each emotion is expressed can be productively exploited to identify the dominant emotion. We compare the performance of the rankers and their combination to standard SVM classification approaches on two publicly available datasets of acted emotional speech, Berlin and LDC, as well as on spontaneous emotional data from the FAU Aibo dataset. On acted data, ranking approaches exhibit significantly better performance compared to SVM classification both in distinguishing a specific emotion from all others and in multi-class prediction. On the spontaneous data, which contains mostly neutral utterances with a relatively small portion of less intense emotional utterances, ranking-based classifiers again achieve much higher precision in identifying emotional utterances than conventional SVM classifiers. In addition, we discuss the complementarity of conventional SVM and ranking-based classifiers. On all three datasets we find dramatically higher accuracy for the test items on whose prediction the two methods agree compared to the accuracy of individual methods. Furthermore on the spontaneous data the ranking and standard classification are complementary and we obtain marked improvement when we combine the two classifiers by late-stage fusion. PMID:25422534
Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech☆
Cao, Houwei; Verma, Ragini; Nenkova, Ani
2015-01-01
We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional approaches. We train ranking SVMs for individual emotions, treating the data from each speaker as a separate query, and combine the predictions from all rankers to perform multi-class prediction. The ranking method provides two natural benefits. It captures speaker specific information even in speaker-independent training/testing conditions. It also incorporates the intuition that each utterance can express a mix of possible emotion and that considering the degree to which each emotion is expressed can be productively exploited to identify the dominant emotion. We compare the performance of the rankers and their combination to standard SVM classification approaches on two publicly available datasets of acted emotional speech, Berlin and LDC, as well as on spontaneous emotional data from the FAU Aibo dataset. On acted data, ranking approaches exhibit significantly better performance compared to SVM classification both in distinguishing a specific emotion from all others and in multi-class prediction. On the spontaneous data, which contains mostly neutral utterances with a relatively small portion of less intense emotional utterances, ranking-based classifiers again achieve much higher precision in identifying emotional utterances than conventional SVM classifiers. In addition, we discuss the complementarity of conventional SVM and ranking-based classifiers. On all three datasets we find dramatically higher accuracy for the test items on whose prediction the two methods agree compared to the accuracy of individual methods. Furthermore on the spontaneous data the ranking and standard classification are complementary and we obtain marked improvement when we combine the two classifiers by late-stage fusion.
Aktaruzzaman, M; Migliorini, M; Tenhunen, M; Himanen, S L; Bianchi, A M; Sassi, R
2015-05-01
The work considers automatic sleep stage classification, based on heart rate variability (HRV) analysis, with a focus on the distinction of wakefulness (WAKE) from sleep and rapid eye movement (REM) from non-REM (NREM) sleep. A set of 20 automatically annotated one-night polysomnographic recordings was considered, and artificial neural networks were selected for classification. For each inter-heartbeat (RR) series, beside features previously presented in literature, we introduced a set of four parameters related to signal regularity. RR series of three different lengths were considered (corresponding to 2, 6, and 10 successive epochs, 30 s each, in the same sleep stage). Two sets of only four features captured 99 % of the data variance in each classification problem, and both of them contained one of the new regularity features proposed. The accuracy of classification for REM versus NREM (68.4 %, 2 epochs; 83.8 %, 10 epochs) was higher than when distinguishing WAKE versus SLEEP (67.6 %, 2 epochs; 71.3 %, 10 epochs). Also, the reliability parameter (Cohens's Kappa) was higher (0.68 and 0.45, respectively). Sleep staging classification based on HRV was still less precise than other staging methods, employing a larger variety of signals collected during polysomnographic studies. However, cheap and unobtrusive HRV-only sleep classification proved sufficiently precise for a wide range of applications.
Integrating conventional and inverse representation for face recognition.
Xu, Yong; Li, Xuelong; Yang, Jian; Lai, Zhihui; Zhang, David
2014-10-01
Representation-based classification methods are all constructed on the basis of the conventional representation, which first expresses the test sample as a linear combination of the training samples and then exploits the deviation between the test sample and the expression result of every class to perform classification. However, this deviation does not always well reflect the difference between the test sample and each class. With this paper, we propose a novel representation-based classification method for face recognition. This method integrates conventional and the inverse representation-based classification for better recognizing the face. It first produces conventional representation of the test sample, i.e., uses a linear combination of the training samples to represent the test sample. Then it obtains the inverse representation, i.e., provides an approximation representation of each training sample of a subject by exploiting the test sample and training samples of the other subjects. Finally, the proposed method exploits the conventional and inverse representation to generate two kinds of scores of the test sample with respect to each class and combines them to recognize the face. The paper shows the theoretical foundation and rationale of the proposed method. Moreover, this paper for the first time shows that a basic nature of the human face, i.e., the symmetry of the face can be exploited to generate new training and test samples. As these new samples really reflect some possible appearance of the face, the use of them will enable us to obtain higher accuracy. The experiments show that the proposed conventional and inverse representation-based linear regression classification (CIRLRC), an improvement to linear regression classification (LRC), can obtain very high accuracy and greatly outperforms the naive LRC and other state-of-the-art conventional representation based face recognition methods. The accuracy of CIRLRC can be 10% greater than that of LRC.
[Extracting black soil border in Heilongjiang province based on spectral angle match method].
Zhang, Xin-Le; Zhang, Shu-Wen; Li, Ying; Liu, Huan-Jun
2009-04-01
As soils are generally covered by vegetation most time of a year, the spectral reflectance collected by remote sensing technique is from the mixture of soil and vegetation, so the classification precision based on remote sensing (RS) technique is unsatisfied. Under RS and geographic information systems (GIS) environment and with the help of buffer and overlay analysis methods, land use and soil maps were used to derive regions of interest (ROI) for RS supervised classification, which plus MODIS reflectance products were chosen to extract black soil border, with methods including spectral single match. The results showed that the black soil border in Heilongjiang province can be extracted with soil remote sensing method based on MODIS reflectance products, especially in the north part of black soil zone; the classification precision of spectral angel mapping method is the highest, but the classifying accuracy of other soils can not meet the need, because of vegetation covering and similar spectral characteristics; even for the same soil, black soil, the classifying accuracy has obvious spatial heterogeneity, in the north part of black soil zone in Heilongjiang province it is higher than in the south, which is because of spectral differences; as soil uncovering period in Northeastern China is relatively longer, high temporal resolution make MODIS images get the advantage over soil remote sensing classification; with the help of GIS, extracting ROIs by making the best of auxiliary data can improve the precision of soil classification; with the help of auxiliary information, such as topography and climate, the classification accuracy was enhanced significantly. As there are five main factors determining soil classes, much data of different types, such as DEM, terrain factors, climate (temperature, precipitation, etc.), parent material, vegetation map, and remote sensing images, were introduced to classify soils, so how to choose some of the data and quantify the weights of different data layers needs further study.
Segmentation of oil spills in SAR images by using discriminant cuts
NASA Astrophysics Data System (ADS)
Ding, Xianwen; Zou, Xiaolin
2018-02-01
The discriminant cut is used to segment the oil spills in synthetic aperture radar (SAR) images. The proposed approach is a region-based one, which is able to capture and utilize spatial information in SAR images. The real SAR images, i.e. ALOS-1 PALSAR and Sentinel-1 SAR images were collected and used to validate the accuracy of the proposed approach for oil spill segmentation in SAR images. The accuracy of the proposed approach is higher than that of the fuzzy C-means classification method.
Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok
2016-12-05
High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was to twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5-DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Li, Pengfei; Jiang, Yongying; Xiang, Jiawei
2014-01-01
To deal with the difficulty to obtain a large number of fault samples under the practical condition for mechanical fault diagnosis, a hybrid method that combined wavelet packet decomposition and support vector classification (SVC) is proposed. The wavelet packet is employed to decompose the vibration signal to obtain the energy ratio in each frequency band. Taking energy ratios as feature vectors, the pattern recognition results are obtained by the SVC. The rolling bearing and gear fault diagnostic results of the typical experimental platform show that the present approach is robust to noise and has higher classification accuracy and, thus, provides a better way to diagnose mechanical faults under the condition of small fault samples. PMID:24688361
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT
ERIC Educational Resources Information Center
Lathrop, Quinn N.
2015-01-01
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remains decisions a researcher faces when…
Variance estimates and confidence intervals for the Kappa measure of classification accuracy
M. A. Kalkhan; R. M. Reich; R. L. Czaplewski
1997-01-01
The Kappa statistic is frequently used to characterize the results of an accuracy assessment used to evaluate land use and land cover classifications obtained by remotely sensed data. This statistic allows comparisons of alternative sampling designs, classification algorithms, photo-interpreters, and so forth. In order to make these comparisons, it is...
Salvatore, C; Cerasa, A; Castiglioni, I; Gallivanone, F; Augimeri, A; Lopez, M; Arabia, G; Morelli, M; Gilardi, M C; Quattrone, A
2014-01-30
Supervised machine learning has been proposed as a revolutionary approach for identifying sensitive medical image biomarkers (or combination of them) allowing for automatic diagnosis of individual subjects. The aim of this work was to assess the feasibility of a supervised machine learning algorithm for the assisted diagnosis of patients with clinically diagnosed Parkinson's disease (PD) and Progressive Supranuclear Palsy (PSP). Morphological T1-weighted Magnetic Resonance Images (MRIs) of PD patients (28), PSP patients (28) and healthy control subjects (28) were used by a supervised machine learning algorithm based on the combination of Principal Components Analysis as feature extraction technique and on Support Vector Machines as classification algorithm. The algorithm was able to obtain voxel-based morphological biomarkers of PD and PSP. The algorithm allowed individual diagnosis of PD versus controls, PSP versus controls and PSP versus PD with an Accuracy, Specificity and Sensitivity>90%. Voxels influencing classification between PD and PSP patients involved midbrain, pons, corpus callosum and thalamus, four critical regions known to be strongly involved in the pathophysiological mechanisms of PSP. Classification accuracy of individual PSP patients was consistent with previous manual morphological metrics and with other supervised machine learning application to MRI data, whereas accuracy in the detection of individual PD patients was significantly higher with our classification method. The algorithm provides excellent discrimination of PD patients from PSP patients at an individual level, thus encouraging the application of computer-based diagnosis in clinical practice. Copyright © 2013 Elsevier B.V. All rights reserved.
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Human Epithelial-2 (HEp-2) cell images staining patterns classification have been widely used to identify autoimmune diseases by the anti-Nuclear antibodies (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because manual test is time consuming, subjective and labor intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are developing. However, methods proposed recently are mostly manual features extraction with low accuracy. Besides, the scale of available benchmark datasets is small, which does not exactly suitable for using deep learning methods. This issue will influence the accuracy of cell classification directly even after data augmentation. To address these issues, this paper presents a high accuracy automatic HEp-2 cell classification method with small datasets, by utilizing very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results over two benchmark datasets demonstrate that the proposed method achieves superior performance in terms of accuracy compared with existing methods.
2011-01-01
Background The accuracy of echocardiography versus surgical and pathological classification of patients with ruptured mitral chordae tendineae (RMCT) has not yet been investigated with a large study. Methods Clinical, hemodynamic, surgical, and pathological findings were reviewed for 242 patients with a preoperative diagnosis of RMCT that required mitral valvular surgery. Subjects were consecutive in-patients at Fuwai Hospital in 2002-2008. Patients were evaluated by thoracic echocardiography (TTE) and transesophageal echocardiography (TEE). RMCT cases were classified by location as anterior or posterior, and classified by degree as partial or complete RMCT, according to surgical findings. RMCT cases were also classified by pathology into four groups: myxomatous degeneration, chronic rheumatic valvulitis (CRV), infective endocarditis and others. Results Echocardiography showed that most patients had a flail mitral valve, moderate to severe mitral regurgitation, a dilated heart chamber, mild to moderate pulmonary artery hypertension and good heart function. The diagnostic accuracy for RMCT was 96.7% for TTE and 100% for TEE compared with surgical findings. Preliminary experiments demonstrated that the sensitivity and specificity of diagnosing anterior, posterior and partial RMCT were high, but the sensitivity of diagnosing complete RMCT was low. Surgical procedures for RMCT depended on the location of ruptured chordae tendineae, with no relationship between surgical procedure and complete or partial RMCT. The echocardiographic characteristics of RMCT included valvular thickening, extended subvalvular chordae, echo enhancement, abnormal echo or vegetation, combined with aortic valve damage in the four groups classified by pathology. The incidence of extended subvalvular chordae in the myxomatous group was higher than that in the other groups, and valve thickening in combination with AV damage in the CRV group was higher than that in the other groups. Infective endocarditis patients were younger than those in the other groups. Furthermore, compared other groups, the CRV group had a larger left atrium, higher aortic velocity, and a higher pulmonary arterial systolic pressure. Conclusions Echocardiography is a reliable method for diagnosing RMCT and is useful for classification. Echocardiography can be used to guide surgical procedures and for preliminary determination of RMCT pathological types. PMID:21801375
On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
In view of the problem of the on-line measurement of algae classification, a method of algae classification and concentration determination based on the discrete three-dimensional fluorescence spectra was studied in this work. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, the discrete three-dimensional standard spectra of five categories were built, and the recognition, classification and concentration prediction of algae categories were realized by the discrete three-dimensional fluorescence spectra coupled with non-negative weighted least squares linear regression analysis. The results show that similarities between discrete three-dimensional standard spectra of different categories were reduced and the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. By comparing with that of the chlorophyll a fluorescence excitation spectra method, the recognition accuracy rate in pure samples by discrete three-dimensional fluorescence spectra is improved 1.38%, and the recovery rate and classification accuracy in pure diatom samples 34.1% and 46.8%, respectively; the recognition accuracy rate of mixed samples by discrete-three dimensional fluorescence spectra is enhanced by 26.1%, the recovery rate of mixed samples with Chlorophyta 37.8%, and the classification accuracy of mixed samples with diatoms 54.6%.
Belgiu, Mariana; Dr Guţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco
2016-10-01
The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.
NASA Astrophysics Data System (ADS)
Teffahi, Hanane; Yao, Hongxun; Belabid, Nasreddine; Chaib, Souleyman
2018-02-01
The satellite images with very high spatial resolution have been recently widely used in image classification topic as it has become challenging task in remote sensing field. Due to a number of limitations such as the redundancy of features and the high dimensionality of the data, different classification methods have been proposed for remote sensing images classification particularly the methods using feature extraction techniques. This paper propose a simple efficient method exploiting the capability of extended multi-attribute profiles (EMAP) with sparse autoencoder (SAE) for remote sensing image classification. The proposed method is used to classify various remote sensing datasets including hyperspectral and multispectral images by extracting spatial and spectral features based on the combination of EMAP and SAE by linking them to kernel support vector machine (SVM) for classification. Experiments on new hyperspectral image "Huston data" and multispectral image "Washington DC data" shows that this new scheme can achieve better performance of feature learning than the primitive features, traditional classifiers and ordinary autoencoder and has huge potential to achieve higher accuracy for classification in short running time.
Comparing Features for Classification of MEG Responses to Motor Imagery
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Background Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. Methods MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio—spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. Results The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. Conclusions We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system. PMID:27992574
Kumar, Surendra; Ghosh, Subhojit; Tetarway, Suhash; Sinha, Rakesh Kumar
2015-07-01
In this study, the magnitude and spatial distribution of frequency spectrum in the resting electroencephalogram (EEG) were examined to address the problem of detecting alcoholism in the cerebral motor cortex. The EEG signals were recorded from chronic alcoholic conditions (n = 20) and the control group (n = 20). Data were taken from motor cortex region and divided into five sub-bands (delta, theta, alpha, beta-1 and beta-2). Three methodologies were adopted for feature extraction: (1) absolute power, (2) relative power and (3) peak power frequency. The dimension of the extracted features is reduced by linear discrimination analysis and classified by support vector machine (SVM) and fuzzy C-mean clustering. The maximum classification accuracy (88 %) with SVM clustering was achieved with the EEG spectral features with absolute power frequency on F4 channel. Among the bands, relatively higher classification accuracy was found over theta band and beta-2 band in most of the channels when computed with the EEG features of relative power. Electrodes wise CZ, C3 and P4 were having more alteration. Considering the good classification accuracy obtained by SVM with relative band power features in most of the EEG channels of motor cortex, it can be suggested that the noninvasive automated online diagnostic system for the chronic alcoholic condition can be developed with the help of EEG signals.
Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai
2017-01-01
Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed. PMID:28737705
Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A
2013-08-01
In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively. PMID:28326009
Chen, Chien P; Braunstein, Steve; Mourad, Michelle; Hsu, I-Chow J; Haas-Kogan, Daphne; Roach, Mack; Fogh, Shannon E
2015-01-01
Accurate International Classification of Diseases (ICD) diagnosis coding is critical for patient care, billing purposes, and research endeavors. In this single-institution study, we evaluated our baseline ICD-9 (9th revision) diagnosis coding accuracy, identified the most common errors contributing to inaccurate coding, and implemented a multimodality strategy to improve radiation oncology coding. We prospectively studied ICD-9 coding accuracy in our radiation therapy--specific electronic medical record system. Baseline ICD-9 coding accuracy was obtained from chart review targeting ICD-9 coding accuracy of all patients treated at our institution between March and June of 2010. To improve performance an educational session highlighted common coding errors, and a user-friendly software tool, RadOnc ICD Search, version 1.0, for coding radiation oncology specific diagnoses was implemented. We then prospectively analyzed ICD-9 coding accuracy for all patients treated from July 2010 to June 2011, with the goal of maintaining 80% or higher coding accuracy. Data on coding accuracy were analyzed and fed back monthly to individual providers. Baseline coding accuracy for physicians was 463 of 661 (70%) cases. Only 46% of physicians had coding accuracy above 80%. The most common errors involved metastatic cases, whereby primary or secondary site ICD-9 codes were either incorrect or missing, and special procedures such as stereotactic radiosurgery cases. After implementing our project, overall coding accuracy rose to 92% (range, 86%-96%). The median accuracy for all physicians was 93% (range, 77%-100%) with only 1 attending having accuracy below 80%. Incorrect primary and secondary ICD-9 codes in metastatic cases showed the most significant improvement (10% vs 2% after intervention). Identifying common coding errors and implementing both education and systems changes led to significantly improved coding accuracy. This quality assurance project highlights the potential problem of ICD-9 coding accuracy by physicians and offers an approach to effectively address this shortcoming. Copyright © 2015. Published by Elsevier Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Getman, Daniel J
2008-01-01
Many attempts to observe changes in terrestrial systems over time would be significantly enhanced if it were possible to improve the accuracy of classifications of low-resolution historic satellite data. In an effort to examine improving the accuracy of historic satellite image classification by combining satellite and air photo data, two experiments were undertaken in which low-resolution multispectral data and high-resolution panchromatic data were combined and then classified using the ECHO spectral-spatial image classification algorithm and the Maximum Likelihood technique. The multispectral data consisted of 6 multispectral channels (30-meter pixel resolution) from Landsat 7. These data were augmented with panchromatic datamore » (15m pixel resolution) from Landsat 7 in the first experiment, and with a mosaic of digital aerial photography (1m pixel resolution) in the second. The addition of the Landsat 7 panchromatic data provided a significant improvement in the accuracy of classifications made using the ECHO algorithm. Although the inclusion of aerial photography provided an improvement in accuracy, this improvement was only statistically significant at a 40-60% level. These results suggest that once error levels associated with combining aerial photography and multispectral satellite data are reduced, this approach has the potential to significantly enhance the precision and accuracy of classifications made using historic remotely sensed data, as a way to extend the time range of efforts to track temporal changes in terrestrial systems.« less
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
ERIC Educational Resources Information Center
Wyse, Adam E.; Babcock, Ben
2016-01-01
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
NASA Astrophysics Data System (ADS)
Mafanya, Madodomzi; Tsele, Philemon; Botai, Joel; Manyama, Phetole; Swart, Barend; Monate, Thabang
2017-07-01
Invasive alien plants (IAPs) not only pose a serious threat to biodiversity and water resources but also have impacts on human and animal wellbeing. To support decision making in IAPs monitoring, semi-automated image classifiers which are capable of extracting valuable information in remotely sensed data are vital. This study evaluated the mapping accuracies of supervised and unsupervised image classifiers for mapping Harrisia pomanensis (a cactus plant commonly known as the Midnight Lady) using two interlinked evaluation strategies i.e. point and area based accuracy assessment. Results of the point-based accuracy assessment show that with reference to 219 ground control points, the supervised image classifiers (i.e. Maxver and Bhattacharya) mapped H. pomanensis better than the unsupervised image classifiers (i.e. K-mediuns, Euclidian Length and Isoseg). In this regard, user and producer accuracies were 82.4% and 84% respectively for the Maxver classifier. The user and producer accuracies for the Bhattacharya classifier were 90% and 95.7%, respectively. Though the Maxver produced a higher overall accuracy and Kappa estimate than the Bhattacharya classifier, the Maxver Kappa estimate of 0.8305 is not significantly (statistically) greater than the Bhattacharya Kappa estimate of 0.8088 at a 95% confidence interval. The area based accuracy assessment results show that the Bhattacharya classifier estimated the spatial extent of H. pomanensis with an average mapping accuracy of 86.1% whereas the Maxver classifier only gave an average mapping accuracy of 65.2%. Based on these results, the Bhattacharya classifier is therefore recommended for mapping H. pomanensis. These findings will aid in the algorithm choice making for the development of a semi-automated image classification system for mapping IAPs.
A new classification scheme of plastic wastes based upon recycling labels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Özkan, Kemal, E-mail: kozkan@ogu.edu.tr; Ergin, Semih, E-mail: sergin@ogu.edu.tr; Işık, Şahin, E-mail: sahini@ogu.edu.tr
Highlights: • PET, HPDE or PP types of plastics are considered. • An automated classification of plastic bottles based on the feature extraction and classification methods is performed. • The decision mechanism consists of PCA, Kernel PCA, FLDA, SVD and Laplacian Eigenmaps methods. • SVM is selected to achieve the classification task and majority voting technique is used. - Abstract: Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize thesemore » materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, the photographs of the plastic bottles have been taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, the morphological image operations are implemented. These operations are edge detection, noise removal, hole removing, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of the combinations of erosion and dilation. The effect of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastics are considered due to higher existence ratio of them than the other plastic types in the world. The decision mechanism consists of five different feature extraction methods including as Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher’s Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP) and uses a simple experimental setup with a camera and homogenous backlighting. Due to the giving global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP.« less
Zhan, L.; Liu, Y.; Zhou, J.; Ye, J.; Thompson, P.M.
2015-01-01
Mild cognitive impairment (MCI) is an intermediate stage between normal aging and Alzheimer's disease (AD), and around 10-15% of people with MCI develop AD each year. More recently, MCI has been further subdivided into early and late stages, and there is interest in identifying sensitive brain imaging biomarkers that help to differentiate stages of MCI. Here, we focused on anatomical brain networks computed from diffusion MRI and proposed a new feature extraction and classification framework based on higher order singular value decomposition and sparse logistic regression. In tests on publicly available data from the Alzheimer's Disease Neuroimaging Initiative, our proposed framework showed promise in detecting brain network differences that help in classifying early versus late MCI. PMID:26413202
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. ?? 2009 American Society for Photogrammetry and Remote Sensing.
Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B
2015-03-01
Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.
Compensatory neurofuzzy model for discrete data classification in biomedical
NASA Astrophysics Data System (ADS)
Ceylan, Rahime
2015-03-01
Biomedical data is separated to two main sections: signals and discrete data. So, studies in this area are about biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models which are relevant to classification of ECG, EMG or EEG signals. In same way, in literature, many models exist for classification of discrete data taken as value of samples which can be results of blood analysis or biopsy in medical process. Each algorithm could not achieve high accuracy rate on classification of signal and discrete data. In this study, compensatory neurofuzzy network model is presented for classification of discrete data in biomedical pattern recognition area. The compensatory neurofuzzy network has a hybrid and binary classifier. In this system, the parameters of fuzzy systems are updated by backpropagation algorithm. The realized classifier model is conducted to two benchmark datasets (Wisconsin Breast Cancer dataset and Pima Indian Diabetes dataset). Experimental studies show that compensatory neurofuzzy network model achieved 96.11% accuracy rate in classification of breast cancer dataset and 69.08% accuracy rate was obtained in experiments made on diabetes dataset with only 10 iterations.
Major Depression Detection from EEG Signals Using Kernel Eigen-Filter-Bank Common Spatial Patterns.
Liao, Shih-Cheng; Wu, Chien-Te; Huang, Hao-Chuan; Cheng, Wei-Teng; Liu, Yi-Hung
2017-06-14
Major depressive disorder (MDD) has become a leading contributor to the global burden of disease; however, there are currently no reliable biological markers or physiological measurements for efficiently and effectively dissecting the heterogeneity of MDD. Here we propose a novel method based on scalp electroencephalography (EEG) signals and a robust spectral-spatial EEG feature extractor called kernel eigen-filter-bank common spatial pattern (KEFB-CSP). The KEFB-CSP first filters the multi-channel raw EEG signals into a set of frequency sub-bands covering the range from theta to gamma bands, then spatially transforms the EEG signals of each sub-band from the original sensor space to a new space where the new signals (i.e., CSPs) are optimal for the classification between MDD and healthy controls, and finally applies the kernel principal component analysis (kernel PCA) to transform the vector containing the CSPs from all frequency sub-bands to a lower-dimensional feature vector called KEFB-CSP. Twelve patients with MDD and twelve healthy controls participated in this study, and from each participant we collected 54 resting-state EEGs of 6 s length (5 min and 24 s in total). Our results show that the proposed KEFB-CSP outperforms other EEG features including the powers of EEG frequency bands, and fractal dimension, which had been widely applied in previous EEG-based depression detection studies. The results also reveal that the 8 electrodes from the temporal areas gave higher accuracies than other scalp areas. The KEFB-CSP was able to achieve an average EEG classification accuracy of 81.23% in single-trial analysis when only the 8-electrode EEGs of the temporal area and a support vector machine (SVM) classifier were used. We also designed a voting-based leave-one-participant-out procedure to test the participant-independent individual classification accuracy. The voting-based results show that the mean classification accuracy of about 80% can be achieved by the KEFP-CSP feature and the SVM classifier with only several trials, and this level of accuracy seems to become stable as more trials (i.e., <7 trials) are used. These findings therefore suggest that the proposed method has a great potential for developing an efficient (required only a few 6-s EEG signals from the 8 electrodes over the temporal) and effective (~80% classification accuracy) EEG-based brain-computer interface (BCI) system which may, in the future, help psychiatrists provide individualized and effective treatments for MDD patients.
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Sadowski, F. E.; Sarno, J. E.
1976-01-01
The author has identified the following significant results. A supervised classification within two separate ground areas of the Sam Houston National Forest was carried out for two sq meters spatial resolution MSS data. Data were progressively coarsened to simulate five additional cases of spatial resolution ranging up to 64 sq meters. Similar processing and analysis of all spatial resolutions enabled evaluations of the effect of spatial resolution on classification accuracy for various levels of detail and the effects on area proportion estimation for very general forest features. For very coarse resolutions, a subset of spectral channels which simulated the proposed thematic mapper channels was used to study classification accuracy.
The use of Landsat data to inventory cotton and soybean acreage in North Alabama
NASA Technical Reports Server (NTRS)
Downs, S. W., Jr.; Faust, N. L.
1980-01-01
This study was performed to determine if Landsat data could be used to improve the accuracy of the estimation of cotton acreage. A linear classification algorithm and a maximum likelihood algorithm were used for computer classification of the area, and the classification was compared with ground truth. The classification accuracy for some fields was greater than 90 percent; however, the overall accuracy was 71 percent for cotton and 56 percent for soybeans. The results of this research indicate that computer analysis of Landsat data has potential for improving upon the methods presently being used to determine cotton acreage; however, additional experiments and refinements are needed before the method can be used operationally.
NASA Technical Reports Server (NTRS)
Card, Don H.; Strong, Laurence L.
1989-01-01
An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
Computational approaches for the classification of seed storage proteins.
Radhika, V; Rao, V Sree Hari
2015-07-01
Seed storage proteins comprise a major part of the protein content of the seed and have an important role on the quality of the seed. These storage proteins are important because they determine the total protein content and have an effect on the nutritional quality and functional properties for food processing. Transgenic plants are being used to develop improved lines for incorporation into plant breeding programs and the nutrient composition of seeds is a major target of molecular breeding programs. Hence, classification of these proteins is crucial for the development of superior varieties with improved nutritional quality. In this study we have applied machine learning algorithms for classification of seed storage proteins. We have presented an algorithm based on nearest neighbor approach for classification of seed storage proteins and compared its performance with decision tree J48, multilayer perceptron neural (MLP) network and support vector machine (SVM) libSVM. The model based on our algorithm has been able to give higher classification accuracy in comparison to the other methods.
NASA Astrophysics Data System (ADS)
Schudlo, Larissa C.; Chau, Tom
2015-12-01
Objective. The majority of near-infrared spectroscopy (NIRS) brain-computer interface (BCI) studies have investigated binary classification problems. Limited work has considered differentiation of more than two mental states, or multi-class differentiation of higher-level cognitive tasks using measurements outside of the anterior prefrontal cortex. Improvements in accuracies are needed to deliver effective communication with a multi-class NIRS system. We investigated the feasibility of a ternary NIRS-BCI that supports mental states corresponding to verbal fluency task (VFT) performance, Stroop task performance, and unconstrained rest using prefrontal and parietal measurements. Approach. Prefrontal and parietal NIRS signals were acquired from 11 able-bodied adults during rest and performance of the VFT or Stroop task. Classification was performed offline using bagging with a linear discriminant base classifier trained on a 10 dimensional feature set. Main results. VFT, Stroop task and rest were classified at an average accuracy of 71.7% ± 7.9%. The ternary classification system provided a statistically significant improvement in information transfer rate relative to a binary system controlled by either mental task (0.87 ± 0.35 bits/min versus 0.73 ± 0.24 bits/min). Significance. These results suggest that effective communication can be achieved with a ternary NIRS-BCI that supports VFT, Stroop task and rest via measurements from the frontal and parietal cortices. Further development of such a system is warranted. Accurate ternary classification can enhance communication rates offered by NIRS-BCIs, improving the practicality of this technology.
Semi-supervised SVM for individual tree crown species classification
NASA Astrophysics Data System (ADS)
Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik
2015-12-01
In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan; Liu, Jingjing
2018-01-18
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33-100%, and ELM, with an accuracy rate of 98.01-100%. For level assessment, the R² related to the training set was above 0.97 and the R² related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016-0.3494, lower than the error of 0.5-1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.
Liu, Rong
2017-01-01
Obtaining a fast and reliable decision is an important issue in brain-computer interfaces (BCI), particularly in practical real-time applications such as wheelchair or neuroprosthetic control. In this study, the EEG signals were firstly analyzed with a power projective base method. Then we were applied a decision-making model, the sequential probability ratio testing (SPRT), for single-trial classification of motor imagery movement events. The unique strength of this proposed classification method lies in its accumulative process, which increases the discriminative power as more and more evidence is observed over time. The properties of the method were illustrated on thirteen subjects' recordings from three datasets. Results showed that our proposed power projective method outperformed two benchmark methods for every subject. Moreover, with sequential classifier, the accuracies across subjects were significantly higher than that with nonsequential ones. The average maximum accuracy of the SPRT method was 84.1%, as compared with 82.3% accuracy for the sequential Bayesian (SB) method. The proposed SPRT method provides an explicit relationship between stopping time, thresholds, and error, which is important for balancing the time-accuracy trade-off. These results suggest SPRT would be useful in speeding up decision-making while trading off errors in BCI. PMID:29348781
Classification accuracy for stratification with remotely sensed data
Raymond L. Czaplewski; Paul L. Patterson
2003-01-01
Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an âerror matrix,â which is...
NASA Astrophysics Data System (ADS)
Starkey, Andrew; Usman Ahmad, Aliyu; Hamdoun, Hassan
2017-10-01
This paper investigates the application of a novel method for classification called Feature Weighted Self Organizing Map (FWSOM) that analyses the topology information of a converged standard Self Organizing Map (SOM) to automatically guide the selection of important inputs during training for improved classification of data with redundant inputs, examined against two traditional approaches namely neural networks and Support Vector Machines (SVM) for the classification of EEG data as presented in previous work. In particular, the novel method looks to identify the features that are important for classification automatically, and in this way the important features can be used to improve the diagnostic ability of any of the above methods. The paper presents the results and shows how the automated identification of the important features successfully identified the important features in the dataset and how this results in an improvement of the classification results for all methods apart from linear discriminatory methods which cannot separate the underlying nonlinear relationship in the data. The FWSOM in addition to achieving higher classification accuracy has given insights into what features are important in the classification of each class (left and right-hand movements), and these are corroborated by already published work in this area.
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on Sleep-EDF dataset with various classification settings. The overall accuracy for Awake/Sleep and 4-class classification setting are 98.32% and 94.49% respectively. Furthermore, the superior accuracy is achieved by performing classification on a low-dimensional feature space derived from time and frequency domains and without the need for artifact removal as a preprocessing step.
ERIC Educational Resources Information Center
Bramley, Tom
2010-01-01
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but a time-consuming and tedious work was performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages performed by radiologists is depended on their experience only, so the computer-aided technology is very important to aid with the diagnosis accuracy. In this study, to improve the performance of tumor detection, we investigated comparative approach of different segmentation techniques and selected the best one by comparing their segmentation score. Further, to improve the classification accuracy, the genetic algorithm is employed for the automatic classification of tumor stage. The decision of classification stage is supported by extracting relevant features and area calculation. The experimental results of proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93 demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average of 93.79% dice similarity index coefficient, which indicates better overlap between the automated extracted tumor regions with manually extracted tumor region by radiologists.
NASA Astrophysics Data System (ADS)
Kurniawan, Dian; Suparti; Sugito
2018-05-01
Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia has reached 237.6 million people. Therefore, to control the population growth rate, the government hold Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to manifest prosperous society by controlling births while ensuring control of population growth. The data used in this study is the updated family data of Semarang city in 2016 that conducted by National Family Planning Coordinating Board (BKKBN). From these data, classifiers with kernel discriminant analysis will be obtained, and also classification accuracy will be obtained from that method. The result of the analysis showed that normal kernel discriminant analysis gives 71.05 % classification accuracy with 28.95 % classification error. Whereas triweight kernel discriminant analysis gives 73.68 % classification accuracy with 26.32 % classification error. Using triweight kernel discriminant for data preprocessing of family planning participation of childbearing age couples in Semarang City of 2016 can be stated better than with normal kernel discriminant.
Madison, Matthew J; Bradshaw, Laine P
2015-06-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other multidimensional measurement models. A priori specifications of which latent characteristics or attributes are measured by each item are a core element of the diagnostic assessment design. This item-attribute alignment, expressed in a Q-matrix, precedes and supports any inference resulting from the application of the diagnostic classification model. This study investigates the effects of Q-matrix design on classification accuracy for the log-linear cognitive diagnosis model. Results indicate that classification accuracy, reliability, and convergence rates improve when the Q-matrix contains isolated information from each measured attribute.
The study of vehicle classification equipment with solutions to improve accuracy in Oklahoma.
DOT National Transportation Integrated Search
2014-12-01
The accuracy of vehicle counting and classification data is vital for appropriate future highway and road : design, including determining pavement characteristics, eliminating traffic jams, and improving safety. : Organizations relying on vehicle cla...
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Ahmad, Iftikhar; Ahmad, Manzoor; Khan, Karim; Ikram, Masroor
2016-06-01
Optical polarimetry was employed for assessment of ex vivo healthy and basal cell carcinoma (BCC) tissue samples from human skin. Polarimetric analyses revealed that depolarization and retardance for healthy tissue group were significantly higher (p<0.001) compared to BCC tissue group. Histopathology indicated that these differences partially arise from BCC-related characteristic changes in tissue morphology. Wilks lambda statistics demonstrated the potential of all investigated polarimetric properties for computer assisted classification of the two tissue groups. Based on differences in polarimetric properties, partial least square (PLS) regression classified the samples with 100% accuracy, sensitivity and specificity. These findings indicate that optical polarimetry together with PLS statistics hold promise for automated pathology classification. Copyright © 2016 Elsevier B.V. All rights reserved.
Sea ice classification using fast learning neural networks
NASA Technical Reports Server (NTRS)
Dawson, M. S.; Fung, A. K.; Manry, M. T.
1992-01-01
A first learning neural network approach to the classification of sea ice is presented. The fast learning (FL) neural network and a multilayer perceptron (MLP) trained with backpropagation learning (BP network) were tested on simulated data sets based on the known dominant scattering characteristics of the target class. Four classes were used in the data simulation: open water, thick lossy saline ice, thin saline ice, and multiyear ice. The BP network was unable to consistently converge to less than 25 percent error while the FL method yielded an average error of approximately 1 percent on the first iteration of training. The fast learning method presented can significantly reduce the CPU time necessary to train a neural network as well as consistently yield higher classification accuracy than BP networks.
Byun, Wonwoo; Lee, Jung-Min; Kim, Youngwon; Brusseau, Timothy A
2018-03-26
This study examined the accuracy of the Fitbit activity tracker (FF) for quantifying sedentary behavior (SB) and varying intensities of physical activity (PA) in 3-5-year-old children. Twenty-eight healthy preschool-aged children (Girls: 46%, Mean age: 4.8 ± 1.0 years) wore the FF and were directly observed while performing a set of various unstructured and structured free-living activities from sedentary to vigorous intensity. The classification accuracy of the FF for measuring SB, light PA (LPA), moderate-to-vigorous PA (MVPA), and total PA (TPA) was examined calculating Pearson correlation coefficients (r), mean absolute percent error (MAPE), Cohen's kappa ( k ), sensitivity (Se), specificity (Sp), and area under the receiver operating curve (ROC-AUC). The classification accuracies of the FF (ROC-AUC) were 0.92, 0.63, 0.77 and 0.92 for SB, LPA, MVPA and TPA, respectively. Similarly, values of kappa, Se, Sp and percentage of correct classification were consistently high for SB and TPA, but low for LPA and MVPA. The FF demonstrated excellent classification accuracy for assessing SB and TPA, but lower accuracy for classifying LPA and MVPA. Our findings suggest that the FF should be considered as a valid instrument for assessing time spent sedentary and overall physical activity in preschool-aged children.
NASA Astrophysics Data System (ADS)
Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi
2012-03-01
We aim at using a new texton based texture classification method in the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Different from conventional computer-aided diagnosis (CAD) pulmonary emphysema classification methods, in this paper, firstly, the dictionary of texton is learned via applying sparse representation(SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture presentations. Finally, classification is performed by using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is comparably higher than state of the art method based on the basic rotation invariant local binary pattern histograms and the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
Hripcsak, George; Knirsch, Charles; Zhou, Li; Wilcox, Adam; Melton, Genevieve B
2007-03-01
Data mining in electronic medical records may facilitate clinical research, but much of the structured data may be miscoded, incomplete, or non-specific. The exploitation of narrative data using natural language processing may help, although nesting, varying granularity, and repetition remain challenges. In a study of community-acquired pneumonia using electronic records, these issues led to poor classification. Limiting queries to accurate, complete records led to vastly reduced, possibly biased samples. We exploited knowledge latent in the electronic records to improve classification. A similarity metric was used to cluster cases. We defined discordance as the degree to which cases within a cluster give different answers for some query that addresses a classification task of interest. Cases with higher discordance are more likely to be incorrectly classified, and can be reviewed manually to adjust the classification, improve the query, or estimate the likely accuracy of the query. In a study of pneumonia--in which the ICD9-CM coding was found to be very poor--the discordance measure was statistically significantly correlated with classification correctness (.45; 95% CI .15-.62).
Li, Jinyan; Fong, Simon; Sung, Yunsick; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L
2016-01-01
An imbalanced dataset is defined as a training dataset that has imbalanced proportions of data in both interesting and uninteresting classes. Often in biomedical applications, samples from the stimulating class are rare in a population, such as medical anomalies, positive clinical tests, and particular diseases. Although the target samples in the primitive dataset are small in number, the induction of a classification model over such training data leads to poor prediction performance due to insufficient training from the minority class. In this paper, we use a novel class-balancing method named adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique (ASCB_DmSMOTE) to solve this imbalanced dataset problem, which is common in biomedical applications. The proposed method combines under-sampling and over-sampling into a swarm optimisation algorithm. It adaptively selects suitable parameters for the rebalancing algorithm to find the best solution. Compared with the other versions of the SMOTE algorithm, significant improvements, which include higher accuracy and credibility, are observed with ASCB_DmSMOTE. Our proposed method tactfully combines two rebalancing techniques together. It reasonably re-allocates the majority class in the details and dynamically optimises the two parameters of SMOTE to synthesise a reasonable scale of minority class for each clustered sub-imbalanced dataset. The proposed methods ultimately overcome other conventional methods and attains higher credibility with even greater accuracy of the classification model.
Ground Truth Sampling and LANDSAT Accuracy Assessment
NASA Technical Reports Server (NTRS)
Robinson, J. W.; Gunther, F. J.; Campbell, W. J.
1982-01-01
It is noted that the key factor in any accuracy assessment of remote sensing data is the method used for determining the ground truth, independent of the remote sensing data itself. The sampling and accuracy procedures developed for nuclear power plant siting study are described. The purpose of the sampling procedure was to provide data for developing supervised classifications for two study sites and for assessing the accuracy of that and the other procedures used. The purpose of the accuracy assessment was to allow the comparison of the cost and accuracy of various classification procedures as applied to various data types.
Abuassba, Adnan O M; Zhang, Dezheng; Luo, Xiong; Shaheryar, Ahmad; Ali, Hazrat
2017-01-01
Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L 2 -norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets.
Abuassba, Adnan O. M.; Ali, Hazrat
2017-01-01
Extreme Learning Machine (ELM) is a fast-learning algorithm for a single-hidden layer feedforward neural network (SLFN). It often has good generalization performance. However, there are chances that it might overfit the training data due to having more hidden nodes than needed. To address the generalization performance, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of training data selected through random resampling. The proposed AELM-Ensemble is evolved by employing an objective function of increasing diversity and accuracy among the final ensemble. Finally, the class label of unseen data is predicted using majority vote approach. Splitting the training data into subsets and incorporation of heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and a lower number of base classifiers, as compared to other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets. PMID:28546808
Can single classifiers be as useful as model ensembles to produce benthic seabed substratum maps?
NASA Astrophysics Data System (ADS)
Turner, Joseph A.; Babcock, Russell C.; Hovey, Renae; Kendrick, Gary A.
2018-05-01
Numerous machine-learning classifiers are available for benthic habitat map production, which can lead to different results. This study highlights the performance of the Random Forest (RF) classifier, which was significantly better than Classification Trees (CT), Naïve Bayes (NB), and a multi-model ensemble in terms of overall accuracy, Balanced Error Rate (BER), Kappa, and area under the curve (AUC) values. RF accuracy was often higher than 90% for each substratum class, even at the most detailed level of the substratum classification and AUC values also indicated excellent performance (0.8-1). Total agreement between classifiers was high at the broadest level of classification (75-80%) when differentiating between hard and soft substratum. However, this sharply declined as the number of substratum categories increased (19-45%) including a mix of rock, gravel, pebbles, and sand. The model ensemble, produced from the results of all three classifiers by majority voting, did not show any increase in predictive performance when compared to the single RF classifier. This study shows how a single classifier may be sufficient to produce benthic seabed maps and model ensembles of multiple classifiers.
Multi-Temporal Classification and Change Detection Using Uav Images
NASA Astrophysics Data System (ADS)
Makuti, S.; Nex, F.; Yang, M. Y.
2018-05-01
In this paper different methodologies for the classification and change detection of UAV image blocks are explored. UAV is not only the cheapest platform for image acquisition but it is also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state of the art features have been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was analysed using the overall accuracy, where by post classification have an accuracy of up to 62.6 % and the pre classification change detection have an accuracy of 46.5 %. These results represent a first useful indication for future works and developments.
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1984-01-01
An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.
A new classification scheme of plastic wastes based upon recycling labels.
Özkan, Kemal; Ergin, Semih; Işık, Şahin; Işıklı, Idil
2015-01-01
Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, the photographs of the plastic bottles have been taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, the morphological image operations are implemented. These operations are edge detection, noise removal, hole removing, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of the combinations of erosion and dilation. The effect of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastics are considered due to higher existence ratio of them than the other plastic types in the world. The decision mechanism consists of five different feature extraction methods including as Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher's Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP) and uses a simple experimental setup with a camera and homogenous backlighting. Due to the giving global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP. Copyright © 2014 Elsevier Ltd. All rights reserved.
Synergistic Use of WorldView-2 Imagery and Airborne LiDAR Data for Urban Land Cover Classification
NASA Astrophysics Data System (ADS)
Wu, M. F.; Sun, Z. C.; Yang, B.; Yu, S. S.
2017-02-01
There are lots of challenges for deriving urban land cover types for high resolution optical imagery because of spectral similarity of different objects, mixed pixels, shadows of buildings and large tree crowns. In order to reduce these uncertainties, recently, it’s a trend of the classification of urban land cover from multi-source sensors in the field of urban remote sensing. In this study, a hierarchical support vector machine (SVM) classification method was applied to the urban land cover mapping, using the WorldView-2 imagery and airborne Light Detection and Ranging (LiDAR) data. The results showed that: (1) The overall accuracy (OA) and overall kappa (OK) were 72.92% and 0.66 for WorldView-2 imagery alone; while the OA and OK were improved up to 89.44% and 0.87 for the synergistic use of the two types of data source. (2) Buildings and road/parking lots extracted from fused data were more precision and well-shaped. The two classes from fused data were optimally classified with higher producer’s accuracy and user’s accuracy than WorldView-2 imagery alone. The trees were also easily separated from the grasslands when the airborne LiDAR data was added. (3) The fused data could reduce the phenomenon of different spectral character of the complex and detailed objects. It was also helpful to address the problem of shadows from the high-rise buildings. The results from this study indicate that the synergistic use of high resolution optical imagery and airborne LiDAR data can be an efficient approach to improving the classification of urban land cover.
NASA Astrophysics Data System (ADS)
Meerdink, S.; Roberts, D. A.; Roth, K. L.
2015-12-01
Accurate knowledge of the spatial distribution of plant species is required for many research and management agendas that track ecosystem health. Because of this, there is continuous development of research focused on remotely-sensed species classifications for many diverse ecosystems. While plant species have been mapped using airborne imaging spectroscopy, the geographic extent has been limited due to data availability and spectrally similar species continue to be difficult to separate. The proposed Hyperspectral Infrared Imager (HyspIRI) space-borne mission, which includes a visible near infrared/shortwave infrared (VSWIR) imaging spectrometer and thermal infrared (TIR) multi-spectral imager, would present an opportunity to improve species discrimination over a much broader scale. Here we evaluate: 1) the capability of VSWIR and/or TIR spectra to discriminate plant species; 2) the accuracy of species classifications within an ecosystem; and 3) the potential for discriminating among species across a range of ecosystems. Simulated HyspIRI imagery was acquired in spring/summer of 2013 spanning from Santa Barbara to Bakersfield, CA with the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the MODIS/ASTER Airborne Simulator (MASTER) instruments. Three spectral libraries were created from these images: AVIRIS (224 bands from 0.4 - 2.5 µm), MASTER (8 bands from 7.5 - 12 µm), and AVIRIS + MASTER. We used canonical discriminant analysis (CDA) as a dimension reduction technique and then classified plant species using linear discriminant analysis (LDA). Our results show the inclusion of TIR spectra improved species discrimination, but only for plant species with emissivities departing from that of a gray body. Ecosystems with species that have high spectral contrast had higher classification accuracies. Mapping plant species across all ecosystems resulted in a classification with lower accuracies than a single ecosystem due to the complex nature of incorporating more plant species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; ...
2017-04-03
Here, the feasibility of visible and near infrared (NIR) spectroscopy as tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. fIoridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using line discriminated analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost same calibration and validationmore » results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. fIoridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. fIoridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.« less
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang
Here, the feasibility of visible and near infrared (NIR) spectroscopy as tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. fIoridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using line discriminated analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost same calibration and validationmore » results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. fIoridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. fIoridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.« less
Application of visible and near-infrared spectroscopy to classification of Miscanthus species.
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. fIoridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using line discriminated analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. fIoridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. fIoridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J.; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. fIoridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using line discriminated analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. fIoridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. fIoridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species. PMID:28369059
NASA Technical Reports Server (NTRS)
Rignot, Eric; Williams, Cynthia; Way, Jobea; Viereck, Leslie
1993-01-01
A maximum a posteriori Bayesian classifier for multifrequency polarimetric SAR data is used to perform a supervised classification of forest types in the floodplains of Alaska. The image classes include white spruce, balsam poplar, black spruce, alder, non-forests, and open water. The authors investigate the effect on classification accuracy of changing environmental conditions, and of frequency and polarization of the signal. The highest classification accuracy (86 percent correctly classified forest pixels, and 91 percent overall) is obtained combining L- and C-band frequencies fully polarimetric on a date where the forest is just recovering from flooding. The forest map compares favorably with a vegetation map assembled from digitized aerial photos which took five years for completion, and address the state of the forest in 1978, ignoring subsequent fires, changes in the course of the river, clear-cutting of trees, and tree growth. HV-polarization is the most useful polarization at L- and C-band for classification. C-band VV (ERS-1 mode) and L-band HH (J-ERS-1 mode) alone or combined yield unsatisfactory classification accuracies. Additional data acquired in the winter season during thawed and frozen days yield classification accuracies respectively 20 percent and 30 percent lower due to a greater confusion between conifers and deciduous trees. Data acquired at the peak of flooding in May 1991 also yield classification accuracies 10 percent lower because of dominant trunk-ground interactions which mask out finer differences in radar backscatter between tree species. Combination of several of these dates does not improve classification accuracy. For comparison, panchromatic optical data acquired by SPOT in the summer season of 1991 are used to classify the same area. The classification accuracy (78 percent for the forest types and 90 percent if open water is included) is lower than that obtained with AIRSAR although conifers and deciduous trees are better separated due to the presence of leaves on the deciduous trees. Optical data do not separate black spruce and white spruce as well as SAR data, cannot separate alder from balsam poplar, and are of course limited by the frequent cloud cover in the polar regions. Yet, combining SPOT and AIRSAR offers better chances to identify vegetation types independent of ground truth information using a combination of NDVI indexes from SPOT, biomass numbers from AIRSAR, and a segmentation map from either one.
NASA Astrophysics Data System (ADS)
Karakacan Kuzucu, A.; Bektas Balcik, F.
2017-11-01
Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. But this information still remains a challenge especially in heterogeneous landscapes covering urban and rural areas due to spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of spectral vegetation indices combination with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO), the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and support vector machine (SVM) algorithm were applied to classify SPOT 7 image. Catalca is selected region located in the north west of the Istanbul in Turkey, which has complex landscape covering artificial surface, forest and natural area, agricultural field, quarry/mining area, pasture/scrubland and water body. Accuracy assessment of all classified images was performed through overall accuracy and kappa coefficient. The results indicated that the incorporation of these three different vegetation indices decrease the classification accuracy for the MLC and SVM classification. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification approach in both overall accuracy and kappa statistics.
The Effect of Normalization in Violence Video Classification Performance
NASA Astrophysics Data System (ADS)
Ali, Ashikin; Senan, Norhalina
2017-08-01
Basically, data pre-processing is an important part of data mining. Normalization is a pre-processing stage for any type of problem statement, especially in video classification. Challenging problems that arises in video classification is because of the heterogeneous content, large variations in video quality and complex semantic meanings of the concepts involved. Therefore, to regularize this problem, it is thoughtful to ensure normalization or basically involvement of thorough pre-processing stage aids the robustness of classification performance. This process is to scale all the numeric variables into certain range to make it more meaningful for further phases in available data mining techniques. Thus, this paper attempts to examine the effect of 2 normalization techniques namely Min-max normalization and Z-score in violence video classifications towards the performance of classification rate using Multi-layer perceptron (MLP) classifier. Using Min-Max Normalization range of [0,1] the result shows almost 98% of accuracy, meanwhile Min-Max Normalization range of [-1,1] accuracy is 59% and for Z-score the accuracy is 50%.
NASA Astrophysics Data System (ADS)
Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude
2010-02-01
Diabetic retinopathy (DR) is a chronic eye disease for which early detection is highly essential to avoid any fatal results. Image processing of retinal images emerge as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification which is a significant technique to detect the abnormality in the eye. Various automated classification systems have been developed in the recent years but most of them lack high classification accuracy. Artificial neural networks are the widely preferred artificial intelligence technique since it yields superior results in terms of classification accuracy. In this work, Radial Basis function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR Images and normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed with the results of the probabilistic classifier namely Bayesian classifier to show the superior nature of neural classifier. Experimental results show promising results for the neural classifier in terms of the performance measures.
Multi-phenology WorldView-2 imagery improves remote sensing of savannah tree species
NASA Astrophysics Data System (ADS)
Madonsela, Sabelo; Cho, Moses Azong; Mathieu, Renaud; Mutanga, Onisimo; Ramoelo, Abel; Kaszta, Żaneta; Kerchove, Ruben Van De; Wolff, Eléonore
2017-06-01
Biodiversity mapping in African savannah is important for monitoring changes and ensuring sustainable use of ecosystem resources. Biodiversity mapping can benefit from multi-spectral instruments such as WorldView-2 with very high spatial resolution and a spectral configuration encompassing important spectral regions not previously available for vegetation mapping. This study investigated i) the benefits of the eight-band WorldView-2 (WV-2) spectral configuration for discriminating tree species in Southern African savannah and ii) if multiple-images acquired at key points of the typical phenological development of savannahs (peak productivity, transition to senescence) improve on tree species classifications. We first assessed the discriminatory power of WV-2 bands using interspecies-Spectral Angle Mapper (SAM) via Band Add-On procedure and tested the spectral capability of WorldView-2 against simulated IKONOS for tree species classification. The results from interspecies-SAM procedure identified the yellow and red bands as the most statistically significant bands (p = 0.000251 and p = 0.000039 respectively) in the discriminatory power of WV-2 during the transition from wet to dry season (April). Using Random Forest classifier, the classification scenarios investigated showed that i) the 8-bands of the WV-2 sensor achieved higher classification accuracy for the April date (transition from wet to dry season, senescence) compared to the March date (peak productivity season) ii) the WV-2 spectral configuration systematically outperformed the IKONOS sensor spectral configuration and iii) the multi-temporal approach (March and April combined) improved the discrimination of tress species and produced the highest overall accuracy results at 80.4%. Consistent with the interspecies-SAM procedure, the yellow (605 nm) band also showed a statistically significant contribution in the improved classification accuracy from WV-2. These results highlight the mapping opportunities presented by WV-2 data for monitoring the distribution status of e.g. species often harvested by local communities (e.g. Sclerocharya birrea), encroaching species, or species-specific tree losses induced by elephants.
Explaining Match Outcome During The Men’s Basketball Tournament at The Olympic Games
Leicht, Anthony S.; Gómez, Miguel A.; Woods, Carl T.
2017-01-01
In preparation for the Olympics, there is a limited opportunity for coaches and athletes to interact regularly with team performance indicators providing important guidance to coaches for enhanced match success at the elite level. This study examined the relationship between match outcome and team performance indicators during men’s basketball tournaments at the Olympic Games. Twelve team performance indicators were collated from all men’s teams and matches during the basketball tournament of the 2004-2016 Olympic Games (n = 156). Linear and non-linear analyses examined the relationship between match outcome and team performance indicator characteristics; namely, binary logistic regression and a conditional interference (CI) classification tree. The most parsimonious logistic regression model retained ‘assists’, ‘defensive rebounds’, ‘field-goal percentage’, ‘fouls’, ‘fouls against’, ‘steals’ and ‘turnovers’ (delta AIC <0.01; Akaike weight = 0.28) with a classification accuracy of 85.5%. Conversely, four performance indicators were retained with the CI classification tree with an average classification accuracy of 81.4%. However, it was the combination of ‘field-goal percentage’ and ‘defensive rebounds’ that provided the greatest probability of winning (93.2%). Match outcome during the men’s basketball tournaments at the Olympic Games was identified by a unique combination of performance indicators. Despite the average model accuracy being marginally higher for the logistic regression analysis, the CI classification tree offered a greater practical utility for coaches through its resolution of non-linear phenomena to guide team success. Key points A unique combination of team performance indicators explained 93.2% of winning observations in men’s basketball at the Olympics. Monitoring of these team performance indicators may provide coaches with the capability to devise multiple game plans or strategies to enhance their likelihood of winning. Incorporation of machine learning techniques with team performance indicators may provide a valuable and strategic approach to explain patterns within multivariate datasets in sport science. PMID:29238245
Building confidence and credibility into CAD with belief decision trees
NASA Astrophysics Data System (ADS)
Affenit, Rachael N.; Barns, Erik R.; Furst, Jacob D.; Rasin, Alexander; Raicu, Daniela S.
2017-03-01
Creating classifiers for computer-aided diagnosis in the absence of ground truth is a challenging problem. Using experts' opinions as reference truth is difficult because the variability in the experts' interpretations introduces uncertainty in the labeled diagnostic data. This uncertainty translates into noise, which can significantly affect the performance of any classifier on test data. To address this problem, we propose a new label set weighting approach to combine the experts' interpretations and their variability, as well as a selective iterative classification (SIC) approach that is based on conformal prediction. Using the NIH/NCI Lung Image Database Consortium (LIDC) dataset in which four radiologists interpreted the lung nodule characteristics, including the degree of malignancy, we illustrate the benefits of the proposed approach. Our results show that the proposed 2-label-weighted approach significantly outperforms the accuracy of the original 5- label and 2-label-unweighted classification approaches by 39.9% and 7.6%, respectively. We also found that the weighted 2-label models produce higher skewness values by 1.05 and 0.61 for non-SIC and SIC respectively on root mean square error (RMSE) distributions. When each approach was combined with selective iterative classification, this further improved the accuracy of classification for the 2-weighted-label by 7.5% over the original, and improved the skewness of the 5-label and 2-unweighted-label by 0.22 and 0.44 respectively.
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang
2016-08-01
Motion classification system based on surface Electromyography (sEMG) pattern recognition has achieved good results in experimental condition. But it is still a challenge for clinical implement and practical application. Many factors contribute to the difficulty of clinical use of the EMG based dexterous control. The most obvious and important is the noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of signal and biological signals such as Electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction of classification accuracy caused by electrode shift. The average classification accuracy of our method were above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and discovered the strong correlation with correlation coefficient of >0.9 between shift position data and normal position data.
Rifai Chai; Naik, Ganesh R; Tran, Yvonne; Sai Ho Ling; Craig, Ashley; Nguyen, Hung T
2015-08-01
An electroencephalography (EEG)-based counter measure device could be used for fatigue detection during driving. This paper explores the classification of fatigue and alert states using power spectral density (PSD) as a feature extractor and fuzzy swarm based-artificial neural network (ANN) as a classifier. An independent component analysis of entropy rate bound minimization (ICA-ERBM) is investigated as a novel source separation technique for fatigue classification using EEG analysis. A comparison of the classification accuracy of source separator versus no source separator is presented. Classification performance based on 43 participants without the inclusion of the source separator resulted in an overall sensitivity of 71.67%, a specificity of 75.63% and an accuracy of 73.65%. However, these results were improved after the inclusion of a source separator module, resulting in an overall sensitivity of 78.16%, a specificity of 79.60% and an accuracy of 78.88% (p <; 0.05).
NASA Astrophysics Data System (ADS)
Hall-Brown, Mary
The heterogeneity of Arctic vegetation can make land cover classification vey difficult when using medium to small resolution imagery (Schneider et al., 2009; Muller et al., 1999). Using high radiometric and spatial resolution imagery, such as the SPOT 5 and IKONOS satellites, have helped arctic land cover classification accuracies rise into the 80 and 90 percentiles (Allard, 2003; Stine et al., 2010; Muller et al., 1999). However, those increases usually come at a high price. High resolution imagery is very expensive and can often add tens of thousands of dollars onto the cost of the research. The EO-1 satellite launched in 2002 carries two sensors that have high specral and/or high spatial resolutions and can be an acceptable compromise between the resolution versus cost issues. The Hyperion is a hyperspectral sensor with the capability of collecting 242 spectral bands of information. The Advanced Land Imager (ALI) is an advanced multispectral sensor whose spatial resolution can be sharpened to 10 meters. This dissertation compares the accuracies of arctic land cover classifications produced by the Hyperion and ALI sensors to the classification accuracies produced by the Systeme Pour l' Observation de le Terre (SPOT), the Landsat Thematic Mapper (TM) and the Landsat Enhanced Thematic Mapper Plus (ETM+) sensors. Hyperion and ALI images from August 2004 were collected over the Upper Kuparuk River Basin, Alaska. Image processing included the stepwise discriminant analysis of pixels that were positively classified from coinciding ground control points, geometric and radiometric correction, and principle component analysis. Finally, stratified random sampling was used to perform accuracy assessments on satellite derived land cover classifications. Accuracy was estimated from an error matrix (confusion matrix) that provided the overall, producer's and user's accuracies. This research found that while the Hyperion sensor produced classfication accuracies that were equivalent to the TM and ETM+ sensor (approximately 78%), the Hyperion could not obtain the accuracy of the SPOT 5 HRV sensor. However, the land cover classifications derived from the ALI sensor exceeded most classification accuracies derived from the TM and ETM+ senors and were even comparable to most SPOT 5 HRV classifications (87%). With the deactivation of the Landsat series satellites, the monitoring of remote locations such as in the Arctic on an uninterupted basis thoughout the world is in jeopardy. The utilization of the Hyperion and ALI sensors are a way to keep that endeavor operational. By keeping the ALI sensor active at all times, uninterupted observation of the entire Earth can be accomplished. Keeping the Hyperion sensor as a "tasked" sensor can provide scientists with additional imagery and options for their studies without overburdening storage issues.
Evaluation of space SAR as a land-cover classification
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Williams, T. H. L.
1985-01-01
The multidimensional approach to the mapping of land cover, crops, and forests is reported. Dimensionality is achieved by using data from sensors such as LANDSAT to augment Seasat and Shuttle Image Radar (SIR) data, using different image features such as tone and texture, and acquiring multidate data. Seasat, Shuttle Imaging Radar (SIR-A), and LANDSAT data are used both individually and in combination to map land cover in Oklahoma. The results indicates that radar is the best single sensor (72% accuracy) and produces the best sensor combination (97.5% accuracy) for discriminating among five land cover categories. Multidate Seasat data and a single data of LANDSAT coverage are then used in a crop classification study of western Kansas. The highest accuracy for a single channel is achieved using a Seasat scene, which produces a classification accuracy of 67%. Classification accuracy increases to approximately 75% when either a multidate Seasat combination or LANDSAT data in a multisensor combination is used. The tonal and textural elements of SIR-A data are then used both alone and in combination to classify forests into five categories.
Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens Hyper spectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such Hyper spectral data provides a challenge to the current technique for analyzing data. Conventional classification methods may not be useful without dimension reduction pre-processing. So dimension reduction has become a significant part of Hyper spectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using Wavelet Decomposition could be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture the polynomial trends while Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets are compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction has more separate classes and yields better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, it is found that the performance of classification of Daubechies wavelets is better as compared to Haar wavelet while Daubechies takes more time compare to Haar wavelet. The experimental results demonstrate the classification system consistently provides over 84% classification accuracy.
ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.
Rosenfield, George H.; Fitzpatrick-Lins, Katherine
1984-01-01
Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts corrected, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0. 51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
Data Field Modeling and Spectral-Spatial Feature Fusion for Hyperspectral Data Classification.
Liu, Da; Li, Jianxun
2016-12-16
Classification is a significant subject in hyperspectral remote sensing image processing. This study proposes a spectral-spatial feature fusion algorithm for the classification of hyperspectral images (HSI). Unlike existing spectral-spatial classification methods, the influences and interactions of the surroundings on each measured pixel were taken into consideration in this paper. Data field theory was employed as the mathematical realization of the field theory concept in physics, and both the spectral and spatial domains of HSI were considered as data fields. Therefore, the inherent dependency of interacting pixels was modeled. Using data field modeling, spatial and spectral features were transformed into a unified radiation form and further fused into a new feature by using a linear model. In contrast to the current spectral-spatial classification methods, which usually simply stack spectral and spatial features together, the proposed method builds the inner connection between the spectral and spatial features, and explores the hidden information that contributed to classification. Therefore, new information is included for classification. The final classification result was obtained using a random forest (RF) classifier. The proposed method was tested with the University of Pavia and Indian Pines, two well-known standard hyperspectral datasets. The experimental results demonstrate that the proposed method has higher classification accuracies than those obtained by the traditional approaches.
Research on Remote Sensing Image Classification Based on Feature Level Fusion
NASA Astrophysics Data System (ADS)
Yuan, L.; Zhu, G.
2018-04-01
Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, in the process of existing classification algorithms, there still exists the phenomenon of misclassification and missing points, which leads to the final classification accuracy is not high. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources, and propose a classification method based on feature level fusion. Compare three kind of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform), and then select the best fused image for the classification experimental. In the classification process, we choose four kinds of image classification algorithms (i.e. Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) to do contrast experiment. We use overall classification precision and Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than other methods. In four kinds of classification algorithms, the fused image has the best applicability to Support Vector Machine classification, the overall classification precision is 94.01 % and the Kappa coefficients is 0.91. The fused image with Sentinel-1A and Landsat8 OLI is not only have more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial to improve the accuracy and stability of remote sensing image classification.
Salimi, Nima; Loh, Kar Hoe; Kaur Dhillon, Sarinder; Chong, Ving Ching
2016-01-01
Background. Fish species may be identified based on their unique otolith shape or contour. Several pattern recognition methods have been proposed to classify fish species through morphological features of the otolith contours. However, there has been no fully-automated species identification model with the accuracy higher than 80%. The purpose of the current study is to develop a fully-automated model, based on the otolith contours, to identify the fish species with the high classification accuracy. Methods. Images of the right sagittal otoliths of 14 fish species from three families namely Sciaenidae, Ariidae, and Engraulidae were used to develop the proposed identification model. Short-time Fourier transform (STFT) was used, for the first time in the area of otolith shape analysis, to extract important features of the otolith contours. Discriminant Analysis (DA), as a classification technique, was used to train and test the model based on the extracted features. Results. Performance of the model was demonstrated using species from three families separately, as well as all species combined. Overall classification accuracy of the model was greater than 90% for all cases. In addition, effects of STFT variables on the performance of the identification model were explored in this study. Conclusions. Short-time Fourier transform could determine important features of the otolith outlines. The fully-automated model proposed in this study (STFT-DA) could predict species of an unknown specimen with acceptable identification accuracy. The model codes can be accessed at http://mybiodiversityontologies.um.edu.my/Otolith/ and https://peerj.com/preprints/1517/. The current model has flexibility to be used for more species and families in future studies.
A Dirichlet-Multinomial Bayes Classifier for Disease Diagnosis with Microbial Compositions.
Gao, Xiang; Lin, Huaiying; Dong, Qunfeng
2017-01-01
Dysbiosis of microbial communities is associated with various human diseases, raising the possibility of using microbial compositions as biomarkers for disease diagnosis. We have developed a Bayes classifier by modeling microbial compositions with Dirichlet-multinomial distributions, which are widely used to model multicategorical count data with extra variation. The parameters of the Dirichlet-multinomial distributions are estimated from training microbiome data sets based on maximum likelihood. The posterior probability of a microbiome sample belonging to a disease or healthy category is calculated based on Bayes' theorem, using the likelihood values computed from the estimated Dirichlet-multinomial distribution, as well as a prior probability estimated from the training microbiome data set or previously published information on disease prevalence. When tested on real-world microbiome data sets, our method, called DMBC (for Dirichlet-multinomial Bayes classifier), shows better classification accuracy than the only existing Bayesian microbiome classifier based on a Dirichlet-multinomial mixture model and the popular random forest method. The advantage of DMBC is its built-in automatic feature selection, capable of identifying a subset of microbial taxa with the best classification accuracy between different classes of samples based on cross-validation. This unique ability enables DMBC to maintain and even improve its accuracy at modeling species-level taxa. The R package for DMBC is freely available at https://github.com/qunfengdong/DMBC. IMPORTANCE By incorporating prior information on disease prevalence, Bayes classifiers have the potential to estimate disease probability better than other common machine-learning methods. Thus, it is important to develop Bayes classifiers specifically tailored for microbiome data. Our method shows higher classification accuracy than the only existing Bayesian classifier and the popular random forest method, and thus provides an alternative option for using microbial compositions for disease diagnosis.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan
2014-01-01
Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in olden days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip which stimulates the progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases which serve as a boon to the medical community. This work further reduces the misclassification of cancers which is highly not allowed in cancer detection.
A Dirichlet process model for classifying and forecasting epidemic curves.
Nsoesie, Elaine O; Leman, Scotland C; Marathe, Madhav V
2014-01-09
A forecast can be defined as an endeavor to quantitatively estimate a future event or probabilities assigned to a future occurrence. Forecasting stochastic processes such as epidemics is challenging since there are several biological, behavioral, and environmental factors that influence the number of cases observed at each point during an epidemic. However, accurate forecasts of epidemics would impact timely and effective implementation of public health interventions. In this study, we introduce a Dirichlet process (DP) model for classifying and forecasting influenza epidemic curves. The DP model is a nonparametric Bayesian approach that enables the matching of current influenza activity to simulated and historical patterns, identifies epidemic curves different from those observed in the past and enables prediction of the expected epidemic peak time. The method was validated using simulated influenza epidemics from an individual-based model and the accuracy was compared to that of the tree-based classification technique, Random Forest (RF), which has been shown to achieve high accuracy in the early prediction of epidemic curves using a classification approach. We also applied the method to forecasting influenza outbreaks in the United States from 1997-2013 using influenza-like illness (ILI) data from the Centers for Disease Control and Prevention (CDC). We made the following observations. First, the DP model performed as well as RF in identifying several of the simulated epidemics. Second, the DP model correctly forecasted the peak time several days in advance for most of the simulated epidemics. Third, the accuracy of identifying epidemics different from those already observed improved with additional data, as expected. Fourth, both methods correctly classified epidemics with higher reproduction numbers (R) with a higher accuracy compared to epidemics with lower R values. Lastly, in the classification of seasonal influenza epidemics based on ILI data from the CDC, the methods' performance was comparable. Although RF requires less computational time compared to the DP model, the algorithm is fully supervised implying that epidemic curves different from those previously observed will always be misclassified. In contrast, the DP model can be unsupervised, semi-supervised or fully supervised. Since both methods have their relative merits, an approach that uses both RF and the DP model could be beneficial.
Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp
2015-01-01
Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
2013-01-01
Background and purpose Guidelines for fracture treatment and evaluation require a valid classification. Classifications especially designed for children are available, but they might lead to reduced accuracy, considering the relative infrequency of childhood fractures in a general orthopedic department. We tested the reliability and accuracy of the Müller classification when used for long bone fractures in children. Methods We included all long bone fractures in children aged < 16 years who were treated in 2008 at the surgical ward of Stavanger University Hospital. 20 surgeons recorded 232 fractures. Datasets were generated for intra- and inter-rater analysis, as well as a reference dataset for accuracy calculations. We present proportion of agreement (PA) and kappa (K) statistics. Results For intra-rater analysis, overall agreement (κ) was 0.75 (95% CI: 0.68–0.81) and PA was 79%. For inter-rater assessment, K was 0.71 (95% CI: 0.61–0.80) and PA was 77%. Accuracy was estimated: κ = 0.72 (95% CI: 0.64–0.79) and PA = 76%. Interpretation The Müller classification (slightly adjusted for pediatric fractures) showed substantial to excellent accuracy among general orthopedic surgeons when applied to long bone fractures in children. However, separate knowledge about the child-specific fracture pattern, the maturity of the bone, and the degree of displacement must be considered when the treatment and the prognosis of the fractures are evaluated. PMID:23245225
An efficient scheme for automatic web pages categorization using the support vector machine
NASA Astrophysics Data System (ADS)
Bhalla, Vinod Kumar; Kumar, Neeraj
2016-07-01
In the past few years, with an evolution of the Internet and related technologies, the number of the Internet users grows exponentially. These users demand access to relevant web pages from the Internet within fraction of seconds. To achieve this goal, there is a requirement of an efficient categorization of web page contents. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic. Using these techniques, higher level of accuracy cannot be achieved. To achieve these goals, this paper proposes an automatic web pages categorization into the domain category. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, first extraction and evaluation of features are done followed by filtering the feature set for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on the collection of domain-specific keyword list developed by considering various domain pages. Moreover, the keyword list is reduced on the basis of ids of keywords in keyword list. Also, stemming of keywords and tag text is done to achieve a higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis using support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.
Articular cartilage degeneration classification by means of high-frequency ultrasound.
Männicke, N; Schöne, M; Oelze, M; Raum, K
2014-10-01
To date only single ultrasound parameters were regarded in statistical analyses to characterize osteoarthritic changes in articular cartilage and the potential benefit of using parameter combinations for characterization remains unclear. Therefore, the aim of this work was to utilize feature selection and classification of a Mankin subset score (i.e., cartilage surface and cell sub-scores) using ultrasound-based parameter pairs and investigate both classification accuracy and the sensitivity towards different degeneration stages. 40 punch biopsies of human cartilage were previously scanned ex vivo with a 40-MHz transducer. Ultrasound-based surface parameters, as well as backscatter and envelope statistics parameters were available. Logistic regression was performed with each unique US parameter pair as predictor and different degeneration stages as response variables. The best ultrasound-based parameter pair for each Mankin subset score value was assessed by highest classification accuracy and utilized in receiver operating characteristics (ROC) analysis. The classifications discriminating between early degenerations yielded area under the ROC curve (AUC) values of 0.94-0.99 (mean ± SD: 0.97 ± 0.03). In contrast, classifications among higher Mankin subset scores resulted in lower AUC values: 0.75-0.91 (mean ± SD: 0.84 ± 0.08). Variable sensitivities of the different ultrasound features were observed with respect to different degeneration stages. Our results strongly suggest that combinations of high-frequency ultrasound-based parameters exhibit potential to characterize different, particularly very early, degeneration stages of hyaline cartilage. Variable sensitivities towards different degeneration stages suggest that a concurrent estimation of multiple ultrasound-based parameters is diagnostically valuable. In-vivo application of the present findings is conceivable in both minimally invasive arthroscopic ultrasound and high-frequency transcutaneous ultrasound. Copyright © 2014 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
Yao, Dongren; Calhoun, Vince D; Fu, Zening; Du, Yuhui; Sui, Jing
2018-05-15
Discriminating Alzheimer's disease (AD) from its prodromal form, mild cognitive impairment (MCI), is a significant clinical problem that may facilitate early diagnosis and intervention, in which a more challenging issue is to classify MCI subtypes, i.e., those who eventually convert to AD (cMCI) versus those who do not (MCI). To solve this difficult 4-way classification problem (AD, MCI, cMCI and healthy controls), a competition was hosted by Kaggle to invite the scientific community to apply their machine learning approaches on pre-processed sets of T1-weighted magnetic resonance images (MRI) data and the demographic information from the international Alzheimer's disease neuroimaging initiative (ADNI) database. This paper summarizes our competition results. We first proposed a hierarchical process by turning the 4-way classification into five binary classification problems. A new feature selection technology based on relative importance was also proposed, aiming to identify a more informative and concise subset from 426 sMRI morphometric and 3 demographic features, to ensure each binary classifier to achieve its highest accuracy. As a result, about 2% of the original features were selected to build a new feature space, which can achieve the final four-way classification with a 54.38% accuracy on testing data through hierarchical grouping, higher than several alternative methods in comparison. More importantly, the selected discriminative features such as hippocampal volume, parahippocampal surface area, and medial orbitofrontal thickness, etc. as well as the MMSE score, are reasonable and consistent with those reported in AD/MCI deficits. In summary, the proposed method provides a new framework for multi-way classification using hierarchical grouping and precise feature selection. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Modiri, M.; Salehabadi, A.; Mohebbi, M.; Hashemi, A. M.; Masumi, M.
2015-12-01
The use of UAV in the application of photogrammetry to obtain cover images and achieve the main objectives of the photogrammetric mapping has been a boom in the region. The images taken from REGGIOLO region in the province of, Italy Reggio -Emilia by UAV with non-metric camera Canon Ixus and with an average height of 139.42 meters were used to classify urban feature. Using the software provided SURE and cover images of the study area, to produce dense point cloud, DSM and Artvqvtv spatial resolution of 10 cm was prepared. DTM area using Adaptive TIN filtering algorithm was developed. NDSM area was prepared with using the difference between DSM and DTM and a separate features in the image stack. In order to extract features, using simultaneous occurrence matrix features mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation for each of the RGB band image was used Orthophoto area. Classes used to classify urban problems, including buildings, trees and tall vegetation, grass and vegetation short, paved road and is impervious surfaces. Class consists of impervious surfaces such as pavement conditions, the cement, the car, the roof is stored. In order to pixel-based classification and selection of optimal features of classification was GASVM pixel basis. In order to achieve the classification results with higher accuracy and spectral composition informations, texture, and shape conceptual image featureOrthophoto area was fencing. The segmentation of multi-scale segmentation method was used.it belonged class. Search results using the proposed classification of urban feature, suggests the suitability of this method of classification complications UAV is a city using images. The overall accuracy and kappa coefficient method proposed in this study, respectively, 47/93% and 84/91% was.
Comprehensive decision tree models in bioinformatics.
Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter
2012-01-01
Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics.
Comprehensive Decision Tree Models in Bioinformatics
Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter
2012-01-01
Purpose Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics. PMID:22479449
Gao, Jianyong; Tian, Gang; Han, Xu; Zhu, Qiang
2018-01-01
Oral squamous cell carcinoma (OSCC) is the sixth most common type cancer worldwide, with poor prognosis. The present study aimed to identify gene signatures that could classify OSCC and predict prognosis in different stages. A training data set (GSE41613) and two validation data sets (GSE42743 and GSE26549) were acquired from the online Gene Expression Omnibus database. In the training data set, patients were classified based on the tumor-node-metastasis staging system, and subsequently grouped into low stage (L) or high stage (H). Signature genes between L and H stages were selected by disparity index analysis, and classification was performed by the expression of these signature genes. The established classification was compared with the L and H classification, and fivefold cross validation was used to evaluate the stability. Enrichment analysis for the signature genes was implemented by the Database for Annotation, Visualization and Integration Discovery. Two validation data sets were used to determine the precise of classification. Survival analysis was conducted followed each classification using the package ‘survival’ in R software. A set of 24 signature genes was identified based on the classification model with the Fi value of 0.47, which was used to distinguish OSCC samples in two different stages. Overall survival of patients in the H stage was higher than those in the L stage. Signature genes were primarily enriched in ‘ether lipid metabolism’ pathway and biological processes such as ‘positive regulation of adaptive immune response’ and ‘apoptotic cell clearance’. The results provided a novel 24-gene set that may be used as biomarkers to predict OSCC prognosis with high accuracy, which may be used to determine an appropriate treatment program for patients with OSCC in addition to the traditional evaluation index. PMID:29257303
Liu, Yu; Xia, Jun; Shi, Chun-Xiang; Hong, Yang
2009-01-01
The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China’s first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3–11.3 μm; IR2, 11.5–12.5 μm and WV 6.3–7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN network identified from this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products. PMID:22346714
Liu, Yu; Xia, Jun; Shi, Chun-Xiang; Hong, Yang
2009-01-01
The crowning objective of this research was to identify a better cloud classification method to upgrade the current window-based clustering algorithm used operationally for China's first operational geostationary meteorological satellite FengYun-2C (FY-2C) data. First, the capabilities of six widely-used Artificial Neural Network (ANN) methods are analyzed, together with the comparison of two other methods: Principal Component Analysis (PCA) and a Support Vector Machine (SVM), using 2864 cloud samples manually collected by meteorologists in June, July, and August in 2007 from three FY-2C channel (IR1, 10.3-11.3 μm; IR2, 11.5-12.5 μm and WV 6.3-7.6 μm) imagery. The result shows that: (1) ANN approaches, in general, outperformed the PCA and the SVM given sufficient training samples and (2) among the six ANN networks, higher cloud classification accuracy was obtained with the Self-Organizing Map (SOM) and Probabilistic Neural Network (PNN). Second, to compare the ANN methods to the present FY-2C operational algorithm, this study implemented SOM, one of the best ANN network identified from this study, as an automated cloud classification system for the FY-2C multi-channel data. It shows that SOM method has improved the results greatly not only in pixel-level accuracy but also in cloud patch-level classification by more accurately identifying cloud types such as cumulonimbus, cirrus and clouds in high latitude. Findings of this study suggest that the ANN-based classifiers, in particular the SOM, can be potentially used as an improved Automated Cloud Classification Algorithm to upgrade the current window-based clustering method for the FY-2C operational products.
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.
Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
Comparing ecoregional classifications for natural areas management in the Klamath Region, USA
Sarr, Daniel A.; Duff, Andrew; Dinger, Eric C.; Shafer, Sarah L.; Wing, Michael; Seavy, Nathaniel E.; Alexander, John D.
2015-01-01
We compared three existing ecoregional classification schemes (Bailey, Omernik, and World Wildlife Fund) with two derived schemes (Omernik Revised and Climate Zones) to explore their effectiveness in explaining species distributions and to better understand natural resource geography in the Klamath Region, USA. We analyzed presence/absence data derived from digital distribution maps for trees, amphibians, large mammals, small mammals, migrant birds, and resident birds using three statistical analyses of classification accuracy (Analysis of Similarity, Canonical Analysis of Principal Coordinates, and Classification Strength). The classifications were roughly comparable in classification accuracy, with Omernik Revised showing the best overall performance. Trees showed the strongest fidelity to the classifications, and large mammals showed the weakest fidelity. We discuss the implications for regional biogeography and describe how intermediate resolution ecoregional classifications may be appropriate for use as natural areas management domains.
3D micro-mapping: Towards assessing the quality of crowdsourcing to support 3D point cloud analysis
NASA Astrophysics Data System (ADS)
Herfort, Benjamin; Höfle, Bernhard; Klonner, Carolin
2018-03-01
In this paper, we propose a method to crowdsource the task of complex three-dimensional information extraction from 3D point clouds. We design web-based 3D micro tasks tailored to assess segmented LiDAR point clouds of urban trees and investigate the quality of the approach in an empirical user study. Our results for three different experiments with increasing complexity indicate that a single crowdsourcing task can be solved in a very short time of less than five seconds on average. Furthermore, the results of our empirical case study reveal that the accuracy, sensitivity and precision of 3D crowdsourcing are high for most information extraction problems. For our first experiment (binary classification with single answer) we obtain an accuracy of 91%, a sensitivity of 95% and a precision of 92%. For the more complex tasks of the second Experiment 2 (multiple answer classification) the accuracy ranges from 65% to 99% depending on the label class. Regarding the third experiment - the determination of the crown base height of individual trees - our study highlights that crowdsourcing can be a tool to obtain values with even higher accuracy in comparison to an automated computer-based approach. Finally, we found out that the accuracy of the crowdsourced results for all experiments is hardly influenced by characteristics of the input point cloud data and of the users. Importantly, the results' accuracy can be estimated using agreement among volunteers as an intrinsic indicator, which makes a broad application of 3D micro-mapping very promising.
a Gsa-Svm Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paperhybridizesgravitational search algorithm (GSA) with support vector machine (SVM) and made a novel GSA-SVM hybrid system to improve the classification accuracy in binary problems. GSA is an optimization heuristic toolused to optimize the value of SVM kernel parameter (in this paper, radial basis function (RBF) is chosen as the kernel function). The experimental results show that this newapproach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Typicality effects in artificial categories: is there a hemisphere difference?
Richards, L G; Chiarello, C
1990-07-01
In category classification tasks, typicality effects are usually found: accuracy and reaction time depend upon distance from a prototype. In this study, subjects learned either verbal or nonverbal dot pattern categories, followed by a lateralized classification task. Comparable typicality effects were found in both reaction time and accuracy across visual fields for both verbal and nonverbal categories. Both hemispheres appeared to use a similarity-to-prototype matching strategy in classification. This indicates that merely having a verbal label does not differentiate classification in the two hemispheres.
Multi-site evaluation of IKONOS data for classification of tropical coral reef environments
Andrefouet, S.; Kramer, Philip; Torres-Pulliza, D.; Joyce, K.E.; Hochberg, E.J.; Garza-Perez, R.; Mumby, P.J.; Riegl, Bernhard; Yamano, H.; White, W.H.; Zubia, M.; Brock, J.C.; Phinn, S.R.; Naseer, A.; Hatcher, B.G.; Muller-Karger, F. E.
2003-01-01
Ten IKONOS images of different coral reef sites distributed around the world were processed to assess the potential of 4-m resolution multispectral data for coral reef habitat mapping. Complexity of reef environments, established by field observation, ranged from 3 to 15 classes of benthic habitats containing various combinations of sediments, carbonate pavement, seagrass, algae, and corals in different geomorphologic zones (forereef, lagoon, patch reef, reef flats). Processing included corrections for sea surface roughness and bathymetry, unsupervised or supervised classification, and accuracy assessment based on ground-truth data. IKONOS classification results were compared with classified Landsat 7 imagery for simple to moderate complexity of reef habitats (5-11 classes). For both sensors, overall accuracies of the classifications show a general linear trend of decreasing accuracy with increasing habitat complexity. The IKONOS sensor performed better, with a 15-20% improvement in accuracy compared to Landsat. For IKONOS, overall accuracy was 77% for 4-5 classes, 71% for 7-8 classes, 65% in 9-11 classes, and 53% for more than 13 classes. The Landsat classification accuracy was systematically lower, with an average of 56% for 5-10 classes. Within this general trend, inter-site comparisons and specificities demonstrate the benefits of different approaches. Pre-segmentation of the different geomorphologic zones and depth correction provided different advantages in different environments. Our results help guide scientists and managers in applying IKONOS-class data for coral reef mapping applications. ?? 2003 Elsevier Inc. All rights reserved.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182
A MUSIC-based method for SSVEP signal processing.
Chen, Kun; Liu, Quan; Ai, Qingsong; Zhou, Zude; Xie, Sheng Quan; Meng, Wei
2016-03-01
The research on brain computer interfaces (BCIs) has become a hotspot in recent years because it offers benefit to disabled people to communicate with the outside world. Steady state visual evoked potential (SSVEP)-based BCIs are more widely used because of higher signal to noise ratio and greater information transfer rate compared with other BCI techniques. In this paper, a multiple signal classification based method was proposed for multi-dimensional SSVEP feature extraction. 2-second data epochs from four electrodes achieved excellent accuracy rates including idle state detection. In some asynchronous mode experiments, the recognition accuracy reached up to 100%. The experimental results showed that the proposed method attained good frequency resolution. In most situations, the recognition accuracy was higher than canonical correlation analysis, which is a typical method for multi-channel SSVEP signal processing. Also, a virtual keyboard was successfully controlled by different subjects in an unshielded environment, which proved the feasibility of the proposed method for multi-dimensional SSVEP signal processing in practical applications.
Cluster analysis as a prediction tool for pregnancy outcomes.
Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L
2015-03-01
Considering specific physiology changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women at early pregnancy can be considered as crucial. The paper demonstrates the use of a method based on an approach from intelligent data mining, cluster analysis. Cluster analysis method is a statistical method which makes possible to group individuals based on sets of identifying variables. The method was chosen in order to determine possibility for classification of pregnant women at early pregnancy to analyze unknown correlations between different variables so that the certain outcomes could be predicted. 222 pregnant women from two general obstetric offices' were recruited. The main orient was set on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis gained a 94.1% classification accuracy rate with three branch- es or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results are showing that pregnant women both of older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering baby of higher birth weight but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have significantly higher probability for complications during pregnancy (gestosis) and higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women at early pregnancy to predict certain outcomes.
Laukka, Petri; Neiberg, Daniel; Elfenbein, Hillary Anger
2014-06-01
The possibility of cultural differences in the fundamental acoustic patterns used to express emotion through the voice is an unanswered question central to the larger debate about the universality versus cultural specificity of emotion. This study used emotionally inflected standard-content speech segments expressing 11 emotions produced by 100 professional actors from 5 English-speaking cultures. Machine learning simulations were employed to classify expressions based on their acoustic features, using conditions where training and testing were conducted on stimuli coming from either the same or different cultures. A wide range of emotions were classified with above-chance accuracy in cross-cultural conditions, suggesting vocal expressions share important characteristics across cultures. However, classification showed an in-group advantage with higher accuracy in within- versus cross-cultural conditions. This finding demonstrates cultural differences in expressive vocal style, and supports the dialect theory of emotions according to which greater recognition of expressions from in-group members results from greater familiarity with culturally specific expressive styles.
Moustafa, Ahmed A; Kéri, Szabolcs; Somlai, Zsuzsanna; Balsdon, Tarryn; Frydecka, Dorota; Misiak, Blazej; White, Corey
2015-09-15
In this study, we tested reward- and punishment learning performance using a probabilistic classification learning task in patients with schizophrenia (n=37) and healthy controls (n=48). We also fit subjects' data using a Drift Diffusion Model (DDM) of simple decisions to investigate which components of the decision process differ between patients and controls. Modeling results show between-group differences in multiple components of the decision process. Specifically, patients had slower motor/encoding time, higher response caution (favoring accuracy over speed), and a deficit in classification learning for punishment, but not reward, trials. The results suggest that patients with schizophrenia adopt a compensatory strategy of favoring accuracy over speed to improve performance, yet still show signs of a deficit in learning based on negative feedback. Our data highlights the importance of applying fitting models (particularly drift diffusion models) to behavioral data. The implications of these findings are discussed relative to theories of schizophrenia and cognitive processing. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Cavigelli, Lukas; Bernath, Dominic; Magno, Michele; Benini, Luca
2016-10-01
Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and this setup excludes preemptive actions to respond to imminent threats. An effective way to overcome these limitations is to build a smart camera that analyzes the data on-site, close to the sensor, and transmits alerts when relevant video sequences are detected. Deep neural networks (DNNs) have come to outperform humans in visual classifications tasks and are also performing exceptionally well on other computer vision tasks. The concept of DNNs and Convolutional Networks (ConvNets) can easily be extended to make use of higher-dimensional input data such as multispectral data. We explore this opportunity in terms of achievable accuracy and required computational effort. To analyze the precision of DNNs for scene labeling in an urban surveillance scenario we have created a dataset with 8 classes obtained in a field experiment. We combine an RGB camera with a 25-channel VIS-NIR snapshot sensor to assess the potential of multispectral image data for target classification. We evaluate several new DNNs, showing that the spectral information fused together with the RGB frames can be used to improve the accuracy of the system or to achieve similar accuracy with a 3x smaller computation effort. We achieve a very high per-pixel accuracy of 99.1%. Even for scarcely occurring, but particularly interesting classes, such as cars, 75% of the pixels are labeled correctly with errors occurring only around the border of the objects. This high accuracy was obtained with a training set of only 30 labeled images, paving the way for fast adaptation to various application scenarios.
Brain-Computer Interface Based on Generation of Visual Images
Bobrov, Pavel; Frolov, Alexander; Cantor, Charles; Fedulova, Irina; Bakhnyan, Mikhail; Zhavoronkov, Alexander
2011-01-01
This paper examines the task of recognizing EEG patterns that correspond to performing three mental tasks: relaxation and imagining of two types of pictures: faces and houses. The experiments were performed using two EEG headsets: BrainProducts ActiCap and Emotiv EPOC. The Emotiv headset becomes widely used in consumer BCI application allowing for conducting large-scale EEG experiments in the future. Since classification accuracy significantly exceeded the level of random classification during the first three days of the experiment with EPOC headset, a control experiment was performed on the fourth day using ActiCap. The control experiment has shown that utilization of high-quality research equipment can enhance classification accuracy (up to 68% in some subjects) and that the accuracy is independent of the presence of EEG artifacts related to blinking and eye movement. This study also shows that computationally-inexpensive Bayesian classifier based on covariance matrix analysis yields similar classification accuracy in this problem as a more sophisticated Multi-class Common Spatial Patterns (MCSP) classifier. PMID:21695206
NASA Technical Reports Server (NTRS)
Mulligan, P. J.; Gervin, J. C.; Lu, Y. C.
1985-01-01
An area bordering the Eastern Shore of the Chesapeake Bay was selected for study and classified using unsupervised techniques applied to LANDSAT-2 MSS data and several band combinations of LANDSAT-4 TM data. The accuracies of these Level I land cover classifications were verified using the Taylor's Island USGS 7.5 minute topographic map which was photointerpreted, digitized and rasterized. The the Taylor's Island map, comparing the MSS and TM three band (2 3 4) classifications, the increased resolution of TM produced a small improvement in overall accuracy of 1% correct due primarily to a small improvement, and 1% and 3%, in areas such as water and woodland. This was expected as the MSS data typically produce high accuracies for categories which cover large contiguous areas. However, in the categories covering smaller areas within the map there was generally an improvement of at least 10%. Classification of the important residential category improved 12%, and wetlands were mapped with 11% greater accuracy.
ICA-Based Imagined Conceptual Words Classification on EEG Signals.
Imani, Ehsan; Pourmohammad, Ali; Bagheri, Mahsa; Mobasheri, Vida
2017-01-01
Independent component analysis (ICA) has been used for detecting and removing the eye artifacts conventionally. However, in this research, it was used not only for detecting the eye artifacts, but also for detecting the brain-produced signals of two conceptual danger and information category words. In this cross-sectional research, electroencephalography (EEG) signals were recorded using Micromed and 19-channel helmet devices in unipolar mode, wherein Cz electrode was selected as the reference electrode. In the first part of this research, the statistical community test case included four men and four women, who were 25-30 years old. In the designed task, three groups of traffic signs were considered, in which two groups referred to the concept of danger, and the third one referred to the concept of information. In the second part, the three volunteers, two men and one woman, who had the best results, were chosen from among eight participants. In the second designed task, direction arrows (up, down, left, and right) were used. For the 2/8 volunteers in the rest times, very high-power alpha waves were observed from the back of the head; however, in the thinking times, they were different. According to this result, alpha waves for changing the task from thinking to rest condition took at least 3 s for the two volunteers, and it was at most 5 s until they went to the absolute rest condition. For the 7/8 volunteers, the danger and information signals were well classified; these differences for the 5/8 volunteers were observed in the right hemisphere, and, for the other three volunteers, the differences were observed in the left hemisphere. For the second task, simulations showed that the best classification accuracies resulted when the time window was 2.5 s. In addition, it also showed that the features of the autoregressive (AR)-15 model coefficients were the best choices for extracting the features. For all the states of neural network except hardlim discriminator function, the classification accuracies were almost the same and not very different. Linear discriminant analysis (LDA) in comparison with the neural network yielded higher classification accuracies. ICA is a suitable algorithm for recognizing of the word's concept and its place in the brain. Achieved results from this experiment were the same compared with the results from other methods such as functional magnetic resonance imaging and methods based on the brain signals (EEG) in the vowel imagination and covert speech. Herein, the highest classification accuracy was obtained by extracting the target signal from the output of the ICA and extracting the features of coefficients AR model with time interval of 2.5 s. Finally, LDA resulted in the highest classification accuracy more than 60%.
Besio, Walter G; Cao, Hongbao; Zhou, Peng
2008-04-01
For persons with severe disabilities, a brain-computer interface (BCI) may be a viable means of communication. Lapalacian electroencephalogram (EEG) has been shown to improve classification in EEG recognition. In this work, the effectiveness of signals from tripolar concentric electrodes and disc electrodes were compared for use as a BCI. Two sets of left/right hand motor imagery EEG signals were acquired. An autoregressive (AR) model was developed for feature extraction with a Mahalanobis distance based linear classifier for classification. An exhaust selection algorithm was employed to analyze three factors before feature extraction. The factors analyzed were 1) length of data in each trial to be used, 2) start position of data, and 3) the order of the AR model. The results showed that tripolar concentric electrodes generated significantly higher classification accuracy than disc electrodes.
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis. PMID:25076868
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F; Joules, Richard; Catani, Marco; Williams, Steve C R; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no "magic bullet" for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis.
NASA Astrophysics Data System (ADS)
Park, M.; Stenstrom, M. K.
2004-12-01
Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did not much improve the overall accuracy (72%) and k coefficient (65%). Including Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not useful as locational ancillary data.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections.
Zhu, Xiangbin; Qiu, Huiling
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved. PMID:27893761
NASA Technical Reports Server (NTRS)
Sadowski, F. E.; Sarno, J. E.
1976-01-01
First, an analysis of forest feature signatures was used to help explain the large variation in classification accuracy that can occur among individual forest features for any one case of spatial resolution and the inconsistent changes in classification accuracy that were demonstrated among features as spatial resolution was degraded. Second, the classification rejection threshold was varied in an effort to reduce the large proportion of unclassified resolution elements that previously appeared in the processing of coarse resolution data when a constant rejection threshold was used for all cases of spatial resolution. For the signature analysis, two-channel ellipse plots showing the feature signature distributions for several cases of spatial resolution indicated that the capability of signatures to correctly identify their respective features is dependent on the amount of statistical overlap among signatures. Reductions in signature variance that occur in data of degraded spatial resolution may not necessarily decrease the amount of statistical overlap among signatures having large variance and small mean separations. Features classified by such signatures may thus continue to have similar amounts of misclassified elements in coarser resolution data, and thus, not necessarily improve in classification accuracy.
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan
2018-01-01
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level. PMID:29346328
Shi, Lei; Wan, Youchuan; Gao, Xianjun
2018-01-01
In object-based image analysis of high-resolution images, the number of features can reach hundreds, so it is necessary to perform feature reduction prior to classification. In this paper, a feature selection method based on the combination of a genetic algorithm (GA) and tabu search (TS) is presented. The proposed GATS method aims to reduce the premature convergence of the GA by the use of TS. A prematurity index is first defined to judge the convergence situation during the search. When premature convergence does take place, an improved mutation operator is executed, in which TS is performed on individuals with higher fitness values. As for the other individuals with lower fitness values, mutation with a higher probability is carried out. Experiments using the proposed GATS feature selection method and three other methods, a standard GA, the multistart TS method, and ReliefF, were conducted on WorldView-2 and QuickBird images. The experimental results showed that the proposed method outperforms the other methods in terms of the final classification accuracy. PMID:29581721
Improving mental task classification by adding high frequency band information.
Zhang, Li; He, Wei; He, Chuanhong; Wang, Ping
2010-02-01
Features extracted from delta, theta, alpha, beta and gamma bands spanning low frequency range are commonly used to classify scalp-recorded electroencephalogram (EEG) for designing brain-computer interface (BCI) and higher frequencies are often neglected as noise. In this paper, we implemented an experimental validation to demonstrate that high frequency components could provide helpful information for improving the performance of the mental task based BCI. Electromyography (EMG) and electrooculography (EOG) artifacts were removed by using blind source separation (BSS) techniques. Frequency band powers and asymmetry ratios from the high frequency band (40-100 Hz) together with those from the lower frequency bands were used to represent EEG features. Finally, Fisher discriminant analysis (FDA) combining with Mahalanobis distance were used as the classifier. In this study, four types of classifications were performed using EEG signals recorded from four subjects during five mental tasks. We obtained significantly higher classification accuracy by adding the high frequency band features compared to using the low frequency bands alone, which demonstrated that the information in high frequency components from scalp-recorded EEG is valuable for the mental task based BCI.
Audio-guided audiovisual data segmentation, indexing, and retrieval
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1998-12-01
While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time- frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
The impact of OCR accuracy on automated cancer classification of pathology reports.
Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle
2012-01-01
To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
NASA Astrophysics Data System (ADS)
Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen
2017-03-01
In this paper, we investigate the effect of the error criteria used during a training phase of the artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected with Interstitial Lung Diseases (ILD). Mean square error (MSE) and the cross-entropy (CE) criteria are chosen being most popular choice in state-of-the-art implementations. The classification experiment performed on the six interstitial lung disease (ILD) patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy from MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filter. Two different neural networks are trained with the scaled conjugate gradient back propagation algorithm with MSE and CE error criteria function respectively for weight updation. Performance is evaluated in terms of average accuracy of these classifiers using 4 fold cross-validation. Each network is trained for five times for each fold with randomly initialized weight vectors and accuracies are computed. Significant improvement in classification accuracy is observed when ANN is trained by using CE (67.27%) as error function compared to MSE (63.60%). Moreover, standard deviation of the classification accuracy for the network trained with CE (6.69) error criteria is found less as compared to network trained with MSE (10.32) criteria.
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. ?? 2006 Elsevier B.V. All rights reserved.
A new approach to enhance the performance of decision tree for classifying gene expression data.
Hassan, Md; Kotagiri, Ramamohanarao
2013-12-20
Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. Decision tree is one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might suffer from poor performance specially when classifying gene expression dataset. By using a new decision tree algorithm where, each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in literature. We experimentally compare the effect of our scheme against other well known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. Conclusions/Significance The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice.
An Extended Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Astrophysics Data System (ADS)
Akbari, D.
2017-11-01
In this paper an extended classification approach for hyperspectral imagery based on both spectral and spatial information is proposed. The spatial information is obtained by an enhanced marker-based minimum spanning forest (MSF) algorithm. Three different methods of dimension reduction are first used to obtain the subspace of hyperspectral data: (1) unsupervised feature extraction methods including principal component analysis (PCA), independent component analysis (ICA), and minimum noise fraction (MNF); (2) supervised feature extraction including decision boundary feature extraction (DBFE), discriminate analysis feature extraction (DAFE), and nonparametric weighted feature extraction (NWFE); (3) genetic algorithm (GA). The spectral features obtained are then fed into the enhanced marker-based MSF classification algorithm. In the enhanced MSF algorithm, the markers are extracted from the classification maps obtained by both SVM and watershed segmentation algorithm. To evaluate the proposed approach, the Pavia University hyperspectral data is tested. Experimental results show that the proposed approach using GA achieves an approximately 8 % overall accuracy higher than the original MSF-based algorithm.
Drug-related webpages classification based on multi-modal local decision fusion
NASA Astrophysics Data System (ADS)
Hu, Ruiguang; Su, Xiaojing; Liu, Yanxin
2018-03-01
In this paper, multi-modal local decision fusion is used for drug-related webpages classification. First, meaningful text are extracted through HTML parsing, and effective images are chosen by the FOCARSS algorithm. Second, six SVM classifiers are trained for six kinds of drug-taking instruments, which are represented by PHOG. One SVM classifier is trained for the cannabis, which is represented by the mid-feature of BOW model. For each instance in a webpage, seven SVMs give seven labels for its image, and other seven labels are given by searching the names of drug-taking instruments and cannabis in its related text. Concatenating seven labels of image and seven labels of text, the representation of those instances in webpages are generated. Last, Multi-Instance Learning is used to classify those drugrelated webpages. Experimental results demonstrate that the classification accuracy of multi-instance learning with multi-modal local decision fusion is much higher than those of single-modal classification.
A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem
Liu, Dong-sheng; Fan, Shu-jiang
2014-01-01
In order to offer mobile customers better service, we should classify the mobile user firstly. Aimed at the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduced genetic algorithm to optimize the results of the decision tree algorithm. We also take the context information as a classification attributes for the mobile user and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. At last, we make an experiment on the mobile user with the algorithm, we can classify the mobile user into Basic service user, E-service user, Plus service user, and Total service user classes and we can also get some rules about the mobile user. Compared to C4.5 decision tree algorithm and SVM algorithm, the algorithm we proposed in this paper has higher accuracy and more simplicity. PMID:24688389
The Cross-Entropy Based Multi-Filter Ensemble Method for Gene Selection.
Sun, Yingqiang; Lu, Chengbo; Li, Xiaobo
2018-05-17
The gene expression profile has the characteristics of a high dimension, low sample, and continuous type, and it is a great challenge to use gene expression profile data for the classification of tumor samples. This paper proposes a cross-entropy based multi-filter ensemble (CEMFE) method for microarray data classification. Firstly, multiple filters are used to select the microarray data in order to obtain a plurality of the pre-selected feature subsets with a different classification ability. The top N genes with the highest rank of each subset are integrated so as to form a new data set. Secondly, the cross-entropy algorithm is used to remove the redundant data in the data set. Finally, the wrapper method, which is based on forward feature selection, is used to select the best feature subset. The experimental results show that the proposed method is more efficient than other gene selection methods and that it can achieve a higher classification accuracy under fewer characteristic genes.
NASA Astrophysics Data System (ADS)
Shenoy Handiru, Vikram; Vinod, A. P.; Guan, Cuntai
2017-08-01
Objective. In electroencephalography (EEG)-based brain-computer interface (BCI) systems for motor control tasks the conventional practice is to decode motor intentions by using scalp EEG. However, scalp EEG only reveals certain limited information about the complex tasks of movement with a higher degree of freedom. Therefore, our objective is to investigate the effectiveness of source-space EEG in extracting relevant features that discriminate arm movement in multiple directions. Approach. We have proposed a novel feature extraction algorithm based on supervised factor analysis that models the data from source-space EEG. To this end, we computed the features from the source dipoles confined to Brodmann areas of interest (BA4a, BA4p and BA6). Further, we embedded class-wise labels of multi-direction (multi-class) source-space EEG to an unsupervised factor analysis to make it into a supervised learning method. Main Results. Our approach provided an average decoding accuracy of 71% for the classification of hand movement in four orthogonal directions, that is significantly higher (>10%) than the classification accuracy obtained using state-of-the-art spatial pattern features in sensor space. Also, the group analysis on the spectral characteristics of source-space EEG indicates that the slow cortical potentials from a set of cortical source dipoles reveal discriminative information regarding the movement parameter, direction. Significance. This study presents evidence that low-frequency components in the source space play an important role in movement kinematics, and thus it may lead to new strategies for BCI-based neurorehabilitation.
Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory
ERIC Educational Resources Information Center
Lee, Won-Chan
2010-01-01
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children
ERIC Educational Resources Information Center
Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.
2018-01-01
Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
Classification with spatio-temporal interpixel class dependency contexts
NASA Technical Reports Server (NTRS)
Jeon, Byeungwoo; Landgrebe, David A.
1992-01-01
A contextual classifier which can utilize both spatial and temporal interpixel dependency contexts is investigated. After spatial and temporal neighbors are defined, a general form of maximum a posterior spatiotemporal contextual classifier is derived. This contextual classifier is simplified under several assumptions. Joint prior probabilities of the classes of each pixel and its spatial neighbors are modeled by the Gibbs random field. The classification is performed in a recursive manner to allow a computationally efficient contextual classification. Experimental results with bitemporal TM data show significant improvement of classification accuracy over noncontextual pixelwise classifiers. This spatiotemporal contextual classifier should find use in many applications of remote sensing, especially when the classification accuracy is important.
Gastric precancerous diseases classification using CNN with a concise model.
Zhang, Xu; Hu, Weiling; Chen, Fei; Liu, Jiquan; Yang, Yuanhang; Wang, Liangjing; Duan, Huilong; Si, Jianmin
2017-01-01
Gastric precancerous diseases (GPD) may deteriorate into early gastric cancer if misdiagnosed, so it is important to help doctors recognize GPD accurately and quickly. In this paper, we realize the classification of 3-class GPD, namely, polyp, erosion, and ulcer using convolutional neural networks (CNN) with a concise model called the Gastric Precancerous Disease Network (GPDNet). GPDNet introduces fire modules from SqueezeNet to reduce the model size and parameters about 10 times while improving speed for quick classification. To maintain classification accuracy with fewer parameters, we propose an innovative method called iterative reinforced learning (IRL). After training GPDNet from scratch, we apply IRL to fine-tune the parameters whose values are close to 0, and then we take the modified model as a pretrained model for the next training. The result shows that IRL can improve the accuracy about 9% after 6 iterations. The final classification accuracy of our GPDNet was 88.90%, which is promising for clinical GPD recognition.
Convolutional neural network with transfer learning for rice type classification
NASA Astrophysics Data System (ADS)
Patel, Vaibhav Amit; Joshi, Manjunath V.
2018-04-01
Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine which makes it faster with greater accuracy. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, while using transfer learning in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2- class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data boosts classification accuracy significantly.
A motion-classification strategy based on sEMG-EEG signal combination for upper-limb amputees.
Li, Xiangxin; Samuel, Oluwarotimi Williams; Zhang, Xu; Wang, Hui; Fang, Peng; Li, Guanglin
2017-01-07
Most of the modern motorized prostheses are controlled with the surface electromyography (sEMG) recorded on the residual muscles of amputated limbs. However, the residual muscles are usually limited, especially after above-elbow amputations, which would not provide enough sEMG for the control of prostheses with multiple degrees of freedom. Signal fusion is a possible approach to solve the problem of insufficient control commands, where some non-EMG signals are combined with sEMG signals to provide sufficient information for motion intension decoding. In this study, a motion-classification method that combines sEMG and electroencephalography (EEG) signals were proposed and investigated, in order to improve the control performance of upper-limb prostheses. Four transhumeral amputees without any form of neurological disease were recruited in the experiments. Five motion classes including hand-open, hand-close, wrist-pronation, wrist-supination, and no-movement were specified. During the motion performances, sEMG and EEG signals were simultaneously acquired from the skin surface and scalp of the amputees, respectively. The two types of signals were independently preprocessed and then combined as a parallel control input. Four time-domain features were extracted and fed into a classifier trained by the Linear Discriminant Analysis (LDA) algorithm for motion recognition. In addition, channel selections were performed by using the Sequential Forward Selection (SFS) algorithm to optimize the performance of the proposed method. The classification performance achieved by the fusion of sEMG and EEG signals was significantly better than that obtained by single signal source of either sEMG or EEG. An increment of more than 14% in classification accuracy was achieved when using a combination of 32-channel sEMG and 64-channel EEG. Furthermore, based on the SFS algorithm, two optimized electrode arrangements (10-channel sEMG + 10-channel EEG, 10-channel sEMG + 20-channel EEG) were obtained with classification accuracies of 84.2 and 87.0%, respectively, which were about 7.2 and 10% higher than the accuracy by using only 32-channel sEMG input. This study demonstrated the feasibility of fusing sEMG and EEG signals towards improving motion classification accuracy for above-elbow amputees, which might enhance the control performances of multifunctional myoelectric prostheses in clinical application. The study was approved by the ethics committee of Institutional Review Board of Shenzhen Institutes of Advanced Technology, and the reference number is SIAT-IRB-150515-H0077.
Automated classification of cell morphology by coherence-controlled holographic microscopy
NASA Astrophysics Data System (ADS)
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity.
Ground-based cloud classification by learning stable local binary patterns
NASA Astrophysics Data System (ADS)
Wang, Yu; Shi, Cunzhao; Wang, Chunheng; Xiao, Baihua
2018-07-01
Feature selection and extraction is the first step in implementing pattern classification. The same is true for ground-based cloud classification. Histogram features based on local binary patterns (LBPs) are widely used to classify texture images. However, the conventional uniform LBP approach cannot capture all the dominant patterns in cloud texture images, thereby resulting in low classification performance. In this study, a robust feature extraction method by learning stable LBPs is proposed based on the averaged ranks of the occurrence frequencies of all rotation invariant patterns defined in the LBPs of cloud images. The proposed method is validated with a ground-based cloud classification database comprising five cloud types. Experimental results demonstrate that the proposed method achieves significantly higher classification accuracy than the uniform LBP, local texture patterns (LTP), dominant LBP (DLBP), completed LBP (CLTP) and salient LBP (SaLBP) methods in this cloud image database and under different noise conditions. And the performance of the proposed method is comparable with that of the popular deep convolutional neural network (DCNN) method, but with less computation complexity. Furthermore, the proposed method also achieves superior performance on an independent test data set.
Automated classification of cell morphology by coherence-controlled holographic microscopy.
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Monteiro-Soares, M; Martins-Mendes, D; Vaz-Carneiro, A; Sampaio, S; Dinis-Ribeiro, M
2014-10-01
We systematically review the available systems used to classify diabetic foot ulcers in order to synthesize their methodological qualitative issues and accuracy to predict lower extremity amputation, as this may represent a critical point in these patients' care. Two investigators searched, in EBSCO, ISI, PubMed and SCOPUS databases, and independently selected studies published until May 2013 and reporting prognostic accuracy and/or reliability of specific systems for patients with diabetic foot ulcer in order to predict lower extremity amputation. We included 25 studies reporting a prevalence of lower extremity amputation between 6% and 78%. Eight different diabetic foot ulcer descriptions and seven prognostic stratification classification systems were addressed with a variable (1-9) number of factors included, specially peripheral arterial disease (n = 12) or infection at the ulcer site (n = 10) or ulcer depth (n = 10). The Meggitt-Wagner, S(AD)SAD and Texas University Classification systems were the most extensively validated, whereas ten classifications were derived or validated only once. Reliability was reported in a single study, and accuracy measures were reported in five studies with another eight allowing their calculation. Pooled accuracy ranged from 0.65 (for gangrene) to 0.74 (for infection). There are numerous classification systems for diabetic foot ulcer outcome prediction, but only few studies evaluated their reliability or external validity. Studies rarely validated several systems simultaneously and only a few reported accuracy measures. Further studies assessing reliability and accuracy of the available systems and their composing variables are needed. Copyright © 2014 John Wiley & Sons, Ltd.
Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.
Subasi, Abdulhamit
2013-06-01
Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on the basis of EMG signal to classify into normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into the frequency sub-bands using discrete wavelet transform (DWT) and a set of statistical features were extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results obviously validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.
Decoding English Alphabet Letters Using EEG Phase Information
Wang, YiYan; Wang, Pingxiao; Yu, Yuguo
2018-01-01
Increasing evidence indicates that the phase pattern and power of the low frequency oscillations of brain electroencephalograms (EEG) contain significant information during the human cognition of sensory signals such as auditory and visual stimuli. Here, we investigate whether and how the letters of the alphabet can be directly decoded from EEG phase and power data. In addition, we investigate how different band oscillations contribute to the classification and determine the critical time periods. An English letter recognition task was assigned, and statistical analyses were conducted to decode the EEG signal corresponding to each letter visualized on a computer screen. We applied support vector machine (SVM) with gradient descent method to learn the potential features for classification. It was observed that the EEG phase signals have a higher decoding accuracy than the oscillation power information. Low-frequency theta and alpha oscillations have phase information with higher accuracy than do other bands. The decoding performance was best when the analysis period began from 180 to 380 ms after stimulus presentation, especially in the lateral occipital and posterior temporal scalp regions (PO7 and PO8). These results may provide a new approach for brain-computer interface techniques (BCI) and may deepen our understanding of EEG oscillations in cognition. PMID:29467615
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K- Nearest Neighbor (KNN) is a good classifier, but from several studies, the result performance accuracy of KNN still lower than other methods. One of the causes of the low accuracy produced, because each attribute has the same effect on the classification process, while some less relevant characteristics lead to miss-classification of the class assignment for new data. In this research, we proposed Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio as a parameter to see the correlation between each attribute in the data and the Gain Ratio also will be used as the basis for weighting each attribute of the dataset. The accuracy of results is compared to the accuracy acquired from the original KNN method using 10-fold Cross-Validation with several datasets from the UCI Machine Learning repository and KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the result of the test, the proposed method was able to increase the classification accuracy of KNN, where the highest difference of accuracy obtained hayes-roth dataset is worth 12.73%, and the lowest difference of accuracy obtained in the abalone dataset of 0.07%. The average result of the accuracy of all dataset increases the accuracy by 5.33%.
NASA Technical Reports Server (NTRS)
Myint, Soe W.; Mesev, Victor; Quattrochi, Dale; Wentz, Elizabeth A.
2013-01-01
Remote sensing methods used to generate base maps to analyze the urban environment rely predominantly on digital sensor data from space-borne platforms. This is due in part from new sources of high spatial resolution data covering the globe, a variety of multispectral and multitemporal sources, sophisticated statistical and geospatial methods, and compatibility with GIS data sources and methods. The goal of this chapter is to review the four groups of classification methods for digital sensor data from space-borne platforms; per-pixel, sub-pixel, object-based (spatial-based), and geospatial methods. Per-pixel methods are widely used methods that classify pixels into distinct categories based solely on the spectral and ancillary information within that pixel. They are used for simple calculations of environmental indices (e.g., NDVI) to sophisticated expert systems to assign urban land covers. Researchers recognize however, that even with the smallest pixel size the spectral information within a pixel is really a combination of multiple urban surfaces. Sub-pixel classification methods therefore aim to statistically quantify the mixture of surfaces to improve overall classification accuracy. While within pixel variations exist, there is also significant evidence that groups of nearby pixels have similar spectral information and therefore belong to the same classification category. Object-oriented methods have emerged that group pixels prior to classification based on spectral similarity and spatial proximity. Classification accuracy using object-based methods show significant success and promise for numerous urban 3 applications. Like the object-oriented methods that recognize the importance of spatial proximity, geospatial methods for urban mapping also utilize neighboring pixels in the classification process. The primary difference though is that geostatistical methods (e.g., spatial autocorrelation methods) are utilized during both the pre- and post-classification steps. Within this chapter, each of the four approaches is described in terms of scale and accuracy classifying urban land use and urban land cover; and for its range of urban applications. We demonstrate the overview of four main classification groups in Figure 1 while Table 1 details the approaches with respect to classification requirements and procedures (e.g., reflectance conversion, steps before training sample selection, training samples, spatial approaches commonly used, classifiers, primary inputs for classification, output structures, number of output layers, and accuracy assessment). The chapter concludes with a brief summary of the methods reviewed and the challenges that remain in developing new classification methods for improving the efficiency and accuracy of mapping urban areas.
NASA Astrophysics Data System (ADS)
Shupe, Scott Marshall
2000-10-01
Vegetation mapping in and regions facilitates ecological studies, land management, and provides a record to which future land changes can be compared. Accurate and representative mapping of desert vegetation requires a sound field sampling program and a methodology to transform the data collected into a representative classification system. Time and cost constraints require that a remote sensing approach be used if such a classification system is to be applied on a regional scale. However, desert vegetation may be sparse and thus difficult to sense at typical satellite resolutions, especially given the problem of soil reflectance. This study was designed to address these concerns by conducting vegetation mapping research using field and satellite data from the US Army Yuma Proving Ground (USYPG) in Southwest Arizona. Line and belt transect data from the Army's Land Condition Trend Analysis (LCTA) Program were transformed into relative cover and relative density classification schemes using cluster analysis. Ordination analysis of the same data produced two and three-dimensional graphs on which the homogeneity of each vegetation class could be examined. It was found that the use of correspondence analysis (CA), detrended correspondence analysis (DCA), and non-metric multidimensional scaling (NMS) ordination methods was superior to the use of any single ordination method for helping to clarify between-class and within-class relationships in vegetation composition. Analysis of these between-class and within-class relationships were of key importance in examining how well relative cover and relative density schemes characterize the USYPG vegetation. Using these two classification schemes as reference data, maximum likelihood and artificial neural net classifications were then performed on a coregistered dataset consisting of a summer Landsat Thematic Mapper (TM) image, one spring and one summer ERS-1 microwave image, and elevation, slope, and aspect layers. Classifications using a combination of ERS-1 imagery and elevation, slope, and aspect data were superior to classifications carried out using Landsat TM data alone. In all classification iterations it was consistently found that the highest classification accuracy was obtained by using a combination of Landsat TM, ERS-1, and elevation, slope, and aspect data. Maximum likelihood classification accuracy was found to be higher than artificial neural net classification in all cases.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we achieve a few results about the prediction accuracies of ensemble methods for binary classification that are missed or misinterpreted in previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper and lower bounds accuracies by random individual classifiers and better algorithms need to be developed. PMID:21853162
Bauer, Robert; Fels, Meike; Royter, Vladislav; Raco, Valerio; Gharabaghi, Alireza
2016-09-01
Considering self-rated mental effort during neurofeedback may improve training of brain self-regulation. Twenty-one healthy, right-handed subjects performed kinesthetic motor imagery of opening their left hand, while threshold-based classification of beta-band desynchronization resulted in proprioceptive robotic feedback. The experiment consisted of two blocks in a cross-over design. The participants rated their perceived mental effort nine times per block. In the adaptive block, the threshold was adjusted on the basis of these ratings whereas adjustments were carried out at random in the other block. Electroencephalography was used to examine the cortical activation patterns during the training sessions. The perceived mental effort was correlated with the difficulty threshold of neurofeedback training. Adaptive threshold-setting reduced mental effort and increased the classification accuracy and positive predictive value. This was paralleled by an inter-hemispheric cortical activation pattern in low frequency bands connecting the right frontal and left parietal areas. Optimal balance of mental effort was achieved at thresholds significantly higher than maximum classification accuracy. Rating of mental effort is a feasible approach for effective threshold-adaptation during neurofeedback training. Closed-loop adaptation of the neurofeedback difficulty level facilitates reinforcement learning of brain self-regulation. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
The Analysis of Surface EMG Signals with the Wavelet-Based Correlation Dimension Method
Zhang, Yanyan; Wang, Jue
2014-01-01
Many attempts have been made to effectively improve a prosthetic system controlled by the classification of surface electromyographic (SEMG) signals. Recently, the development of methodologies to extract the effective features still remains a primary challenge. Previous studies have demonstrated that the SEMG signals have nonlinear characteristics. In this study, by combining the nonlinear time series analysis and the time-frequency domain methods, we proposed the wavelet-based correlation dimension method to extract the effective features of SEMG signals. The SEMG signals were firstly analyzed by the wavelet transform and the correlation dimension was calculated to obtain the features of the SEMG signals. Then, these features were used as the input vectors of a Gustafson-Kessel clustering classifier to discriminate four types of forearm movements. Our results showed that there are four separate clusters corresponding to different forearm movements at the third resolution level and the resulting classification accuracy was 100%, when two channels of SEMG signals were used. This indicates that the proposed approach can provide important insight into the nonlinear characteristics and the time-frequency domain features of SEMG signals and is suitable for classifying different types of forearm movements. By comparing with other existing methods, the proposed method exhibited more robustness and higher classification accuracy. PMID:24868240
The Effect of Sub-Aperture in DRIA Framework Applied on Multi-Aspect PolSAR Data
NASA Astrophysics Data System (ADS)
Xue, Feiteng; Yin, Qiang; Lin, Yun; Hong, Wen
2016-08-01
Multi-aspect SAR is a new remote sensing technology, achieves consecutive data in large look angle as platform moves. Multi- aspect observation brings higher resolution and SNR to SAR picture. Multi-aspect PolSAR data can increase the accuracy of target identify and classification because it contains the 3-D polarimetric scattering properties.DRIA(detecting-removing-incoherent-adding)framework is a multi-aspect PolSAR data processing method. In this method, the anisotropic and isotropic scattering is separated by maximum- likelihood ratio test. The anisotropic scattering is removed to gain a removal series. The isotropic scattering is incoherent added to gain a high resolution picture. The removal series describes the anisotropic scattering property and is used in features extraction and classification.This article focuses on the effect brought by difference of sub-aperture numbers in anisotropic scattering detection and removal. The more sub-apertures are, the less look angle is. Artificial target has anisotropic scattering because of Bragg resonances. The increase of sub-aperture number brings more accurate observation in azimuth though the quality of each single image may loss. The accuracy of classification in agricultural fields is affected by the anisotropic scattering brought by Bragg resonances. The size of the sub-aperture has a significant effect in the removal result of Bragg resonances.
A Brain-Machine Interface Based on ERD/ERS for an Upper-Limb Exoskeleton Control.
Tang, Zhichuan; Sun, Shouqian; Zhang, Sanyuan; Chen, Yumiao; Li, Chao; Chen, Shi
2016-12-02
To recognize the user's motion intention, brain-machine interfaces (BMI) usually decode movements from cortical activity to control exoskeletons and neuroprostheses for daily activities. The aim of this paper is to investigate whether self-induced variations of the electroencephalogram (EEG) can be useful as control signals for an upper-limb exoskeleton developed by us. A BMI based on event-related desynchronization/synchronization (ERD/ERS) is proposed. In the decoder-training phase, we investigate the offline classification performance of left versus right hand and left hand versus both feet by using motor execution (ME) or motor imagery (MI). The results indicate that the accuracies of ME sessions are higher than those of MI sessions, and left hand versus both feet paradigm achieves a better classification performance, which would be used in the online-control phase. In the online-control phase, the trained decoder is tested in two scenarios (wearing or without wearing the exoskeleton). The MI and ME sessions wearing the exoskeleton achieve mean classification accuracy of 84.29% ± 2.11% and 87.37% ± 3.06%, respectively. The present study demonstrates that the proposed BMI is effective to control the upper-limb exoskeleton, and provides a practical method by non-invasive EEG signal associated with human natural behavior for clinical applications.
Landenburger, L.; Lawrence, R.L.; Podruzny, S.; Schwartz, C.C.
2008-01-01
Moderate resolution satellite imagery traditionally has been thought to be inadequate for mapping vegetation at the species level. This has made comprehensive mapping of regional distributions of sensitive species, such as whitebark pine, either impractical or extremely time consuming. We sought to determine whether using a combination of moderate resolution satellite imagery (Landsat Enhanced Thematic Mapper Plus), extensive stand data collected by land management agencies for other purposes, and modern statistical classification techniques (boosted classification trees) could result in successful mapping of whitebark pine. Overall classification accuracies exceeded 90%, with similar individual class accuracies. Accuracies on a localized basis varied based on elevation. Accuracies also varied among administrative units, although we were not able to determine whether these differences related to inherent spatial variations or differences in the quality of available reference data.
NASA Technical Reports Server (NTRS)
Wrigley, R. C.; Acevedo, W.; Alexander, D.; Buis, J.; Card, D.
1984-01-01
An experiment of a factorial design was conducted to test the effects on classification accuracy of land cover types due to the improved spatial, spectral and radiometric characteristics of the Thematic Mapper (TM) in comparison to the Multispectral Scanner (MSS). High altitude aircraft scanner data from the Airborne Thematic Mapper instrument was acquired over central California in August, 1983 and used to simulate Thematic Mapper data as well as all combinations of the three characteristics for eight data sets in all. Results for the training sites (field center pixels) showed better classification accuracies for MSS spatial resolution, TM spectral bands and TM radiometry in order of importance.
Gemignani, Jessica; Middell, Eike; Barbour, Randall L; Graber, Harry L; Blankertz, Benjamin
2018-04-04
The statistical analysis of functional near infrared spectroscopy (fNIRS) data based on the general linear model (GLM) is often made difficult by serial correlations, high inter-subject variability of the hemodynamic response, and the presence of motion artifacts. In this work we propose to extract information on the pattern of hemodynamic activations without using any a priori model for the data, by classifying the channels as 'active' or 'not active' with a multivariate classifier based on linear discriminant analysis (LDA). This work is developed in two steps. First we compared the performance of the two analyses, using a synthetic approach in which simulated hemodynamic activations were combined with either simulated or real resting-state fNIRS data. This procedure allowed for exact quantification of the classification accuracies of GLM and LDA. In the case of real resting-state data, the correlations between classification accuracy and demographic characteristics were investigated by means of a Linear Mixed Model. In the second step, to further characterize the reliability of the newly proposed analysis method, we conducted an experiment in which participants had to perform a simple motor task and data were analyzed with the LDA-based classifier as well as with the standard GLM analysis. The results of the simulation study show that the LDA-based method achieves higher classification accuracies than the GLM analysis, and that the LDA results are more uniform across different subjects and, in contrast to the accuracies achieved by the GLM analysis, have no significant correlations with any of the demographic characteristics. Findings from the real-data experiment are consistent with the results of the real-plus-simulation study, in that the GLM-analysis results show greater inter-subject variability than do the corresponding LDA results. The results obtained suggest that the outcome of GLM analysis is highly vulnerable to violations of theoretical assumptions, and that therefore a data-driven approach such as that provided by the proposed LDA-based method is to be favored.
Sex estimation of the tibia in modern Turkish: A computed tomography study.
Ekizoglu, Oguzhan; Er, Ali; Bozdag, Mustafa; Akcaoglu, Mustafa; Can, Ismail Ozgur; García-Donas, Julieta G; Kranioti, Elena F
2016-11-01
The utilization of computed tomography is beneficial for the analysis of skeletal remains and it has important advantages for anthropometric studies. The present study investigated morphometry of left tibia using CT images of a contemporary Turkish population. Seven parameters were measured on 203 individuals (124 males and 79 females) within the 19-92-years age group. The first objective of this study was to provide population-specific sex estimation equations for the contemporary Turkish population based on CT images. A second objective was to test the sex estimation formulae on Southern Europeans by Kranioti and Apostol (2015). Univariate discriminant functions resulted in classification accuracy that ranged from 66 to 86%. The best single variable was found to be upper epiphyseal breadth (86%) followed by lower epiphyseal breadth (85%). Multivariate discriminant functions resulted in classification accuracy for cross-validated data ranged from 79 to 86%. Applying the multivariate sex estimation formulae on Southern Europeans (SE) by Kranioti and Apostol in our sample resulted in very high classification accuracy ranging from 81 to 88%. In addition, 35.5-47% of the total Turkish sample is correctly classified with over 95% posterior probability, which is actually higher than the one reported for the original sample (25-43%). We conclude that the tibia is a very useful bone for sex estimation in the contemporary Turkish population. Moreover, our test results support the hypothesis that the SE formulae are sufficient for the contemporary Turkish population and they can be used safely for criminal investigations when posterior probabilities are over 95%. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Sarkar, Atasi; Sengupta, Sanghamitra; Mukherjee, Anirban; Chatterjee, Jyotirmoy
2017-02-01
Infra red (IR) spectral characterization can provide label-free cellular metabolic signatures of normal and diseased circumstances in a rapid and non-invasive manner. Present study endeavoured to enlist Fourier transform infra red (FTIR) spectroscopic signatures for lung normal and cancer cells during chemically induced epithelial mesenchymal transition (EMT) for which global metabolic dimension is not well reported yet. Occurrence of EMT was validated with morphological and immunocytochemical confirmation. Pre-processed spectral data was analyzed using ANOVA and principal component analysis-linear discriminant analysis (PCA-LDA). Significant differences observed in peak area corresponding to biochemical fingerprint (900-1800 cm- 1) and high wave-number (2800-3800 cm- 1) regions contributed to adequate PCA-LDA segregation of cells undergoing EMT. The findings were validated by re-analysis of data using another in-house built binary classifier namely vector valued regularized kernel approximation (VVRKFA), in order to understand EMT progression. To improve the classification accuracy, forward feature selection (FFS) tool was employed in extracting potent spectral signatures by eliminating undesirable noise. Gradual increase in classification accuracy with EMT progression of both cell types indicated prominence of the biochemical alterations. Rapid changes in cellular metabolome noted in cancer cells within first 24 h of EMT induction along with higher classification accuracy for cancer cell groups in comparison to normal cells might be attributed to inherent differences between them. Spectral features were suggestive of EMT triggered changes in nucleic acid, protein, lipid and bound water contents which can emerge as the useful markers to capture EMT related cellular characteristics.
Sex estimation standards for medieval and contemporary Croats
Bašić, Željana; Kružić, Ivana; Jerković, Ivan; Anđelinović, Deny; Anđelinović, Šimun
2017-01-01
Aim To develop discriminant functions for sex estimation on medieval Croatian population and test their application on contemporary Croatian population. Methods From a total of 519 skeletons, we chose 84 adult excellently preserved skeletons free of antemortem and postmortem changes and took all standard measurements. Sex was estimated/determined using standard anthropological procedures and ancient DNA (amelogenin analysis) where pelvis was insufficiently preserved or where sex morphological indicators were not consistent. We explored which measurements showed sexual dimorphism and used them for developing univariate and multivariate discriminant functions for sex estimation. We included only those functions that reached accuracy rate ≥80%. We tested the applicability of developed functions on modern Croatian sample (n = 37). Results From 69 standard skeletal measurements used in this study, 56 of them showed statistically significant sexual dimorphism (74.7%). We developed five univariate discriminant functions with classification rate 80.6%-85.2% and seven multivariate discriminant functions with an accuracy rate of 81.8%-93.0%. When tested on the modern population functions showed classification rates 74.1%-100%, and ten of them reached aimed accuracy rate. Females showed higher classification rates in the medieval populations, whereas males were better classified in the modern populations. Conclusion Developed discriminant functions are sufficiently accurate for reliable sex estimation in both medieval Croatian population and modern Croatian samples and may be used in forensic settings. The methodological issues that emerged regarding the importance of considering external factors in development and application of discriminant functions for sex estimation should be further explored. PMID:28613039
Multiple confidence estimates as indices of eyewitness memory.
Sauer, James D; Brewer, Neil; Weber, Nathan
2008-08-01
Eyewitness identification decisions are vulnerable to various influences on witnesses' decision criteria that contribute to false identifications of innocent suspects and failures to choose perpetrators. An alternative procedure using confidence estimates to assess the degree of match between novel and previously viewed faces was investigated. Classification algorithms were applied to participants' confidence data to determine when a confidence value or pattern of confidence values indicated a positive response. Experiment 1 compared confidence group classification accuracy with a binary decision control group's accuracy on a standard old-new face recognition task and found superior accuracy for the confidence group for target-absent trials but not for target-present trials. Experiment 2 used a face mini-lineup task and found reduced target-present accuracy offset by large gains in target-absent accuracy. Using a standard lineup paradigm, Experiments 3 and 4 also found improved classification accuracy for target-absent lineups and, with a more sophisticated algorithm, for target-present lineups. This demonstrates the accessibility of evidence for recognition memory decisions and points to a more sensitive index of memory quality than is afforded by binary decisions.
Good Practices for Learning to Recognize Actions Using FV and VLAD.
Wu, Jianxin; Zhang, Yu; Lin, Weiyao
2016-12-01
High dimensional representations such as Fisher vectors (FV) and vectors of locally aggregated descriptors (VLAD) have shown state-of-the-art accuracy for action recognition in videos. The high dimensionality, on the other hand, also causes computational difficulties when scaling up to large-scale video data. This paper makes three lines of contributions to learning to recognize actions using high dimensional representations. First, we reviewed several existing techniques that improve upon FV or VLAD in image classification, and performed extensive empirical evaluations to assess their applicability for action recognition. Our analyses of these empirical results show that normality and bimodality are essential to achieve high accuracy. Second, we proposed a new pooling strategy for VLAD and three simple, efficient, and effective transformations for both FV and VLAD. Both proposed methods have shown higher accuracy than the original FV/VLAD method in extensive evaluations. Third, we proposed and evaluated new feature selection and compression methods for the FV and VLAD representations. This strategy uses only 4% of the storage of the original representation, but achieves comparable or even higher accuracy. Based on these contributions, we recommend a set of good practices for action recognition in videos for practitioners in this field.
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
NASA Technical Reports Server (NTRS)
Stoner, E. R.; May, G. A.; Kalcic, M. T. (Principal Investigator)
1981-01-01
Sample segments of ground-verified land cover data collected in conjunction with the USDA/ESS June Enumerative Survey were merged with LANDSAT data and served as a focus for unsupervised spectral class development and accuracy assessment. Multitemporal data sets were created from single-date LANDSAT MSS acquisitions from a nominal scene covering an eleven-county area in north central Missouri. Classification accuracies for the four land cover types predominant in the test site showed significant improvement in going from unitemporal to multitemporal data sets. Transformed LANDSAT data sets did not significantly improve classification accuracies. Regression estimators yielded mixed results for different land covers. Misregistration of two LANDSAT data sets by as much and one half pixels did not significantly alter overall classification accuracies. Existing algorithms for scene-to scene overlay proved adequate for multitemporal data analysis as long as statistical class development and accuracy assessment were restricted to field interior pixels.
Examining the Classification Accuracy of a Vocabulary Screening Measure with Preschool Children
ERIC Educational Resources Information Center
Marcotte, Amanda M.; Clemens, Nathan H.; Parker, Christopher; Whitcomb, Sara A.
2016-01-01
This study investigated the classification accuracy of the "Dynamic Indicators of Vocabulary Skills" (DIVS) as a preschool vocabulary screening measure. With a sample of 240 preschoolers, fall and winter DIVS scores were used to predict year-end vocabulary risk using the 25th percentile on the "Peabody Picture Vocabulary Test--Third…
ERIC Educational Resources Information Center
Daniels, Brian; Volpe, Robert J.; Fabiano, Gregory A.; Briesch, Amy M.
2017-01-01
This study examines the classification accuracy and teacher acceptability of a problem-focused screener for academic and disruptive behavior problems, which is directly linked to evidence-based intervention. Participants included 39 classroom teachers from 2 public school districts in the Northeastern United States. Teacher ratings were obtained…
ERIC Educational Resources Information Center
Zhang, Bo
2010-01-01
This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…
ERIC Educational Resources Information Center
Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.
2016-01-01
The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…
ERIC Educational Resources Information Center
Pena, Elizabeth D.; Gillam, Ronald B.; Malek, Melynn; Ruiz-Felter, Roxanna; Resendiz, Maria; Fiestas, Christine; Sabel, Tracy
2006-01-01
Two experiments examined reliability and classification accuracy of a narration-based dynamic assessment task. Purpose: The first experiment evaluated whether parallel results were obtained from stories created in response to 2 different wordless picture books. If so, the tasks and measures would be appropriate for assessing pretest and posttest…
The Potential Impact of Not Being Able to Create Parallel Tests on Expected Classification Accuracy
ERIC Educational Resources Information Center
Wyse, Adam E.
2011-01-01
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Emotion recognition from multichannel EEG signals using K-nearest neighbor classification.
Li, Mi; Xu, Hongpei; Liu, Xingwang; Lu, Shengfu
2018-04-27
Many studies have been done on the emotion recognition based on multi-channel electroencephalogram (EEG) signals. This paper explores the influence of the emotion recognition accuracy of EEG signals in different frequency bands and different number of channels. We classified the emotional states in the valence and arousal dimensions using different combinations of EEG channels. Firstly, DEAP default preprocessed data were normalized. Next, EEG signals were divided into four frequency bands using discrete wavelet transform, and entropy and energy were calculated as features of K-nearest neighbor Classifier. The classification accuracies of the 10, 14, 18 and 32 EEG channels based on the Gamma frequency band were 89.54%, 92.28%, 93.72% and 95.70% in the valence dimension and 89.81%, 92.24%, 93.69% and 95.69% in the arousal dimension. As the number of channels increases, the classification accuracy of emotional states also increases, the classification accuracy of the gamma frequency band is greater than that of the beta frequency band followed by the alpha and theta frequency bands. This paper provided better frequency bands and channels reference for emotion recognition based on EEG.
NASA Astrophysics Data System (ADS)
Du, Peijun; Tan, Kun; Xing, Xiaoshi
2010-12-01
Combining Support Vector Machine (SVM) with wavelet analysis, we constructed wavelet SVM (WSVM) classifier based on wavelet kernel functions in Reproducing Kernel Hilbert Space (RKHS). In conventional kernel theory, SVM is faced with the bottleneck of kernel parameter selection which further results in time-consuming and low classification accuracy. The wavelet kernel in RKHS is a kind of multidimensional wavelet function that can approximate arbitrary nonlinear functions. Implications on semiparametric estimation are proposed in this paper. Airborne Operational Modular Imaging Spectrometer II (OMIS II) hyperspectral remote sensing image with 64 bands and Reflective Optics System Imaging Spectrometer (ROSIS) data with 115 bands were used to experiment the performance and accuracy of the proposed WSVM classifier. The experimental results indicate that the WSVM classifier can obtain the highest accuracy when using the Coiflet Kernel function in wavelet transform. In contrast with some traditional classifiers, including Spectral Angle Mapping (SAM) and Minimum Distance Classification (MDC), and SVM classifier using Radial Basis Function kernel, the proposed wavelet SVM classifier using the wavelet kernel function in Reproducing Kernel Hilbert Space is capable of improving classification accuracy obviously.
NASA Astrophysics Data System (ADS)
Hwang, Han-Jeong; Lim, Jeong-Hwan; Kim, Do-Won; Im, Chang-Hwan
2014-07-01
A number of recent studies have demonstrated that near-infrared spectroscopy (NIRS) is a promising neuroimaging modality for brain-computer interfaces (BCIs). So far, most NIRS-based BCI studies have focused on enhancing the accuracy of the classification of different mental tasks. In the present study, we evaluated the performances of a variety of mental task combinations in order to determine the mental task pairs that are best suited for customized NIRS-based BCIs. To this end, we recorded event-related hemodynamic responses while seven participants performed eight different mental tasks. Classification accuracies were then estimated for all possible pairs of the eight mental tasks (C=28). Based on this analysis, mental task combinations with relatively high classification accuracies frequently included the following three mental tasks: "mental multiplication," "mental rotation," and "right-hand motor imagery." Specifically, mental task combinations consisting of two of these three mental tasks showed the highest mean classification accuracies. It is expected that our results will be a useful reference to reduce the time needed for preliminary tests when discovering individual-specific mental task combinations.
Impacts of land use/cover classification accuracy on regional climate simulations
NASA Astrophysics Data System (ADS)
Ge, Jianjun; Qi, Jiaguo; Lofgren, Brent M.; Moore, Nathan; Torbick, Nathan; Olson, Jennifer M.
2007-03-01
Land use/cover change has been recognized as a key component in global change. Various land cover data sets, including historically reconstructed, recently observed, and future projected, have been used in numerous climate modeling studies at regional to global scales. However, little attention has been paid to the effect of land cover classification accuracy on climate simulations, though accuracy assessment has become a routine procedure in land cover production community. In this study, we analyzed the behavior of simulated precipitation in the Regional Atmospheric Modeling System (RAMS) over a range of simulated classification accuracies over a 3 month period. This study found that land cover accuracy under 80% had a strong effect on precipitation especially when the land surface had a greater control of the atmosphere. This effect became stronger as the accuracy decreased. As shown in three follow-on experiments, the effect was further influenced by model parameterizations such as convection schemes and interior nudging, which can mitigate the strength of surface boundary forcings. In reality, land cover accuracy rarely obtains the commonly recommended 85% target. Its effect on climate simulations should therefore be considered, especially when historically reconstructed and future projected land covers are employed.
NASA Astrophysics Data System (ADS)
YangDai, Tianyi; Zhang, Li
2016-02-01
Energy dispersive X-ray diffraction (EDXRD) combined with hybrid discriminant analysis (HDA) has been utilized for classifying the liquid materials for the first time. The XRD spectra of 37 kinds of liquid contrabands and daily supplies were obtained using an EDXRD test bed facility. The unique spectra of different samples reveal XRD's capability to distinguish liquid contrabands from daily supplies. In order to create a system to detect liquid contrabands, the diffraction spectra were subjected to HDA which is the combination of principal components analysis (PCA) and linear discriminant analysis (LDA). Experiments based on the leave-one-out method demonstrate that HDA is a practical method with higher classification accuracy and lower noise sensitivity than the other methods in this application. The study shows the great capability and potential of the combination of XRD and HDA for liquid contrabands classification.
NASA Astrophysics Data System (ADS)
Liu, Wanjun; Liang, Xuejian; Qu, Haicheng
2017-11-01
Hyperspectral image (HSI) classification is one of the most popular topics in remote sensing community. Traditional and deep learning-based classification methods were proposed constantly in recent years. In order to improve the classification accuracy and robustness, a dimensionality-varied convolutional neural network (DVCNN) was proposed in this paper. DVCNN was a novel deep architecture based on convolutional neural network (CNN). The input of DVCNN was a set of 3D patches selected from HSI which contained spectral-spatial joint information. In the following feature extraction process, each patch was transformed into some different 1D vectors by 3D convolution kernels, which were able to extract features from spectral-spatial data. The rest of DVCNN was about the same as general CNN and processed 2D matrix which was constituted by by all 1D data. So that the DVCNN could not only extract more accurate and rich features than CNN, but also fused spectral-spatial information to improve classification accuracy. Moreover, the robustness of network on water-absorption bands was enhanced in the process of spectral-spatial fusion by 3D convolution, and the calculation was simplified by dimensionality varied convolution. Experiments were performed on both Indian Pines and Pavia University scene datasets, and the results showed that the classification accuracy of DVCNN improved by 32.87% on Indian Pines and 19.63% on Pavia University scene than spectral-only CNN. The maximum accuracy improvement of DVCNN achievement was 13.72% compared with other state-of-the-art HSI classification methods, and the robustness of DVCNN on water-absorption bands noise was demonstrated.
Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification
Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander
2013-01-01
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays (”biological” similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
Classification of ECG beats using deep belief network and active learning.
G, Sayantan; T, Kien P; V, Kadambari K
2018-04-12
A new semi-supervised approach based on deep learning and active learning for classification of electrocardiogram signals (ECG) is proposed. The objective of the proposed work is to model a scientific method for classification of cardiac irregularities using electrocardiogram beats. The model follows the Association for the Advancement of medical instrumentation (AAMI) standards and consists of three phases. In phase I, feature representation of ECG is learnt using Gaussian-Bernoulli deep belief network followed by a linear support vector machine (SVM) training in the consecutive phase. It yields three deep models which are based on AAMI-defined classes, namely N, V, S, and F. In the last phase, a query generator is introduced to interact with the expert to label few beats to improve accuracy and sensitivity. The proposed approach depicts significant improvement in accuracy with minimal queries posed to the expert and fast online training as tested on the MIT-BIH Arrhythmia Database and the MIT-BIH Supra-ventricular Arrhythmia Database (SVDB). With 100 queries labeled by the expert in phase III, the method achieves an accuracy of 99.5% in "S" versus all classifications (SVEB) and 99.4% accuracy in "V " versus all classifications (VEB) on MIT-BIH Arrhythmia Database. In a similar manner, it is attributed that an accuracy of 97.5% for SVEB and 98.6% for VEB on SVDB database is achieved respectively. Graphical Abstract Reply- Deep belief network augmented by active learning for efficient prediction of arrhythmia.
NASA Astrophysics Data System (ADS)
Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.
2018-04-01
The purpose of carrying out national geographic conditions monitoring is to obtain information of surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprise and public. Land cover data contains detailed geographic conditions information, thus has been listed as one of the important achievements in the national geographic conditions monitoring project. At present, the main issue of the production of the land cover data is about how to improve the classification accuracy. For the land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection is mainly based on human-computer interaction or manual inspection in the project, which are time consuming and laborious. By harnessing the automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out the classification accuracy inspection test of land cover data in the project, and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update error, and effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of the land cover data.
Hierarchical vs non-hierarchical audio indexation and classification for video genres
NASA Astrophysics Data System (ADS)
Dammak, Nouha; BenAyed, Yassine
2018-04-01
In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based on only audio features extracted at block level, which has a prominent asset by capturing local temporal information. The main contribution of our study is to show the wide effect on the classification accuracies while using an hierarchical categorization structure based on Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. In fact, the classification consists in three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video span yielding a classification accuracy of over 99%.
Acosta-Mesa, Héctor-Gabriel; Rechy-Ramírez, Fernando; Mezura-Montes, Efrén; Cruz-Ramírez, Nicandro; Hernández Jiménez, Rodolfo
2014-06-01
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals in which the length and amplitude of the time series should be compressed, preserving the important information for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function which considers three criteria: the entropy regarding the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated using a time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known times series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
Cognitive-motivational deficits in ADHD: development of a classification system.
Gupta, Rashmi; Kar, Bhoomika R; Srinivasan, Narayanan
2011-01-01
The classification systems developed so far to detect attention deficit/hyperactivity disorder (ADHD) do not have high sensitivity and specificity. We have developed a classification system based on several neuropsychological tests that measure cognitive-motivational functions that are specifically impaired in ADHD children. A total of 240 (120 ADHD children and 120 healthy controls) children in the age range of 6-9 years and 32 Oppositional Defiant Disorder (ODD) children (aged 9 years) participated in the study. Stop-Signal, Task-Switching, Attentional Network, and Choice Delay tests were administered to all the participants. Receiver operating characteristic (ROC) analysis indicated that percentage choice of long-delay reward best classified the ADHD children from healthy controls. Single parameters were not helpful in making a differential classification of ADHD with ODD. Multinominal logistic regression (MLR) was performed with multiple parameters (data fusion) that produced improved overall classification accuracy. A combination of stop-signal reaction time, posterror-slowing, mean delay, switch cost, and percentage choice of long-delay reward produced an overall classification accuracy of 97.8%; with internal validation, the overall accuracy was 92.2%. Combining parameters from different tests of control functions not only enabled us to accurately classify ADHD children from healthy controls but also in making a differential classification with ODD. These results have implications for the theories of ADHD.
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is noticeably practiced in Mediterranean regions and imposes substantial complications in several satellite image classification methods. To some extent, high spatial resolution data were able to overcome such complications. For better classification performances in Land Use Land Cover (LULC) mapping, the current research adopts different classification methods comparison for LULC mapping using Sentinel-2 satellite as a source of high spatial resolution. Both of pixel-based and an object-based classification algorithms were assessed; the pixel-based approach employs Maximum Likelihood (ML), Artificial Neural Network (ANN) algorithms, Support Vector Machine (SVM), and, the object-based classification uses the Nearest Neighbour (NN) classifier. Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall by distinguishing most of the classes with the highest accuracy. NN succeeded to deal with artificial surface classes in general while agriculture area classes, and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.
Weighted statistical parameters for irregularly sampled time series
NASA Astrophysics Data System (ADS)
Rimoldini, Lorenzo
2014-01-01
Unevenly spaced time series are common in astronomy because of the day-night cycle, weather conditions, dependence on the source position in the sky, allocated telescope time and corrupt measurements, for example, or inherent to the scanning law of satellites like Hipparcos and the forthcoming Gaia. Irregular sampling often causes clumps of measurements and gaps with no data which can severely disrupt the values of estimators. This paper aims at improving the accuracy of common statistical parameters when linear interpolation (in time or phase) can be considered an acceptable approximation of a deterministic signal. A pragmatic solution is formulated in terms of a simple weighting scheme, adapting to the sampling density and noise level, applicable to large data volumes at minimal computational cost. Tests on time series from the Hipparcos periodic catalogue led to significant improvements in the overall accuracy and precision of the estimators with respect to the unweighted counterparts and those weighted by inverse-squared uncertainties. Automated classification procedures employing statistical parameters weighted by the suggested scheme confirmed the benefits of the improved input attributes. The classification of eclipsing binaries, Mira, RR Lyrae, Delta Cephei and Alpha2 Canum Venaticorum stars employing exclusively weighted descriptive statistics achieved an overall accuracy of 92 per cent, about 6 per cent higher than with unweighted estimators.
NASA Astrophysics Data System (ADS)
Altıparmak, Hamit; Al Shahadat, Mohamad; Kiani, Ehsan; Dimililer, Kamil
2018-04-01
Robotic agriculture requires smart and doable techniques to substitute the human intelligence with machine intelligence. Strawberry is one of the important Mediterranean product and its productivity enhancement requires modern and machine-based methods. Whereas a human identifies the disease infected leaves by his eye, the machine should also be capable of vision-based disease identification. The objective of this paper is to practically verify the applicability of a new computer-vision method for discrimination between the healthy and disease infected strawberry leaves which does not require neural network or time consuming trainings. The proposed method was tested under outdoor lighting condition using a regular DLSR camera without any particular lens. Since the type and infection degree of disease is approximated a human brain a fuzzy decision maker classifies the leaves over the images captured on-site having the same properties of human vision. Optimizing the fuzzy parameters for a typical strawberry production area at a summer mid-day in Cyprus produced 96% accuracy for segmented iron deficiency and 93% accuracy for segmented using a typical human instant classification approximation as the benchmark holding higher accuracy than a human eye identifier. The fuzzy-base classifier provides approximate result for decision making on the leaf status as if it is healthy or not.
The Analysis of Object-Based Change Detection in Mining Area: a Case Study with Pingshuo Coal Mine
NASA Astrophysics Data System (ADS)
Zhang, M.; Zhou, W.; Li, Y.
2017-09-01
Accurate information on mining land use and land cover change are crucial for monitoring and environmental change studies. In this paper, RapidEye Remote Sensing Image (Map 2012) and SPOT7 Remote Sensing Image (Map 2015) in Pingshuo Mining Area are selected to monitor changes combined with object-based classification and change vector analysis method, we also used R in highresolution remote sensing image for mining land classification, and found the feasibility and the flexibility of open source software. The results show that (1) the classification of reclaimed mining land has higher precision, the overall accuracy and kappa coefficient of the classification of the change region map were 86.67 % and 89.44 %. It's obvious that object-based classification and change vector analysis which has a great significance to improve the monitoring accuracy can be used to monitor mining land, especially reclaiming mining land; (2) the vegetation area changed from 46 % to 40 % accounted for the proportion of the total area from 2012 to 2015, and most of them were transformed into the arable land. The sum of arable land and vegetation area increased from 51 % to 70 %; meanwhile, build-up land has a certain degree of increase, part of the water area was transformed into arable land, but the extent of the two changes is not obvious. The result illustrated the transformation of reclaimed mining area, at the same time, there is still some land convert to mining land, and it shows the mine is still operating, mining land use and land cover are the dynamic procedure.
Use of collateral information to improve LANDSAT classification accuracies
NASA Technical Reports Server (NTRS)
Strahler, A. H. (Principal Investigator)
1981-01-01
Methods to improve LANDSAT classification accuracies were investigated including: (1) the use of prior probabilities in maximum likelihood classification as a methodology to integrate discrete collateral data with continuously measured image density variables; (2) the use of the logit classifier as an alternative to multivariate normal classification that permits mixing both continuous and categorical variables in a single model and fits empirical distributions of observations more closely than the multivariate normal density function; and (3) the use of collateral data in a geographic information system as exercised to model a desired output information layer as a function of input layers of raster format collateral and image data base layers.
Landcover classification in MRF context using Dempster-Shafer fusion for multisensor imagery.
Sarkar, Anjan; Banerjee, Anjan; Banerjee, Nilanjan; Brahma, Siddhartha; Kartikeyan, B; Chakraborty, Manab; Majumder, K L
2005-05-01
This work deals with multisensor data fusion to obtain landcover classification. The role of feature-level fusion using the Dempster-Shafer rule and that of data-level fusion in the MRF context is studied in this paper to obtain an optimally segmented image. Subsequently, segments are validated and classification accuracy for the test data is evaluated. Two examples of data fusion of optical images and a synthetic aperture radar image are presented, each set having been acquired on different dates. Classification accuracies of the technique proposed are compared with those of some recent techniques in literature for the same image data.
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. ?? 2007 by the Ecological Society of America.
Retinal vasculature classification using novel multifractal features
NASA Astrophysics Data System (ADS)
Ding, Y.; Ward, W. O. C.; Duan, Jinming; Auer, D. P.; Gowland, Penny; Bai, L.
2015-11-01
Retinal blood vessels have been implicated in a large number of diseases including diabetic retinopathy and cardiovascular diseases, which cause damages to retinal blood vessels. The availability of retinal vessel imaging provides an excellent opportunity for monitoring and diagnosis of retinal diseases, and automatic analysis of retinal vessels will help with the processes. However, state of the art vascular analysis methods such as counting the number of branches or measuring the curvature and diameter of individual vessels are unsuitable for the microvasculature. There has been published research using fractal analysis to calculate fractal dimensions of retinal blood vessels, but so far there has been no systematic research extracting discriminant features from retinal vessels for classifications. This paper introduces new methods for feature extraction from multifractal spectra of retinal vessels for classification. Two publicly available retinal vascular image databases are used for the experiments, and the proposed methods have produced accuracies of 85.5% and 77% for classification of healthy and diabetic retinal vasculatures. Experiments show that classification with multiple fractal features produces better rates compared with methods using a single fractal dimension value. In addition to this, experiments also show that classification accuracy can be affected by the accuracy of vessel segmentation algorithms.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E; Moran, Emilio
2008-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E.; Moran, Emilio
2009-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin. PMID:19789716
Ralston, Barbara E.; Davis, Philip A.; Weber, Robert M.; Rundall, Jill M.
2008-01-01
A vegetation database of the riparian vegetation located within the Colorado River ecosystem (CRE), a subsection of the Colorado River between Glen Canyon Dam and the western boundary of Grand Canyon National Park, was constructed using four-band image mosaics acquired in May 2002. A digital line scanner was flown over the Colorado River corridor in Arizona by ISTAR Americas, using a Leica ADS-40 digital camera to acquire a digital surface model and four-band image mosaics (blue, green, red, and near-infrared) for vegetation mapping. The primary objective of this mapping project was to develop a digital inventory map of vegetation to enable patch- and landscape-scale change detection, and to establish randomized sampling points for ground surveys of terrestrial fauna (principally, but not exclusively, birds). The vegetation base map was constructed through a combination of ground surveys to identify vegetation classes, image processing, and automated supervised classification procedures. Analysis of the imagery and subsequent supervised classification involved multiple steps to evaluate band quality, band ratios, and vegetation texture and density. Identification of vegetation classes involved collection of cover data throughout the river corridor and subsequent analysis using two-way indicator species analysis (TWINSPAN). Vegetation was classified into six vegetation classes, following the National Vegetation Classification Standard, based on cover dominance. This analysis indicated that total area covered by all vegetation within the CRE was 3,346 ha. Considering the six vegetation classes, the sparse shrub (SS) class accounted for the greatest amount of vegetation (627 ha) followed by Pluchea (PLSE) and Tamarix (TARA) at 494 and 366 ha, respectively. The wetland (WTLD) and Prosopis-Acacia (PRGL) classes both had similar areal cover values (227 and 213 ha, respectively). Baccharis-Salix (BAXX) was the least represented at 94 ha. Accuracy assessment of the supervised classification determined that accuracies varied among vegetation classes from 90% to 49%. Causes for low accuracies were similar spectral signatures among vegetation classes. Fuzzy accuracy assessment improved classification accuracies such that Federal mapping standards of 80% accuracies for all classes were met. The scale used to quantify vegetation adequately meets the needs of the stakeholder group. Increasing the scale to meet the U.S. Geological Survey (USGS)-National Park Service (NPS)National Mapping Program's minimum mapping unit of 0.5 ha is unwarranted because this scale would reduce the resolution of some classes (e.g., seep willow/coyote willow would likely be combined with tamarisk). While this would undoubtedly improve classification accuracies, it would not provide the community-level information about vegetation change that would benefit stakeholders. The identification of vegetation classes should follow NPS mapping approaches to complement the national effort and should incorporate the alternative analysis for community identification that is being incorporated into newer NPS mapping efforts. National Vegetation Classification is followed in this report for association- to formation-level categories. Accuracies could be improved by including more environmental variables such as stage elevation in the classification process and incorporating object-based classification methods. Another approach that may address the heterogeneous species issue and classification is to use spectral mixing analysis to estimate the fractional cover of species within each pixel and better quantify the cover of individual species that compose a cover class. Varying flights to capture vegetation at different times of the year might also help separate some vegetation classes, though the cost may be prohibitive. Lastly, photointerpretation instead of automated mapping could be tried. Photointerpretation would likely not improve accuracies in this case, howev
Spatial modeling and classification of corneal shape.
Marsolo, Keith; Twa, Michael; Bullimore, Mark A; Parthasarathy, Srinivasan
2007-03-01
One of the most promising applications of data mining is in biomedical data used in patient diagnosis. Any method of data analysis intended to support the clinical decision-making process should meet several criteria: it should capture clinically relevant features, be computationally feasible, and provide easily interpretable results. In an initial study, we examined the feasibility of using Zernike polynomials to represent biomedical instrument data in conjunction with a decision tree classifier to distinguish between the diseased and non-diseased eyes. Here, we provide a comprehensive follow-up to that work, examining a second representation, pseudo-Zernike polynomials, to determine whether they provide any increase in classification accuracy. We compare the fidelity of both methods using residual root-mean-square (rms) error and evaluate accuracy using several classifiers: neural networks, C4.5 decision trees, Voting Feature Intervals, and Naïve Bayes. We also examine the effect of several meta-learning strategies: boosting, bagging, and Random Forests (RFs). We present results comparing accuracy as it relates to dataset and transformation resolution over a larger, more challenging, multi-class dataset. They show that classification accuracy is similar for both data transformations, but differs by classifier. We find that the Zernike polynomials provide better feature representation than the pseudo-Zernikes and that the decision trees yield the best balance of classification accuracy and interpretability.
AVHRR channel selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Mapping land cover of large regions often requires processing of satellite images collected from several time periods at many spectral wavelength channels. However, manipulating and processing large amounts of image data increases the complexity and time, and hence the cost, that it takes to produce a land cover map. Very few studies have evaluated the importance of individual Advanced Very High Resolution Radiometer (AVHRR) channels for discriminating cover types, especially the thermal channels (channels 3, 4 and 5). Studies rarely perform a multi-year analysis to determine the impact of inter-annual variability on the classification results. We evaluated 5 years of AVHRR data using combinations of the original AVHRR spectral channels (1-5) to determine which channels are most important for cover type discrimination, yet stabilize inter-annual variability. Particular attention was placed on the channels in the thermal portion of the spectrum. Fourteen cover types over the entire state of Colorado were evaluated using a supervised classification approach on all two-, three-, four- and five-channel combinations for seven AVHRR biweekly composite datasets covering the entire growing season for each of 5 years. Results show that all three of the major portions of the electromagnetic spectrum represented by the AVHRR sensor are required to discriminate cover types effectively and stabilize inter-annual variability. Of the two-channel combinations, channels 1 (red visible) and 2 (near-infrared) had, by far, the highest average overall accuracy (72.2%), yet the inter-annual classification accuracies were highly variable. Including a thermal channel (channel 4) significantly increased the average overall classification accuracy by 5.5% and stabilized interannual variability. Each of the thermal channels gave similar classification accuracies; however, because of the problems in consistently interpreting channel 3 data, either channel 4 or 5 was found to be a more appropriate choice. Substituting the thermal channel with a single elevation layer resulted in equivalent classification accuracies and inter-annual variability.
Accuracy assessment of NLCD 2006 land cover and impervious surface
Wickham, James D.; Stehman, Stephen V.; Gass, Leila; Dewitz, Jon; Fry, Joyce A.; Wade, Timothy G.
2013-01-01
Release of NLCD 2006 provides the first wall-to-wall land-cover change database for the conterminous United States from Landsat Thematic Mapper (TM) data. Accuracy assessment of NLCD 2006 focused on four primary products: 2001 land cover, 2006 land cover, land-cover change between 2001 and 2006, and impervious surface change between 2001 and 2006. The accuracy assessment was conducted by selecting a stratified random sample of pixels with the reference classification interpreted from multi-temporal high resolution digital imagery. The NLCD Level II (16 classes) overall accuracies for the 2001 and 2006 land cover were 79% and 78%, respectively, with Level II user's accuracies exceeding 80% for water, high density urban, all upland forest classes, shrubland, and cropland for both dates. Level I (8 classes) accuracies were 85% for NLCD 2001 and 84% for NLCD 2006. The high overall and user's accuracies for the individual dates translated into high user's accuracies for the 2001–2006 change reporting themes water gain and loss, forest loss, urban gain, and the no-change reporting themes for water, urban, forest, and agriculture. The main factor limiting higher accuracies for the change reporting themes appeared to be difficulty in distinguishing the context of grass. We discuss the need for more research on land-cover change accuracy assessment.
Application of Skylab EREP data for land use management
NASA Technical Reports Server (NTRS)
Simonett, D. S. (Principal Investigator)
1976-01-01
The author has identified the following significant results. The 1.09-1.19 micron band proved to be very valuable for discriminating a variety of land use categories, including agriculture, forest, and urban classes. The 1.55-1.75 micron band proved very useful in combination with the 1.09-1.19 micron band. Misregistration between spectral bands, even by as little as 1/2 pixel, may degrade classification accuracy. Identification accuracy of boundary or border pixels was as much as 13% lower than the accuracy for identifying internal field pixels. The principal conclusion with respect to the S190B camera system is that the higher resolution of the S190B system in comparison to previous space photography (Gemini, Apollo), to the S190A system (Skylab), and to LANDSAT imagery significantly increases the range of additional discrimination achievable.
NASA Astrophysics Data System (ADS)
Hao Chiang, Shou; Valdez, Miguel; Chen, Chi-Farn
2016-06-01
Forest is a very important ecosystem and natural resource for living things. Based on forest inventories, government is able to make decisions to converse, improve and manage forests in a sustainable way. Field work for forestry investigation is difficult and time consuming, because it needs intensive physical labor and the costs are high, especially surveying in remote mountainous regions. A reliable forest inventory can give us a more accurate and timely information to develop new and efficient approaches of forest management. The remote sensing technology has been recently used for forest investigation at a large scale. To produce an informative forest inventory, forest attributes, including tree species are unavoidably required to be considered. In this study the aim is to classify forest tree species in Erdenebulgan County, Huwsgul province in Mongolia, using Maximum Entropy method. The study area is covered by a dense forest which is almost 70% of total territorial extension of Erdenebulgan County and is located in a high mountain region in northern Mongolia. For this study, Landsat satellite imagery and a Digital Elevation Model (DEM) were acquired to perform tree species mapping. The forest tree species inventory map was collected from the Forest Division of the Mongolian Ministry of Nature and Environment as training data and also used as ground truth to perform the accuracy assessment of the tree species classification. Landsat images and DEM were processed for maximum entropy modeling, and this study applied the model with two experiments. The first one is to use Landsat surface reflectance for tree species classification; and the second experiment incorporates terrain variables in addition to the Landsat surface reflectance to perform the tree species classification. All experimental results were compared with the tree species inventory to assess the classification accuracy. Results show that the second one which uses Landsat surface reflectance coupled with terrain variables produced better result, with the higher overall accuracy and kappa coefficient than first experiment. The results indicate that the Maximum Entropy method is an applicable, and to classify tree species using satellite imagery data coupled with terrain information can improve the classification of tree species in the study area.
ERIC Educational Resources Information Center
Guiberson, Mark; Rodriguez, Barbara L.; Dale, Philip S.
2011-01-01
Purpose: The purpose of the current study was to examine the concurrent validity and classification accuracy of 3 parent report measures of language development in Spanish-speaking toddlers. Method: Forty-five Spanish-speaking parents and their 2-year-old children participated. Twenty-three children had expressive language delays (ELDs) as…
ERIC Educational Resources Information Center
Guiberson, Mark; Rodriguez, Barbara L.
2010-01-01
Purpose: To describe the concurrent validity and classification accuracy of 2 Spanish parent surveys of language development, the Spanish Ages and Stages Questionnaire (ASQ; Squires, Potter, & Bricker, 1999) and the Pilot Inventario-III (Pilot INV-III; Guiberson, 2008a). Method: Forty-eight Spanish-speaking parents of preschool-age children…
ERIC Educational Resources Information Center
Zytowski, Donald G.
1972-01-01
Owing to the uncertainty concerning the concurrent validity of the SVIB and the KOIS, a test of accuracy of classification of men in the occupations common to both inventories was undertaken. The results suggest that neither show any less validity than had been shown in separate studies previously. (Author)
ERIC Educational Resources Information Center
Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.
2016-01-01
In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…
Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance
ERIC Educational Resources Information Center
Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James
2017-01-01
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…
Factors Affecting the Item Parameter Estimation and Classification Accuracy of the DINA Model
ERIC Educational Resources Information Center
de la Torre, Jimmy; Hong, Yuan; Deng, Weiling
2010-01-01
To better understand the statistical properties of the deterministic inputs, noisy "and" gate cognitive diagnosis (DINA) model, the impact of several factors on the quality of the item parameter estimates and classification accuracy was investigated. Results of the simulation study indicate that the fully Bayes approach is most accurate when the…
Towards automatic lithological classification from remote sensing data using support vector machines
NASA Astrophysics Data System (ADS)
Yu, Le; Porwal, Alok; Holden, Eun-Jung; Dentith, Michael
2010-05-01
Remote sensing data can be effectively used as a mean to build geological knowledge for poorly mapped terrains. Spectral remote sensing data from space- and air-borne sensors have been widely used to geological mapping, especially in areas of high outcrop density in arid regions. However, spectral remote sensing information by itself cannot be efficiently used for a comprehensive lithological classification of an area due to (1) diagnostic spectral response of a rock within an image pixel is conditioned by several factors including the atmospheric effects, spectral and spatial resolution of the image, sub-pixel level heterogeneity in chemical and mineralogical composition of the rock, presence of soil and vegetation cover; (2) only surface information and is therefore highly sensitive to the noise due to weathering, soil cover, and vegetation. Consequently, for efficient lithological classification, spectral remote sensing data needs to be supplemented with other remote sensing datasets that provide geomorphological and subsurface geological information, such as digital topographic model (DEM) and aeromagnetic data. Each of the datasets contain significant information about geology that, in conjunction, can potentially be used for automated lithological classification using supervised machine learning algorithms. In this study, support vector machine (SVM), which is a kernel-based supervised learning method, was applied to automated lithological classification of a study area in northwestern India using remote sensing data, namely, ASTER, DEM and aeromagnetic data. Several digital image processing techniques were used to produce derivative datasets that contained enhanced information relevant to lithological discrimination. A series of SVMs (trained using k-folder cross-validation with grid search) were tested using various combinations of input datasets selected from among 50 datasets including the original 14 ASTER bands and 36 derivative datasets (including 14 principal component bands, 14 independent component bands, 3 band ratios, 3 DEM derivatives: slope/curvatureroughness and 2 aeromagnetic derivatives: mean and variance of susceptibility) extracted from the ASTER, DEM and aeromagnetic data, in order to determine the optimal inputs that provide the highest classification accuracy. It was found that a combination of ASTER-derived independent components, principal components and band ratios, DEM-derived slope, curvature and roughness, and aeromagnetic-derived mean and variance of magnetic susceptibility provide the highest classification accuracy of 93.4% on independent test samples. A comparison of the classification results of the SVM with those of maximum likelihood (84.9%) and minimum distance (38.4%) classifiers clearly show that the SVM algorithm returns much higher classification accuracy. Therefore, the SVM method can be used to produce quick and reliable geological maps from scarce geological information, which is still the case with many under-developed frontier regions of the world.
NASA Astrophysics Data System (ADS)
Hänsch, Ronny; Hellwich, Olaf
2018-04-01
Random Forests have continuously proven to be one of the most accurate, robust, as well as efficient methods for the supervised classification of images in general and polarimetric synthetic aperture radar data in particular. While the majority of previous work focus on improving classification accuracy, we aim for accelerating the training of the classifier as well as its usage during prediction while maintaining its accuracy. Unlike other approaches we mainly consider algorithmic changes to stay as much as possible independent of platform and programming language. The final model achieves an approximately 60 times faster training and a 500 times faster prediction, while the accuracy is only marginally decreased by roughly 1 %.
How reliable and accurate is the AO/OTA comprehensive classification for adult long-bone fractures?
Meling, Terje; Harboe, Knut; Enoksen, Cathrine H; Aarflot, Morten; Arthursson, Astvaldur J; Søreide, Kjetil
2012-07-01
Reliable classification of fractures is important for treatment allocation and study comparisons. The overall accuracy of scoring applied to a general population of fractures is little known. This study aimed to investigate the accuracy and reliability of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for adult long-bone fractures and identify factors associated with poor coding agreement. Adults (>16 years) with long-bone fractures coded in a Fracture and Dislocation Registry at the Stavanger University Hospital during the fiscal year 2008 were included. An unblinded reference code dataset was generated for the overall accuracy assessment by two experienced orthopedic trauma surgeons. Blinded analysis of intrarater reliability was performed by rescoring and of interrater reliability by recoding of a randomly selected fracture sample. Proportion of agreement (PA) and kappa (κ) statistics are presented. Uni- and multivariate logistic regression analyses of factors predicting accuracy were performed. During the study period, 949 fractures were included and coded by 26 surgeons. For the intrarater analysis, overall agreements were κ = 0.67 (95% confidence interval [CI]: 0.64-0.70) and PA 69%. For interrater assessment, κ = 0.67 (95% CI: 0.62-0.72) and PA 69%. The accuracy of surgeons' blinded recoding was κ = 0.68 (95% CI: 0.65- 0.71) and PA 68%. Fracture type, frequency of the fracture, and segment fractured significantly influenced accuracy whereas the coder's experience did not. Both the reliability and accuracy of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for long-bone fractures ranged from substantial to excellent. Variations in coding accuracy seem to be related more to the fracture itself than the surgeon. Diagnostic study, level I.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.
Mayes, Susan Dickerson; Calhoun, Susan L; Murray, Michael J; Morrow, Jill D; Yurich, Kirsten K L; Cothren, Shiyoko; Purichia, Heather; Bouder, James N
2011-02-01
Little is known about the validity of Gilliam Asperger's Disorder Scale (GADS), although it is widely used. This study of 199 children with high functioning autism or Asperger's disorder, 195 with low functioning autism, and 83 with attention deficit hyperactivity disorder (ADHD) showed high classification accuracy (autism vs. ADHD) for clinicians' GADS Quotients (92%), and somewhat lower accuracy (77%) for parents' Quotients. Both children with high and low functioning autism had clinicians' Quotients (M=99 and 101, respectively) similar to the Asperger's Disorder mean of 100 for the GADS normative sample. Children with high functioning autism scored significantly higher on the cognitive patterns subscale than children with low functioning autism, and the latter had higher scores on the remaining subscales: social interaction, restricted patterns of behavior, and pragmatic skills. Using the clinicians' Quotient and Cognitive Patterns score, 70% of children were correctly identified as having high or low functioning autism or ADHD.
Acquiring Research-grade ALSM Data in the Commercial Marketplace
NASA Astrophysics Data System (ADS)
Haugerud, R. A.; Harding, D. J.; Latypov, D.; Martinez, D.; Routh, S.; Ziegler, J.
2003-12-01
The Puget Sound Lidar Consortium, working with TerraPoint, LLC, has procured a large volume of ALSM (topographic lidar) data for scientific research. Research-grade ALSM data can be characterized by their completeness, density, and accuracy. Complete data include-at a minimum-X, Y, Z, time, and classification (ground, vegetation, structure, blunder) for each laser reflection. Off-nadir angle and return number for multiple returns are also useful. We began with a pulse density of 1/sq m, and after limited experiments still find this density satisfactory in the dense second-growth forests of western Washington. Lower pulse densities would have produced unacceptably limited sampling in forested areas and aliased some topographic features. Higher pulse densities do not produce markedly better topographic models, in part because of limitations of reproducibility between the overlapping survey swaths used to achieve higher density. Our experience in a variety of forest types demonstrates that the fraction of pulses that produce ground returns varies with vegetation cover, laser beam divergence, laser power, and detector sensitivity, but have not quantified this relationship. The most significant operational limits on vertical accuracy of ALSM appear to be instrument calibration and the accuracy with which returns are classified as ground or vegetation. TerraPoint has recently implemented in-situ calibration using overlapping swaths (Latypov and Zosse, 2002, see http://www.terrapoint.com/News_damirACSM_ASPRS2002.html). On the consumer side, we routinely perform a similar overlap analysis to produce maps of relative Z error between swaths; we find that in bare, low-slope regions the in-situ calibration has reduced this internal Z error to 6-10 cm RMSE. Comparison with independent ground control points commonly illuminates inconsistencies in how GPS heights have been reduced to orthometric heights. Once these inconsistencies are resolved, it appears that the internal errors are the bulk of the error of the survey. The error maps suggest that with in-situ calibration, minor time-varying errors with a period of circa 1 sec are the largest remaining source of survey error. For forested terrain, limited ground penetration and errors in return classification can severely limit the accuracy of resulting topographic models. Initial work by Haugerud and Harding demonstrated the feasibility of fully-automatic return classification; however, TerraPoint has found that better results can be obtained more effectively with 3rd-party classification software that allows a mix of automated routines and human intervention. Our relationship has been evolving since early 2000. Important aspects of this relationship include close communication between data producer and consumer, a willingness to learn from each other, significant technical expertise and resources on the consumer side, and continued refinement of achievable, quantitative performance and accuracy specifications. Most recently we have instituted a slope-dependent Z accuracy specification that TerraPoint first developed as a heuristic for surveying mountainous terrain in Switzerland. We are now working on quantifying the internal consistency of topographic models in forested areas, using a variant of overlap analysis, and standards for the spatial distribution of internal errors.
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
NASA Astrophysics Data System (ADS)
Lemma, Hanibal; Frankl, Amaury; Poesen, Jean; Adgo, Enyew; Nyssen, Jan
2017-04-01
Object-oriented image classification has been gaining prominence in the field of remote sensing and provides a valid alternative to the 'traditional' pixel based methods. Recent studies have proven the superiority of the object-based approach. So far, object-oriented land cover classifications have been applied either at limited spatial coverages (ranging 2 to 1091 km2) or by using very high resolution (0.5-16 m) imageries. The main aim of this study is to drive land cover information for large area from Landsat 8 OLI surface reflectance using the Estimation of Scale Parameter (ESP) tool and the object oriented software eCognition. The available land cover map of Lake Tana Basin (Ethiopia) is about 20 years old with a courser spatial scale (1:250,000) and has limited use for environmental modelling and monitoring studies. Up-to-date and basin wide land cover maps are essential to overcome haphazard natural resources management, land degradation and reduced agricultural production. Indeed, object-oriented approach involves image segmentation prior to classification, i.e. adjacent similar pixels are aggregated into segments as long as the heterogeneity in the spectral and spatial domains is minimized. For each segmented object, different attributes (spectral, textural and shape) were calculated and used for in subsequent classification analysis. Moreover, the commonly used error matrix is employed to determine the quality of the land cover map. As a result, the multiresolution segmentation (with parameters of scale=30, shape=0.3 and Compactness=0.7) produces highly homogeneous image objects as it is observed in different sample locations in google earth. Out of the 15,089 km2 area of the basin, cultivated land is dominant (69%) followed by water bodies (21%), grassland (4.8%), forest (3.7%) and shrubs (1.1%). Wetlands, artificial surfaces and bare land cover only about 1% of the basin. The overall classification accuracy is 80% with a Kappa coefficient of 0.75. With regard to individual classes, the classification show higher Producer's and User's accuracy (above 84%) for cultivated land, water bodies and forest, but lower (less than 70%) for shrubs, bare land and grassland. Key words: accuracy assessment, eCognition, Estimation of Scale Parameter, land cover, Landsat 8, remote sensing
A Dirichlet process model for classifying and forecasting epidemic curves
2014-01-01
Background A forecast can be defined as an endeavor to quantitatively estimate a future event or probabilities assigned to a future occurrence. Forecasting stochastic processes such as epidemics is challenging since there are several biological, behavioral, and environmental factors that influence the number of cases observed at each point during an epidemic. However, accurate forecasts of epidemics would impact timely and effective implementation of public health interventions. In this study, we introduce a Dirichlet process (DP) model for classifying and forecasting influenza epidemic curves. Methods The DP model is a nonparametric Bayesian approach that enables the matching of current influenza activity to simulated and historical patterns, identifies epidemic curves different from those observed in the past and enables prediction of the expected epidemic peak time. The method was validated using simulated influenza epidemics from an individual-based model and the accuracy was compared to that of the tree-based classification technique, Random Forest (RF), which has been shown to achieve high accuracy in the early prediction of epidemic curves using a classification approach. We also applied the method to forecasting influenza outbreaks in the United States from 1997–2013 using influenza-like illness (ILI) data from the Centers for Disease Control and Prevention (CDC). Results We made the following observations. First, the DP model performed as well as RF in identifying several of the simulated epidemics. Second, the DP model correctly forecasted the peak time several days in advance for most of the simulated epidemics. Third, the accuracy of identifying epidemics different from those already observed improved with additional data, as expected. Fourth, both methods correctly classified epidemics with higher reproduction numbers (R) with a higher accuracy compared to epidemics with lower R values. Lastly, in the classification of seasonal influenza epidemics based on ILI data from the CDC, the methods’ performance was comparable. Conclusions Although RF requires less computational time compared to the DP model, the algorithm is fully supervised implying that epidemic curves different from those previously observed will always be misclassified. In contrast, the DP model can be unsupervised, semi-supervised or fully supervised. Since both methods have their relative merits, an approach that uses both RF and the DP model could be beneficial. PMID:24405642
Calès, P; Boursier, J; Lebigot, J; de Ledinghen, V; Aubé, C; Hubert, I; Oberti, F
2017-04-01
In chronic hepatitis C, the European Association for the Study of the Liver and the Asociacion Latinoamericana para el Estudio del Higado recommend performing transient elastography plus a blood test to diagnose significant fibrosis; test concordance confirms the diagnosis. To validate this rule and improve it by combining a blood test, FibroMeter (virus second generation, Echosens, Paris, France) and transient elastography (constitutive tests) into a single combined test, as suggested by the American Association for the Study of Liver Diseases and the Infectious Diseases Society of America. A total of 1199 patients were included in an exploratory set (HCV, n = 679) or in two validation sets (HCV ± HIV, HBV, n = 520). Accuracy was mainly evaluated by correct diagnosis rate for severe fibrosis (pathological Metavir F ≥ 3, primary outcome) by classical test scores or a fibrosis classification, reflecting Metavir staging, as a function of test concordance. Score accuracy: there were no significant differences between the blood test (75.7%), elastography (79.1%) and the combined test (79.4%) (P = 0.066); the score accuracy of each test was significantly (P < 0.001) decreased in discordant vs. concordant tests. Classification accuracy: combined test accuracy (91.7%) was significantly (P < 0.001) increased vs. the blood test (84.1%) and elastography (88.2%); accuracy of each constitutive test was significantly (P < 0.001) decreased in discordant vs. concordant tests but not with combined test: 89.0 vs. 92.7% (P = 0.118). Multivariate analysis for accuracy showed an interaction between concordance and fibrosis level: in the 1% of patients with full classification discordance and severe fibrosis, non-invasive tests were unreliable. The advantage of combined test classification was confirmed in the validation sets. The concordance recommendation is validated. A combined test, expressed in classification instead of score, improves this rule and validates the recommendation of a combined test, avoiding 99% of biopsies, and offering precise staging. © 2017 John Wiley & Sons Ltd.
Ex vivo characterization of normal and adenocarcinoma colon samples by Mueller matrix polarimetry.
Ahmad, Iftikhar; Ahmad, Manzoor; Khan, Karim; Ashraf, Sumara; Ahmad, Shakil; Ikram, Masroor
2015-05-01
Mueller matrix polarimetry along with polar decomposition algorithm was employed for the characterization of ex vivo normal and adenocarcinoma human colon tissues by polarized light in the visible spectral range (425-725 nm). Six derived polarization metrics [total diattenuation (DT ), retardance (RT ), depolarization(ΔT ), linear diattenuation (DL), retardance (δ), and depolarization (ΔL)] were compared for normal and adenocarcinoma colon tissue samples. The results show that all six polarimetric properties for adenocarcinoma samples were significantly higher as compared to the normal samples for all wavelengths. The Wilcoxon rank sum test illustrated that total retardance is a good candidate for the discrimination of normal and adenocarcinoma colon samples. Support vector machine classification for normal and adenocarcinoma based on the four polarization properties spectra (ΔT , ΔL, RT ,and δ) yielded 100% accuracy, sensitivity, and specificity, while both DTa nd DL showed 66.6%, 33.3%, and 83.3% accuracy, sensitivity, and specificity, respectively. The combination of polarization analysis and given classification methods provides a framework to distinguish the normal and cancerous tissues.
Discrimination Enhancement with Transient Feature Analysis of a Graphene Chemical Sensor.
Nallon, Eric C; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Li, Qiliang
2016-01-19
A graphene chemical sensor is subjected to a set of structurally and chemically similar hydrocarbon compounds consisting of toluene, o-xylene, p-xylene, and mesitylene. The fractional change in resistance of the sensor upon exposure to these compounds exhibits a similar response magnitude among compounds, whereas large variation is observed within repetitions for each compound, causing a response overlap. Therefore, traditional features depending on maximum response change will cause confusion during further discrimination and classification analysis. More robust features that are less sensitive to concentration, sampling, and drift variability would provide higher quality information. In this work, we have explored the advantage of using transient-based exponential fitting coefficients to enhance the discrimination of similar compounds. The advantages of such feature analysis to discriminate each compound is evaluated using principle component analysis (PCA). In addition, machine learning-based classification algorithms were used to compare the prediction accuracies when using fitting coefficients as features. The additional features greatly enhanced the discrimination between compounds while performing PCA and also improved the prediction accuracy by 34% when using linear discrimination analysis.
Ullah, Saleem; Groen, Thomas A; Schlerf, Martin; Skidmore, Andrew K; Nieuwenhuis, Willem; Vaiphasa, Chaichoke
2012-01-01
Genetic variation between various plant species determines differences in their physio-chemical makeup and ultimately in their hyperspectral emissivity signatures. The hyperspectral emissivity signatures, on the one hand, account for the subtle physio-chemical changes in the vegetation, but on the other hand, highlight the problem of high dimensionality. The aim of this paper is to investigate the performance of genetic algorithms coupled with the spectral angle mapper (SAM) to identify a meaningful subset of wavebands sensitive enough to discriminate thirteen broadleaved vegetation species from the laboratory measured hyperspectral emissivities. The performance was evaluated using an overall classification accuracy and Jeffries Matusita distance. For the multiple plant species, the targeted bands based on genetic algorithms resulted in a high overall classification accuracy (90%). Concentrating on the pairwise comparison results, the selected wavebands based on genetic algorithms resulted in higher Jeffries Matusita (J-M) distances than randomly selected wavebands did. This study concludes that targeted wavebands from leaf emissivity spectra are able to discriminate vegetation species.
A spectral-knowledge-based approach for urban land-cover discrimination
NASA Technical Reports Server (NTRS)
Wharton, Stephen W.
1987-01-01
A prototype expert system was developed to demonstrate the feasibility of classifying multispectral remotely sensed data on the basis of spectral knowledge. The spectral expert was developed and tested with Thematic Mapper Simulator (TMS) data having eight spectral bands and a spatial resolution of 5 m. A knowledge base was developed that describes the target categories in terms of characteristic spectral relationships. The knowledge base was developed under the following assumptions: the data are calibrated to ground reflectance, the area is well illuminated, the pixels are dominated by a single category, and the target categories can be recognized without the use of spatial knowledge. Classification decisions are made on the basis of convergent evidence as derived from applying the spectral rules to a multiple spatial resolution representation of the image. The spectral expert achieved an accuracy of 80-percent correct or higher in recognizing 11 spectral categories in TMS data for the washington, DC, area. Classification performance can be expected to decrease for data that do not satisfy the above assumptions as illustrated by the 63-percent accuracy for 30-m resolution Thematic Mapper data.
Cheng, Jiao; Jin, Jing; Wang, Xingyu
2017-01-01
Brain-computer interface (BCI) systems allow users to communicate with the external world by recognizing the brain activity without the assistance of the peripheral motor nervous system. P300-based BCI is one of the most common used BCI systems that can obtain high classification accuracy and information transfer rate (ITR). Face stimuli can result in large event-related potentials and improve the performance of P300-based BCI. However, previous studies on face stimuli focused mainly on the effect of various face types (i.e., face expression, face familiarity, and multifaces) on the BCI performance. Studies on the influence of face transparency differences are scarce. Therefore, we investigated the effect of semitransparent face pattern (STF-P) (the subject could see the target character when the stimuli were flashed) and traditional face pattern (F-P) (the subject could not see the target character when the stimuli were flashed) on the BCI performance from the transparency perspective. Results showed that STF-P obtained significantly higher classification accuracy and ITR than those of F-P ( p < 0.05).
Automatic detection of confusion in elderly users of a web-based health instruction video.
Postma-Nilsenová, Marie; Postma, Eric; Tates, Kiek
2015-06-01
Because of cognitive limitations and lower health literacy, many elderly patients have difficulty understanding verbal medical instructions. Automatic detection of facial movements provides a nonintrusive basis for building technological tools supporting confusion detection in healthcare delivery applications on the Internet. Twenty-four elderly participants (70-90 years old) were recorded while watching Web-based health instruction videos involving easy and complex medical terminology. Relevant fragments of the participants' facial expressions were rated by 40 medical students for perceived level of confusion and analyzed with automatic software for facial movement recognition. A computer classification of the automatically detected facial features performed more accurately and with a higher sensitivity than the human observers (automatic detection and classification, 64% accuracy, 0.64 sensitivity; human observers, 41% accuracy, 0.43 sensitivity). A drill-down analysis of cues to confusion indicated the importance of the eye and eyebrow region. Confusion caused by misunderstanding of medical terminology is signaled by facial cues that can be automatically detected with currently available facial expression detection technology. The findings are relevant for the development of Web-based services for healthcare consumers.
NASA Technical Reports Server (NTRS)
Ackleson, S. G.; Klemas, V.
1985-01-01
LANDSAT Thematic Mapper (TM) and Multispectral Scanner (MSS) imagery generated simultaneously over Guinea Marsh, Virginia, are assessed in the ability to detect submerged aquatic, bottom-adhering plant canopies (SAV). An unsupervised clustering algorithm is applied to both image types and the resulting classifications compared to SAV distributions derived from color aerial photography. Class confidence and accuracy are first computed for all water areas and then only shallow areas where water depth is less than 6 feet. In both the TM and MSS imagery, masking water areas deeper than 6 ft. resulted in greater classification accuracy at confidence levels greater than 50%. Both systems perform poorly in detecting SAV with crown cover densities less than 70%. On the basis of the spectral resolution, radiometric sensitivity, and location of visible bands, TM imagery does not offer a significant advantage over MSS data for detecting SAV in Lower Chesapeake Bay. However, because the TM imagery represents a higher spatial resolution, smaller SAV canopies may be detected than is possible with MSS data.
Zhang, Ming-Huan; Ma, Jun-Shan; Shen, Ying; Chen, Ying
2016-09-01
This study aimed to investigate the optimal support vector machines (SVM)-based classifier of duchenne muscular dystrophy (DMD) magnetic resonance imaging (MRI) images. T1-weighted (T1W) and T2-weighted (T2W) images of the 15 boys with DMD and 15 normal controls were obtained. Textural features of the images were extracted and wavelet decomposed, and then, principal features were selected. Scale transform was then performed for MRI images. Afterward, SVM-based classifiers of MRI images were analyzed based on the radical basis function and decomposition levels. The cost (C) parameter and kernel parameter [Formula: see text] were used for classification. Then, the optimal SVM-based classifier, expressed as [Formula: see text]), was identified by performance evaluation (sensitivity, specificity and accuracy). Eight of 12 textural features were selected as principal features (eigenvalues [Formula: see text]). The 16 SVM-based classifiers were obtained using combination of (C, [Formula: see text]), and those with lower C and [Formula: see text] values showed higher performances, especially classifier of [Formula: see text]). The SVM-based classifiers of T1W images showed higher performance than T1W images at the same decomposition level. The T1W images in classifier of [Formula: see text]) at level 2 decomposition showed the highest performance of all, and its overall correct sensitivity, specificity, and accuracy reached 96.9, 97.3, and 97.1 %, respectively. The T1W images in SVM-based classifier [Formula: see text] at level 2 decomposition showed the highest performance of all, demonstrating that it was the optimal classification for the diagnosis of DMD.
Simulation of seagrass bed mapping by satellite images based on the radiative transfer model
NASA Astrophysics Data System (ADS)
Sagawa, Tatsuyuki; Komatsu, Teruhisa
2015-06-01
Seagrass and seaweed beds play important roles in coastal marine ecosystems. They are food sources and habitats for many marine organisms, and influence the physical, chemical, and biological environment. They are sensitive to human impacts such as reclamation and pollution. Therefore, their management and preservation are necessary for a healthy coastal environment. Satellite remote sensing is a useful tool for mapping and monitoring seagrass beds. The efficiency of seagrass mapping, seagrass bed classification in particular, has been evaluated by mapping accuracy using an error matrix. However, mapping accuracies are influenced by coastal environments such as seawater transparency, bathymetry, and substrate type. Coastal management requires sufficient accuracy and an understanding of mapping limitations for monitoring coastal habitats including seagrass beds. Previous studies are mainly based on case studies in specific regions and seasons. Extensive data are required to generalise assessments of classification accuracy from case studies, which has proven difficult. This study aims to build a simulator based on a radiative transfer model to produce modelled satellite images and assess the visual detectability of seagrass beds under different transparencies and seagrass coverages, as well as to examine mapping limitations and classification accuracy. Our simulations led to the development of a model of water transparency and the mapping of depth limits and indicated the possibility for seagrass density mapping under certain ideal conditions. The results show that modelling satellite images is useful in evaluating the accuracy of classification and that establishing seagrass bed monitoring by remote sensing is a reliable tool.
Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.
2013-01-01
Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of over fitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters, however does not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based frame work for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. These tools in integration can provide an efficient approach for getting information about improvements in the classification accuracy and helps in refining training data set. We have illustrated framework for investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images. Techniki klasyfikacji są szeroko wykorzystywane w różnych aplikacjach teledetekcyjnych, w których poprawna klasyfikacja pikseli stanowi poważne wyzwanie. Podejście tradycyjne wykorzystujące różnego rodzaju parametry statystyczne nie zapewnia efektywnej wizualizacji. Wielce obiecujące wydaje się zastosowanie do klasyfikacji narzędzi do eksploracji danych. W artykule zaproponowano podejście bazujące na wizualnej analizie eksploracyjnej, wykorzystujące takie narzędzia typu open source jak WEKA i PREFUSE. Wymienione narzędzia ułatwiają korektę pół treningowych i efektywnie wspomagają poprawę dokładności klasyfikacji. Działanie metody sprawdzono wykorzystując wpływ różnych metod resampling na zachowanie dokładności radiometrycznej i uzyskując najlepsze wyniki dla metody bilinearnej (BL).
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.
1982-01-01
An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.
NASA Technical Reports Server (NTRS)
Lillesand, T. M.; Werth, L. F. (Principal Investigator)
1980-01-01
A 25% improvement in average classification accuracy was realized by processing double-date vs. single-date data. Under the spectrally and spatially complex site conditions characterizing the geographical area used, further improvement in wetland classification accuracy is apparently precluded by the spectral and spatial resolution restrictions of the LANDSAT MSS. Full scene analysis of scanning densitometer data extracted from scale infrared photography failed to permit discrimination of many wetland and nonwetland cover types. When classification of photographic data was limited to wetland areas only, much more detailed and accurate classification could be made. The integration of conventional image interpretation (to simply delineate wetland boundaries) and machine assisted classification (to discriminate among cover types present within the wetland areas) appears to warrant further research to study the feasibility and cost of extending this methodology over a large area using LANDSAT and/or small scale photography.
Exercise recognition for Kinect-based telerehabilitation.
Antón, D; Goñi, A; Illarramendi, A
2015-01-01
An aging population and people's higher survival to diseases and traumas that leave physical consequences are challenging aspects in the context of an efficient health management. This is why telerehabilitation systems are being developed, to allow monitoring and support of physiotherapy sessions at home, which could reduce healthcare costs while also improving the quality of life of the users. Our goal is the development of a Kinect-based algorithm that provides a very accurate real-time monitoring of physical rehabilitation exercises and that also provides a friendly interface oriented both to users and physiotherapists. The two main constituents of our algorithm are the posture classification method and the exercises recognition method. The exercises consist of series of movements. Each movement is composed of an initial posture, a final posture and the angular trajectories of the limbs involved in the movement. The algorithm was designed and tested with datasets of real movements performed by volunteers. We also explain in the paper how we obtained the optimal values for the trade-off values for posture and trajectory recognition. Two relevant aspects of the algorithm were evaluated in our tests, classification accuracy and real-time data processing. We achieved 91.9% accuracy in posture classification and 93.75% accuracy in trajectory recognition. We also checked whether the algorithm was able to process the data in real-time. We found that our algorithm could process more than 20,000 postures per second and all the required trajectory data-series in real-time, which in practice guarantees no perceptible delays. Later on, we carried out two clinical trials with real patients that suffered shoulder disorders. We obtained an exercise monitoring accuracy of 95.16%. We present an exercise recognition algorithm that handles the data provided by Kinect efficiently. The algorithm has been validated in a real scenario where we have verified its suitability. Moreover, we have received a positive feedback from both users and the physiotherapists who took part in the tests.
Instruction-matrix-based genetic programming.
Li, Gang; Wang, Jin Feng; Lee, Kin Hong; Leung, Kwong-Sak
2008-08-01
In genetic programming (GP), evolving tree nodes separately would reduce the huge solution space. However, tree nodes are highly interdependent with respect to their fitness. In this paper, we propose a new GP framework, namely, instruction-matrix (IM)-based GP (IMGP), to handle their interactions. IMGP maintains an IM to evolve tree nodes and subtrees separately. IMGP extracts program trees from an IM and updates the IM with the information of the extracted program trees. As the IM actually keeps most of the information of the schemata of GP and evolves the schemata directly, IMGP is effective and efficient. Our experimental results on benchmark problems have verified that IMGP is not only better than those of canonical GP in terms of the qualities of the solutions and the number of program evaluations, but they are also better than some of the related GP algorithms. IMGP can also be used to evolve programs for classification problems. The classifiers obtained have higher classification accuracies than four other GP classification algorithms on four benchmark classification problems. The testing errors are also comparable to or better than those obtained with well-known classifiers. Furthermore, an extended version, called condition matrix for rule learning, has been used successfully to handle multiclass classification problems.
Bai, Ou; Lin, Peter; Vorbach, Sherry; Li, Jiang; Furlani, Steve; Hallett, Mark
2007-12-01
To explore effective combinations of computational methods for the prediction of movement intention preceding the production of self-paced right and left hand movements from single trial scalp electroencephalogram (EEG). Twelve naïve subjects performed self-paced movements consisting of three key strokes with either hand. EEG was recorded from 128 channels. The exploration was performed offline on single trial EEG data. We proposed that a successful computational procedure for classification would consist of spatial filtering, temporal filtering, feature selection, and pattern classification. A systematic investigation was performed with combinations of spatial filtering using principal component analysis (PCA), independent component analysis (ICA), common spatial patterns analysis (CSP), and surface Laplacian derivation (SLD); temporal filtering using power spectral density estimation (PSD) and discrete wavelet transform (DWT); pattern classification using linear Mahalanobis distance classifier (LMD), quadratic Mahalanobis distance classifier (QMD), Bayesian classifier (BSC), multi-layer perceptron neural network (MLP), probabilistic neural network (PNN), and support vector machine (SVM). A robust multivariate feature selection strategy using a genetic algorithm was employed. The combinations of spatial filtering using ICA and SLD, temporal filtering using PSD and DWT, and classification methods using LMD, QMD, BSC and SVM provided higher performance than those of other combinations. Utilizing one of the better combinations of ICA, PSD and SVM, the discrimination accuracy was as high as 75%. Further feature analysis showed that beta band EEG activity of the channels over right sensorimotor cortex was most appropriate for discrimination of right and left hand movement intention. Effective combinations of computational methods provide possible classification of human movement intention from single trial EEG. Such a method could be the basis for a potential brain-computer interface based on human natural movement, which might reduce the requirement of long-term training. Effective combinations of computational methods can classify human movement intention from single trial EEG with reasonable accuracy.
Hybrid analysis for indicating patients with breast cancer using temperature time series.
Silva, Lincoln F; Santos, Alair Augusto S M D; Bravo, Renato S; Silva, Aristófanes C; Muchaluat-Saade, Débora C; Conci, Aura
2016-07-01
Breast cancer is the most common cancer among women worldwide. Diagnosis and treatment in early stages increase cure chances. The temperature of cancerous tissue is generally higher than that of healthy surrounding tissues, making thermography an option to be considered in screening strategies of this cancer type. This paper proposes a hybrid methodology for analyzing dynamic infrared thermography in order to indicate patients with risk of breast cancer, using unsupervised and supervised machine learning techniques, which characterizes the methodology as hybrid. The dynamic infrared thermography monitors or quantitatively measures temperature changes on the examined surface, after a thermal stress. In the dynamic infrared thermography execution, a sequence of breast thermograms is generated. In the proposed methodology, this sequence is processed and analyzed by several techniques. First, the region of the breasts is segmented and the thermograms of the sequence are registered. Then, temperature time series are built and the k-means algorithm is applied on these series using various values of k. Clustering formed by k-means algorithm, for each k value, is evaluated using clustering validation indices, generating values treated as features in the classification model construction step. A data mining tool was used to solve the combined algorithm selection and hyperparameter optimization (CASH) problem in classification tasks. Besides the classification algorithm recommended by the data mining tool, classifiers based on Bayesian networks, neural networks, decision rules and decision tree were executed on the data set used for evaluation. Test results support that the proposed analysis methodology is able to indicate patients with breast cancer. Among 39 tested classification algorithms, K-Star and Bayes Net presented 100% classification accuracy. Furthermore, among the Bayes Net, multi-layer perceptron, decision table and random forest classification algorithms, an average accuracy of 95.38% was obtained. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Characterization and delineation of caribou habitat on Unimak Island using remote sensing techniques
NASA Astrophysics Data System (ADS)
Atkinson, Brain M.
The assessment of herbivore habitat quality is traditionally based on quantifying the forages available to the animal across their home range through ground-based techniques. While these methods are highly accurate, they can be time-consuming and highly expensive, especially for herbivores that occupy vast spatial landscapes. The Unimak Island caribou herd has been decreasing in the last decade at rates that have prompted discussion of management intervention. Frequent inclement weather in this region of Alaska has provided for little opportunity to study the caribou forage habitat on Unimak Island. The overall objectives of this study were two-fold 1) to assess the feasibility of using high-resolution color and near-infrared aerial imagery to map the forage distribution of caribou habitat on Unimak Island and 2) to assess the use of a new high-resolution multispectral satellite imagery platform, RapidEye, and use of the "red-edge" spectral band on vegetation classification accuracy. Maximum likelihood classification algorithms were used to create land cover maps in aerial and satellite imagery. Accuracy assessments and transformed divergence values were produced to assess vegetative spectral information and classification accuracy. By using RapidEye and aerial digital imagery in a hierarchical supervised classification technique, we were able to produce a high resolution land cover map of Unimak Island. We obtained overall accuracy rates of 71.4 percent which are comparable to other land cover maps using RapidEye imagery. The "red-edge" spectral band included in the RapidEye imagery provides additional spectral information that allows for a more accurate overall classification, raising overall accuracy 5.2 percent.
NASA Astrophysics Data System (ADS)
Susanti, Yuliana; Zukhronah, Etik; Pratiwi, Hasih; Respatiwulan; Sri Sulistijowati, H.
2017-11-01
To achieve food resilience in Indonesia, food diversification by exploring potentials of local food is required. Corn is one of alternating staple food of Javanese society. For that reason, corn production needs to be improved by considering the influencing factors. CHAID and CRT are methods of data mining which can be used to classify the influencing variables. The present study seeks to dig up information on the potentials of local food availability of corn in regencies and cities in Java Island. CHAID analysis yields four classifications with accuracy of 78.8%, while CRT analysis yields seven classifications with accuracy of 79.6%.
Deep multi-scale convolutional neural network for hyperspectral image classification
NASA Astrophysics Data System (ADS)
Zhang, Feng-zhe; Yang, Xia
2018-04-01
In this paper, we proposed a multi-scale convolutional neural network for hyperspectral image classification task. Firstly, compared with conventional convolution, we utilize multi-scale convolutions, which possess larger respective fields, to extract spectral features of hyperspectral image. We design a deep neural network with a multi-scale convolution layer which contains 3 different convolution kernel sizes. Secondly, to avoid overfitting of deep neural network, dropout is utilized, which randomly sleeps neurons, contributing to improve the classification accuracy a bit. In addition, new skills like ReLU in deep learning is utilized in this paper. We conduct experiments on University of Pavia and Salinas datasets, and obtained better classification accuracy compared with other methods.
Rey, Sergio J.; Stephens, Philip A.; Laura, Jason R.
2017-01-01
Large data contexts present a number of challenges to optimal choropleth map classifiers. Application of optimal classifiers to a sample of the attribute space is one proposed solution. The properties of alternative sampling-based classification methods are examined through a series of Monte Carlo simulations. The impacts of spatial autocorrelation, number of desired classes, and form of sampling are shown to have significant impacts on the accuracy of map classifications. Tradeoffs between improved speed of the sampling approaches and loss of accuracy are also considered. The results suggest the possibility of guiding the choice of classification scheme as a function of the properties of large data sets.
A new self-report inventory of dyslexia for students: criterion and construct validity.
Tamboer, Peter; Vorst, Harrie C M
2015-02-01
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. All together, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Less than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.
Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...
Using New Models to Analyze Complex Regularities of the World: Commentary on Musso et al. (2013)
ERIC Educational Resources Information Center
Nokelainen, Petri; Silander, Tomi
2014-01-01
This commentary to the recent article by Musso et al. (2013) discusses issues related to model fitting, comparison of classification accuracy of generative and discriminative models, and two (or more) cultures of data modeling. We start by questioning the extremely high classification accuracy with an empirical data from a complex domain. There is…
ERIC Educational Resources Information Center
Ball, Carrie R.; O'Connor, Edward
2016-01-01
This study examined the predictive validity and classification accuracy of two commonly used universal screening measures relative to a statewide achievement test. Results indicated that second-grade performance on oral reading fluency and the Measures of Academic Progress (MAP), together with special education status, explained 68% of the…
Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2013-01-01
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
NASA Astrophysics Data System (ADS)
Zhang, Zhiming; de Wulf, Robert R.; van Coillie, Frieke M. B.; Verbeke, Lieven P. C.; de Clercq, Eva M.; Ou, Xiaokun
2011-01-01
Mapping of vegetation using remote sensing in mountainous areas is considerably hampered by topographic effects on the spectral response pattern. A variety of topographic normalization techniques have been proposed to correct these illumination effects due to topography. The purpose of this study was to compare six different topographic normalization methods (Cosine correction, Minnaert correction, C-correction, Sun-canopy-sensor correction, two-stage topographic normalization, and slope matching technique) for their effectiveness in enhancing vegetation classification in mountainous environments. Since most of the vegetation classes in the rugged terrain of the Lancang Watershed (China) did not feature a normal distribution, artificial neural networks (ANNs) were employed as a classifier. Comparing the ANN classifications, none of the topographic correction methods could significantly improve ETM+ image classification overall accuracy. Nevertheless, at the class level, the accuracy of pine forest could be increased by using topographically corrected images. On the contrary, oak forest and mixed forest accuracies were significantly decreased by using corrected images. The results also showed that none of the topographic normalization strategies was satisfactorily able to correct for the topographic effects in severely shadowed areas.
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide although a significant proportion of it can be cured if it is detected early. In recent decades, technology called microarray takes an important role in the diagnosis of cancer. By using data mining technique, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. The characteristic of microarray data is small sample but it has huge dimension. Since that, there is a challenge for researcher to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the usage of Principal Component Analysis (PCA) as a dimension reduction method along with Support Vector Method (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation and then evaluation and analysis conducted on term of both accuracy and running time. The result showed that the scheme can obtained 100% accuracy for Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In term of running time, PCA greatly reduced the running time for every data sets.
Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong
2018-01-01
The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.
Gesteme-free context-aware adaptation of robot behavior in human-robot cooperation.
Nessi, Federico; Beretta, Elisa; Gatti, Cecilia; Ferrigno, Giancarlo; De Momi, Elena
2016-11-01
Cooperative robotics is receiving greater acceptance because the typical advantages provided by manipulators are combined with an intuitive usage. In particular, hands-on robotics may benefit from the adaptation of the assistant behavior with respect to the activity currently performed by the user. A fast and reliable classification of human activities is required, as well as strategies to smoothly modify the control of the manipulator. In this scenario, gesteme-based motion classification is inadequate because it needs the observation of a wide signal percentage and the definition of a rich vocabulary. In this work, a system able to recognize the user's current activity without a vocabulary of gestemes, and to accordingly adapt the manipulator's dynamic behavior is presented. An underlying stochastic model fits variations in the user's guidance forces and the resulting trajectories of the manipulator's end-effector with a set of Gaussian distribution. The high-level switching between these distributions is captured with hidden Markov models. The dynamic of the KUKA light-weight robot, a torque-controlled manipulator, is modified with respect to the classified activity using sigmoidal-shaped functions. The presented system is validated over a pool of 12 näive users in a scenario that addresses surgical targeting tasks on soft tissue. The robot's assistance is adapted in order to obtain a stiff behavior during activities that require critical accuracy constraint, and higher compliance during wide movements. Both the ability to provide the correct classification at each moment (sample accuracy) and the capability of correctly identify the correct sequence of activity (sequence accuracy) were evaluated. The proposed classifier is fast and accurate in all the experiments conducted (80% sample accuracy after the observation of ∼450ms of signal). Moreover, the ability of recognize the correct sequence of activities, without unwanted transitions is guaranteed (sequence accuracy ∼90% when computed far away from user desired transitions). Finally, the proposed activity-based adaptation of the robot's dynamic does not lead to a not smooth behavior (high smoothness, i.e. normalized jerk score <0.01). The provided system is able to dynamic assist the operator during cooperation in the presented scenario. Copyright © 2016 Elsevier B.V. All rights reserved.
Application of random forests methods to diabetic retinopathy classification analyses.
Casanova, Ramon; Saldana, Santiago; Chew, Emily Y; Danis, Ronald P; Greven, Craig M; Ambrosius, Walter T
2014-01-01
Diabetic retinopathy (DR) is one of the leading causes of blindness in the United States and world-wide. DR is a silent disease that may go unnoticed until it is too late for effective treatment. Therefore, early detection could improve the chances of therapeutic interventions that would alleviate its effects. Graded fundus photography and systemic data from 3443 ACCORD-Eye Study participants were used to estimate Random Forest (RF) and logistic regression classifiers. We studied the impact of sample size on classifier performance and the possibility of using RF generated class conditional probabilities as metrics describing DR risk. RF measures of variable importance are used to detect factors that affect classification performance. Both types of data were informative when discriminating participants with or without DR. RF based models produced much higher classification accuracy than those based on logistic regression. Combining both types of data did not increase accuracy but did increase statistical discrimination of healthy participants who subsequently did or did not have DR events during four years of follow-up. RF variable importance criteria revealed that microaneurysms counts in both eyes seemed to play the most important role in discrimination among the graded fundus variables, while the number of medicines and diabetes duration were the most relevant among the systemic variables. We have introduced RF methods to DR classification analyses based on fundus photography data. In addition, we propose an approach to DR risk assessment based on metrics derived from graded fundus photography and systemic data. Our results suggest that RF methods could be a valuable tool to diagnose DR diagnosis and evaluate its progression.
Using complex networks for text classification: Discriminating informative and imaginative documents
NASA Astrophysics Data System (ADS)
de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.
2016-01-01
Statistical methods have been widely employed in recent years to grasp many language properties. The application of such techniques have allowed an improvement of several linguistic applications, such as machine translation and document classification. In the latter, many approaches have emphasised the semantical content of texts, as is the case of bag-of-word language models. These approaches have certainly yielded reasonable performance. However, some potential features such as the structural organization of texts have been used only in a few studies. In this context, we probe how features derived from textual structure analysis can be effectively employed in a classification task. More specifically, we performed a supervised classification aiming at discriminating informative from imaginative documents. Using a networked model that describes the local topological/dynamical properties of function words, we achieved an accuracy rate of up to 95%, which is much higher than similar networked approaches. A systematic analysis of feature relevance revealed that symmetry and accessibility measurements are among the most prominent network measurements. Our results suggest that these measurements could be used in related language applications, as they play a complementary role in characterising texts.