Variance approximations for assessments of classification accuracy
R. L. Czaplewski
1994-01-01
Variance approximations are derived for the weighted and unweighted kappa statistics, the conditional kappa statistic, and conditional probabilities. These statistics are useful to assess classification accuracy, such as accuracy of remotely sensed classifications in thematic maps when compared to a sample of reference classifications made in the field. Published...
[Accuracy improvement of spectral classification of crop using microwave backscatter data].
Jia, Kun; Li, Qiang-Zi; Tian, Yi-Chen; Wu, Bing-Fang; Zhang, Fei-Fei; Meng, Ji-Hua
2011-02-01
In the present study, VV polarization microwave backscatter data used for improving accuracies of spectral classification of crop is investigated. Classification accuracy using different classifiers based on the fusion data of HJ satellite multi-spectral and Envisat ASAR VV backscatter data are compared. The results indicate that fusion data can take full advantage of spectral information of HJ multi-spectral data and the structure sensitivity feature of ASAR VV polarization data. The fusion data enlarges the spectral difference among different classifications and improves crop classification accuracy. The classification accuracy using fusion data can be increased by 5 percent compared to the single HJ data. Furthermore, ASAR VV polarization data is sensitive to non-agrarian area of planted field, and VV polarization data joined classification can effectively distinguish the field border. VV polarization data associating with multi-spectral data used in crop classification enlarges the application of satellite data and has the potential of spread in the domain of agriculture.
Ramsey, Elijah W.; Nelson, Gene A.; Sapkota, Sijan
1998-01-01
A progressive classification of a marsh and forest system using Landsat Thematic Mapper (TM), color infrared (CIR) photograph, and ERS-1 synthetic aperture radar (SAR) data improved classification accuracy when compared to classification using solely TM reflective band data. The classification resulted in a detailed identification of differences within a nearly monotypic black needlerush marsh. Accuracy percentages of these classes were surprisingly high given the complexities of classification. The detailed classification resulted in a more accurate portrayal of the marsh transgressive sequence than was obtainable with TM data alone. Individual sensor contribution to the improved classification was compared to that using only the six reflective TM bands. Individually, the green reflective CIR and SAR data identified broad categories of water, marsh, and forest. In combination with TM, SAR and the green CIR band each improved overall accuracy by about 3% and 15% respectively. The SAR data improved the TM classification accuracy mostly in the marsh classes. The green CIR data also improved the marsh classification accuracy and accuracies in some water classes. The final combination of all sensor data improved almost all class accuracies from 2% to 70% with an overall improvement of about 20% over TM data alone. Not only was the identification of vegetation types improved, but the spatial detail of the classification approached 10 m in some areas.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens Hyper spectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such Hyper spectral data provides a challenge to the current technique for analyzing data. Conventional classification methods may not be useful without dimension reduction pre-processing. So dimension reduction has become a significant part of Hyper spectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using Wavelet Decomposition could be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture the polynomial trends while Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets are compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction has more separate classes and yields better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, it is found that the performance of classification of Daubechies wavelets is better as compared to Haar wavelet while Daubechies takes more time compare to Haar wavelet. The experimental results demonstrate the classification system consistently provides over 84% classification accuracy.
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively.
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classifying patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, were compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was in the average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not outperform significantly the conventional feature ranking methods. Among conventional methods, some of them slightly performed better than others, although the choice of a suitable technique is dependent on the computational complexity and accuracy requirements of the user.
Comparing ecoregional classifications for natural areas management in the Klamath Region, USA
Sarr, Daniel A.; Duff, Andrew; Dinger, Eric C.; Shafer, Sarah L.; Wing, Michael; Seavy, Nathaniel E.; Alexander, John D.
2015-01-01
We compared three existing ecoregional classification schemes (Bailey, Omernik, and World Wildlife Fund) with two derived schemes (Omernik Revised and Climate Zones) to explore their effectiveness in explaining species distributions and to better understand natural resource geography in the Klamath Region, USA. We analyzed presence/absence data derived from digital distribution maps for trees, amphibians, large mammals, small mammals, migrant birds, and resident birds using three statistical analyses of classification accuracy (Analysis of Similarity, Canonical Analysis of Principal Coordinates, and Classification Strength). The classifications were roughly comparable in classification accuracy, with Omernik Revised showing the best overall performance. Trees showed the strongest fidelity to the classifications, and large mammals showed the weakest fidelity. We discuss the implications for regional biogeography and describe how intermediate resolution ecoregional classifications may be appropriate for use as natural areas management domains.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1984-01-01
An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was to twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5-DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but a time-consuming and tedious work was performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages performed by radiologists is depended on their experience only, so the computer-aided technology is very important to aid with the diagnosis accuracy. In this study, to improve the performance of tumor detection, we investigated comparative approach of different segmentation techniques and selected the best one by comparing their segmentation score. Further, to improve the classification accuracy, the genetic algorithm is employed for the automatic classification of tumor stage. The decision of classification stage is supported by extracting relevant features and area calculation. The experimental results of proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93 demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average of 93.79% dice similarity index coefficient, which indicates better overlap between the automated extracted tumor regions with manually extracted tumor region by radiologists.
AVHRR composite period selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Multitemporal satellite image datasets provide valuable information on the phenological characteristics of vegetation, thereby significantly increasing the accuracy of cover type classifications compared to single date classifications. However, the processing of these datasets can become very complex when dealing with multitemporal data combined with multispectral data. Advanced Very High Resolution Radiometer (AVHRR) biweekly composite data are commonly used to classify land cover over large regions. Selecting a subset of these biweekly composite periods may be required to reduce the complexity and cost of land cover mapping. The objective of our research was to evaluate the effect of reducing the number of composite periods and altering the spacing of those composite periods on classification accuracy. Because inter-annual variability can have a major impact on classification results, 5 years of AVHRR data were evaluated. AVHRR biweekly composite images for spectral channels 1-4 (visible, near-infrared and two thermal bands) covering the entire growing season were used to classify 14 cover types over the entire state of Colorado for each of five different years. A supervised classification method was applied to maintain consistent procedures for each case tested. Results indicate that the number of composite periods can be halved-reduced from 14 composite dates to seven composite dates-without significantly reducing overall classification accuracy (80.4% Kappa accuracy for the 14-composite data-set as compared to 80.0% for a seven-composite dataset). At least seven composite periods were required to ensure the classification accuracy was not affected by inter-annual variability due to climate fluctuations. Concentrating more composites near the beginning and end of the growing season, as compared to using evenly spaced time periods, consistently produced slightly higher classification values over the 5 years tested (average Kappa) of 80.3% for the heavy early/late case as compared to 79.0% for the alternate dataset case).
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E; Moran, Emilio
2008-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E.; Moran, Emilio
2009-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin. PMID:19789716
Belgiu, Mariana; Dr Guţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much efforts have been devoted towards building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule-base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
Porras-Alfaro, Andrea; Liu, Kuan-Liang; Kuske, Cheryl R; Xie, Gary
2014-02-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5' section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets.
Liu, Kuan-Liang; Kuske, Cheryl R.
2014-01-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5′ section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets. PMID:24242255
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Yang, Xiaoyan; Chen, Longgao; Li, Yingkui; Xi, Wenjia; Chen, Longqian
2015-07-01
Land use/land cover (LULC) inventory provides an important dataset in regional planning and environmental assessment. To efficiently obtain the LULC inventory, we compared the LULC classifications based on single satellite imagery with a rule-based classification based on multi-seasonal imagery in Lianyungang City, a coastal city in China, using CBERS-02 (the 2nd China-Brazil Environmental Resource Satellites) images. The overall accuracies of the classification based on single imagery are 78.9, 82.8, and 82.0% in winter, early summer, and autumn, respectively. The rule-based classification improves the accuracy to 87.9% (kappa 0.85), suggesting that combining multi-seasonal images can considerably improve the classification accuracy over any single image-based classification. This method could also be used to analyze seasonal changes of LULC types, especially for those associated with tidal changes in coastal areas. The distribution and inventory of LULC types with an overall accuracy of 87.9% and a spatial resolution of 19.5 m can assist regional planning and environmental assessment efficiently in Lianyungang City. This rule-based classification provides a guidance to improve accuracy for coastal areas with distinct LULC temporal spectral features.
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. ?? 2007 by the Ecological Society of America.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessedmore » using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with a SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large number of images, as is the case in a multicenter trial with different scanners.« less
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively. PMID:28326009
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of x2 feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. X2 mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors are offer accuracy improvements only on colorful datasets.
Improving crop classification through attention to the timing of airborne radar acquisitions
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Protz, R.
1984-01-01
Radar remote sensors may provide valuable input to crop classification procedures because of (1) their independence of weather conditions and solar illumination, and (2) their ability to respond to differences in crop type. Manual classification of multidate synthetic aperture radar (SAR) imagery resulted in an overall accuracy of 83 percent for corn, forest, grain, and 'other' cover types. Forests and corn fields were identified with accuracies approaching or exceeding 90 percent. Grain fields and 'other' fields were often confused with each other, resulting in classification accuracies of 51 and 66 percent, respectively. The 83 percent correct classification represents a 10 percent improvement when compared to similar SAR data for the same area collected at alternate time periods in 1978. These results demonstrate that improvements in crop classification accuracy can be achieved with SAR data by synchronizing data collection times with crop growth stages in order to maximize differences in the geometric and dielectric properties of the cover types of interest.
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. Conclusions/Significance The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections.
Zhu, Xiangbin; Qiu, Huiling
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved. PMID:27893761
Deep multi-scale convolutional neural network for hyperspectral image classification
NASA Astrophysics Data System (ADS)
Zhang, Feng-zhe; Yang, Xia
2018-04-01
In this paper, we proposed a multi-scale convolutional neural network for hyperspectral image classification task. Firstly, compared with conventional convolution, we utilize multi-scale convolutions, which possess larger respective fields, to extract spectral features of hyperspectral image. We design a deep neural network with a multi-scale convolution layer which contains 3 different convolution kernel sizes. Secondly, to avoid overfitting of deep neural network, dropout is utilized, which randomly sleeps neurons, contributing to improve the classification accuracy a bit. In addition, new skills like ReLU in deep learning is utilized in this paper. We conduct experiments on University of Pavia and Salinas datasets, and obtained better classification accuracy compared with other methods.
The use of Landsat data to inventory cotton and soybean acreage in North Alabama
NASA Technical Reports Server (NTRS)
Downs, S. W., Jr.; Faust, N. L.
1980-01-01
This study was performed to determine if Landsat data could be used to improve the accuracy of the estimation of cotton acreage. A linear classification algorithm and a maximum likelihood algorithm were used for computer classification of the area, and the classification was compared with ground truth. The classification accuracy for some fields was greater than 90 percent; however, the overall accuracy was 71 percent for cotton and 56 percent for soybeans. The results of this research indicate that computer analysis of Landsat data has potential for improving upon the methods presently being used to determine cotton acreage; however, additional experiments and refinements are needed before the method can be used operationally.
Comparing Features for Classification of MEG Responses to Motor Imagery
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Background Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. Methods MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio—spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. Results The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. Conclusions We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system. PMID:27992574
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
Prediction of customer behaviour analysis using classification algorithms
NASA Astrophysics Data System (ADS)
Raju, Siva Subramanian; Dhandayudam, Prabha
2018-04-01
Customer Relationship management plays a crucial role in analyzing of customer behavior patterns and their values with an enterprise. Analyzing of customer data can be efficient performed using various data mining techniques, with the goal of developing business strategies and to enhance the business. In this paper, three classification models (NB, J48, and MLPNN) are studied and evaluated for our experimental purpose. The performance measures of the three classifications are compared using three different parameters (accuracy, sensitivity, specificity) and experimental results expose J48 algorithm has better accuracy with compare to NB and MLPNN algorithm.
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
NASA Astrophysics Data System (ADS)
Geelen, Christopher D.; Wijnhoven, Rob G. J.; Dubbelman, Gijs; de With, Peter H. N.
2015-03-01
This research considers gender classification in surveillance environments, typically involving low-resolution images and a large amount of viewpoint variations and occlusions. Gender classification is inherently difficult due to the large intra-class variation and interclass correlation. We have developed a gender classification system, which is successfully evaluated on two novel datasets, which realistically consider the above conditions, typical for surveillance. The system reaches a mean accuracy of up to 90% and approaches our human baseline of 92.6%, proving a high-quality gender classification system. We also present an in-depth discussion of the fundamental differences between SVM and RF classifiers. We conclude that balancing the degree of randomization in any classifier is required for the highest classification accuracy. For our problem, an RF-SVM hybrid classifier exploiting the combination of HSV and LBP features results in the highest classification accuracy of 89.9 0.2%, while classification computation time is negligible compared to the detection time of pedestrians.
Multi-site evaluation of IKONOS data for classification of tropical coral reef environments
Andrefouet, S.; Kramer, Philip; Torres-Pulliza, D.; Joyce, K.E.; Hochberg, E.J.; Garza-Perez, R.; Mumby, P.J.; Riegl, Bernhard; Yamano, H.; White, W.H.; Zubia, M.; Brock, J.C.; Phinn, S.R.; Naseer, A.; Hatcher, B.G.; Muller-Karger, F. E.
2003-01-01
Ten IKONOS images of different coral reef sites distributed around the world were processed to assess the potential of 4-m resolution multispectral data for coral reef habitat mapping. Complexity of reef environments, established by field observation, ranged from 3 to 15 classes of benthic habitats containing various combinations of sediments, carbonate pavement, seagrass, algae, and corals in different geomorphologic zones (forereef, lagoon, patch reef, reef flats). Processing included corrections for sea surface roughness and bathymetry, unsupervised or supervised classification, and accuracy assessment based on ground-truth data. IKONOS classification results were compared with classified Landsat 7 imagery for simple to moderate complexity of reef habitats (5-11 classes). For both sensors, overall accuracies of the classifications show a general linear trend of decreasing accuracy with increasing habitat complexity. The IKONOS sensor performed better, with a 15-20% improvement in accuracy compared to Landsat. For IKONOS, overall accuracy was 77% for 4-5 classes, 71% for 7-8 classes, 65% in 9-11 classes, and 53% for more than 13 classes. The Landsat classification accuracy was systematically lower, with an average of 56% for 5-10 classes. Within this general trend, inter-site comparisons and specificities demonstrate the benefits of different approaches. Pre-segmentation of the different geomorphologic zones and depth correction provided different advantages in different environments. Our results help guide scientists and managers in applying IKONOS-class data for coral reef mapping applications. ?? 2003 Elsevier Inc. All rights reserved.
Boursier, Jérôme; Bertrais, Sandrine; Oberti, Frédéric; Gallois, Yves; Fouchard-Hubert, Isabelle; Rousselet, Marie-Christine; Zarski, Jean-Pierre; Calès, Paul
2011-11-30
Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10-3) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10-3). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10-3) or Fibrotest (0.84 ± 0.80, p < 10-3). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10-3 (p < 10-3). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10-3 (p < 10-3). The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test.
2011-01-01
Background Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10-3) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10-3). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10-3) or Fibrotest (0.84 ± 0.80, p < 10-3). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10-3 (p < 10-3). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10-3 (p < 10-3). Conclusions The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test. PMID:22129438
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
a Gsa-Svm Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paperhybridizesgravitational search algorithm (GSA) with support vector machine (SVM) and made a novel GSA-SVM hybrid system to improve the classification accuracy in binary problems. GSA is an optimization heuristic toolused to optimize the value of SVM kernel parameter (in this paper, radial basis function (RBF) is chosen as the kernel function). The experimental results show that this newapproach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Typicality effects in artificial categories: is there a hemisphere difference?
Richards, L G; Chiarello, C
1990-07-01
In category classification tasks, typicality effects are usually found: accuracy and reaction time depend upon distance from a prototype. In this study, subjects learned either verbal or nonverbal dot pattern categories, followed by a lateralized classification task. Comparable typicality effects were found in both reaction time and accuracy across visual fields for both verbal and nonverbal categories. Both hemispheres appeared to use a similarity-to-prototype matching strategy in classification. This indicates that merely having a verbal label does not differentiate classification in the two hemispheres.
Farrell, Todd R.; Weir, Richard F. ff.
2011-01-01
The use of surface versus intramuscular electrodes as well as the effect of electrode targeting on pattern-recognition-based multifunctional prosthesis control was explored. Surface electrodes are touted for their ability to record activity from relatively large portions of muscle tissue. Intramuscular electromyograms (EMGs) can provide focal recordings from deep muscles of the forearm and independent signals relatively free of crosstalk. However, little work has been done to compare the two. Additionally, while previous investigations have either targeted electrodes to specific muscles or used untargeted (symmetric) electrode arrays, no work has compared these approaches to determine if one is superior. The classification accuracies of pattern-recognition-based classifiers utilizing surface and intramuscular as well as targeted and untargeted electrodes were compared across 11 subjects. A repeated-measures analysis of variance revealed that when only EMG amplitude information was used from all available EMG channels, the targeted surface, targeted intramuscular, and untargeted surface electrodes produced similar classification accuracies while the untargeted intramuscular electrodes produced significantly lower accuracies. However, no statistical differences were observed between any of the electrode conditions when additional features were extracted from the EMG signal. It was concluded that the choice of electrode should be driven by clinical factors, such as signal robustness/stability, cost, etc., instead of by classification accuracy. PMID:18713689
NASA Astrophysics Data System (ADS)
Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen
2017-03-01
In this paper, we investigate the effect of the error criteria used during a training phase of the artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected with Interstitial Lung Diseases (ILD). Mean square error (MSE) and the cross-entropy (CE) criteria are chosen being most popular choice in state-of-the-art implementations. The classification experiment performed on the six interstitial lung disease (ILD) patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy from MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filter. Two different neural networks are trained with the scaled conjugate gradient back propagation algorithm with MSE and CE error criteria function respectively for weight updation. Performance is evaluated in terms of average accuracy of these classifiers using 4 fold cross-validation. Each network is trained for five times for each fold with randomly initialized weight vectors and accuracies are computed. Significant improvement in classification accuracy is observed when ANN is trained by using CE (67.27%) as error function compared to MSE (63.60%). Moreover, standard deviation of the classification accuracy for the network trained with CE (6.69) error criteria is found less as compared to network trained with MSE (10.32) criteria.
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
Madison, Matthew J; Bradshaw, Laine P
2015-06-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other multidimensional measurement models. A priori specifications of which latent characteristics or attributes are measured by each item are a core element of the diagnostic assessment design. This item-attribute alignment, expressed in a Q-matrix, precedes and supports any inference resulting from the application of the diagnostic classification model. This study investigates the effects of Q-matrix design on classification accuracy for the log-linear cognitive diagnosis model. Results indicate that classification accuracy, reliability, and convergence rates improve when the Q-matrix contains isolated information from each measured attribute.
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Human Epithelial-2 (HEp-2) cell images staining patterns classification have been widely used to identify autoimmune diseases by the anti-Nuclear antibodies (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because manual test is time consuming, subjective and labor intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are developing. However, methods proposed recently are mostly manual features extraction with low accuracy. Besides, the scale of available benchmark datasets is small, which does not exactly suitable for using deep learning methods. This issue will influence the accuracy of cell classification directly even after data augmentation. To address these issues, this paper presents a high accuracy automatic HEp-2 cell classification method with small datasets, by utilizing very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results over two benchmark datasets demonstrate that the proposed method achieves superior performance in terms of accuracy compared with existing methods.
Application of Sensor Fusion to Improve Uav Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects including the ones that are based on using UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality which results in increasing the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on-board in the UAV missions and performing image fusion can help achieving higher quality images and accordingly higher accuracy classification results.
A novel artificial immune clonal selection classification and rule mining with swarm learning model
NASA Astrophysics Data System (ADS)
Al-Sheshtawi, Khaled A.; Abdul-Kader, Hatem M.; Elsisi, Ashraf B.
2013-06-01
Metaheuristic optimisation algorithms have become popular choice for solving complex problems. By integrating Artificial Immune clonal selection algorithm (CSA) and particle swarm optimisation (PSO) algorithm, a novel hybrid Clonal Selection Classification and Rule Mining with Swarm Learning Algorithm (CS2) is proposed. The main goal of the approach is to exploit and explore the parallel computation merit of Clonal Selection and the speed and self-organisation merits of Particle Swarm by sharing information between clonal selection population and particle swarm. Hence, we employed the advantages of PSO to improve the mutation mechanism of the artificial immune CSA and to mine classification rules within datasets. Consequently, our proposed algorithm required less training time and memory cells in comparison to other AIS algorithms. In this paper, classification rule mining has been modelled as a miltiobjective optimisation problem with predictive accuracy. The multiobjective approach is intended to allow the PSO algorithm to return an approximation to the accuracy and comprehensibility border, containing solutions that are spread across the border. We compared our proposed algorithm classification accuracy CS2 with five commonly used CSAs, namely: AIRS1, AIRS2, AIRS-Parallel, CLONALG, and CSCA using eight benchmark datasets. We also compared our proposed algorithm classification accuracy CS2 with other five methods, namely: Naïve Bayes, SVM, MLP, CART, and RFB. The results show that the proposed algorithm is comparable to the 10 studied algorithms. As a result, the hybridisation, built of CSA and PSO, can develop respective merit, compensate opponent defect, and make search-optimal effect and speed better.
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Qin, Yu; Zhan, Yu; Wang, Changming; Zhang, Jiacai; Yao, Li; Guo, Xiaojuan; Wu, Xia; Hu, Bin
2016-08-01
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72 %, which is substantially better than that achieved with any single ERP component feature (55.07 % for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47, 4.06, and 16.90 % higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance
ERIC Educational Resources Information Center
Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James
2017-01-01
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improve physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one subject out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
Acosta-Mesa, Héctor-Gabriel; Rechy-Ramírez, Fernando; Mezura-Montes, Efrén; Cruz-Ramírez, Nicandro; Hernández Jiménez, Rodolfo
2014-06-01
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals in which the length and amplitude of the time series should be compressed, preserving the important information for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function which considers three criteria: the entropy regarding the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated using a time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known times series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
Spatial modeling and classification of corneal shape.
Marsolo, Keith; Twa, Michael; Bullimore, Mark A; Parthasarathy, Srinivasan
2007-03-01
One of the most promising applications of data mining is in biomedical data used in patient diagnosis. Any method of data analysis intended to support the clinical decision-making process should meet several criteria: it should capture clinically relevant features, be computationally feasible, and provide easily interpretable results. In an initial study, we examined the feasibility of using Zernike polynomials to represent biomedical instrument data in conjunction with a decision tree classifier to distinguish between the diseased and non-diseased eyes. Here, we provide a comprehensive follow-up to that work, examining a second representation, pseudo-Zernike polynomials, to determine whether they provide any increase in classification accuracy. We compare the fidelity of both methods using residual root-mean-square (rms) error and evaluate accuracy using several classifiers: neural networks, C4.5 decision trees, Voting Feature Intervals, and Naïve Bayes. We also examine the effect of several meta-learning strategies: boosting, bagging, and Random Forests (RFs). We present results comparing accuracy as it relates to dataset and transformation resolution over a larger, more challenging, multi-class dataset. They show that classification accuracy is similar for both data transformations, but differs by classifier. We find that the Zernike polynomials provide better feature representation than the pseudo-Zernikes and that the decision trees yield the best balance of classification accuracy and interpretability.
Duvekot, Jorieke; van der Ende, Jan; Verhulst, Frank C; Greaves-Lord, Kirstin
2015-06-01
The screening accuracy of the parent and teacher-reported Social Responsiveness Scale (SRS) was compared with an autism spectrum disorder (ASD) classification according to (1) the Developmental, Dimensional, and Diagnostic Interview (3 Di), (2) the Autism Diagnostic Observation Schedule (ADOS), (3) both the 3 Di and ADOS, in 186 children referred to six mental health centers. The parent report showed excellent correspondence to an ASD classification according to the 3 Di and both the 3 Di and ADOS. The teacher report added significantly to the screening accuracy over and above the parent report when compared with the ADOS classification. Findings support the screening utility of the parent-reported SRS among clinically referred children and indicate that different informants may provide unique information relevant for ASD assessment.
NASA Astrophysics Data System (ADS)
Hale Topaloğlu, Raziye; Sertel, Elif; Musaoğlu, Nebiye
2016-06-01
This study aims to compare classification accuracies of land cover/use maps created from Sentinel-2 and Landsat-8 data. Istanbul metropolitan city of Turkey, with a population of around 14 million, having different landscape characteristics was selected as study area. Water, forest, agricultural areas, grasslands, transport network, urban, airport- industrial units and barren land- mine land cover/use classes adapted from CORINE nomenclature were used as main land cover/use classes to identify. To fulfil the aims of this research, recently acquired dated 08/02/2016 Sentinel-2 and dated 22/02/2016 Landsat-8 images of Istanbul were obtained and image pre-processing steps like atmospheric and geometric correction were employed. Both Sentinel-2 and Landsat-8 images were resampled to 30m pixel size after geometric correction and similar spectral bands for both satellites were selected to create a similar base for these multi-sensor data. Maximum Likelihood (MLC) and Support Vector Machine (SVM) supervised classification methods were applied to both data sets to accurately identify eight different land cover/ use classes. Error matrix was created using same reference points for Sentinel-2 and Landsat-8 classifications. After the classification accuracy, results were compared to find out the best approach to create current land cover/use map of the region. The results of MLC and SVM classification methods were compared for both images.
Ruiz Hidalgo, Irene; Rodriguez, Pablo; Rozema, Jos J; Ní Dhubhghaill, Sorcha; Zakaria, Nadia; Tassignon, Marie-José; Koppen, Carina
2016-06-01
To evaluate the performance of a support vector machine algorithm that automatically and objectively identifies corneal patterns based on a combination of 22 parameters obtained from Pentacam measurements and to compare this method with other known keratoconus (KC) classification methods. Pentacam data from 860 eyes were included in the study and divided into 5 groups: 454 KC, 67 forme fruste (FF), 28 astigmatic, 117 after refractive surgery (PR), and 194 normal eyes (N). Twenty-two parameters were used for classification using a support vector machine algorithm developed in Weka, a machine-learning computer software. The cross-validation accuracy for 3 different classification tasks (KC vs. N, FF vs. N and all 5 groups) was calculated and compared with other known classification methods. The accuracy achieved in the KC versus N discrimination task was 98.9%, with 99.1% sensitivity and 98.5% specificity for KC detection. The accuracy in the FF versus N task was 93.1%, with 79.1% sensitivity and 97.9% specificity for the FF discrimination. Finally, for the 5-groups classification, the accuracy was 88.8%, with a weighted average sensitivity of 89.0% and specificity of 95.2%. Despite using the strictest definition for FF KC, the present study obtained comparable or better results than the single-parameter methods and indices reported in the literature. In some cases, direct comparisons with the literature were not possible because of differences in the compositions and definitions of the study groups, especially the FF KC.
Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and objectoriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. ?? 2008 by MDPI.
Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757
Van Cott, Andrew; Hastings, Charles E; Landsiedel, Robert; Kolle, Susanne; Stinchcombe, Stefan
2018-02-01
In vivo acute systemic testing is a regulatory requirement for agrochemical formulations. GHS specifies an alternative computational approach (GHS additivity formula) for calculating the acute toxicity of mixtures. We collected acute systemic toxicity data from formulations that contained one of several acutely-toxic active ingredients. The resulting acute data set includes 210 formulations tested for oral toxicity, 128 formulations tested for inhalation toxicity and 31 formulations tested for dermal toxicity. The GHS additivity formula was applied to each of these formulations and compared with the experimental in vivo result. In the acute oral assay, the GHS additivity formula misclassified 110 formulations using the GHS classification criteria (48% accuracy) and 119 formulations using the USEPA classification criteria (43% accuracy). With acute inhalation, the GHS additivity formula misclassified 50 formulations using the GHS classification criteria (61% accuracy) and 34 formulations using the USEPA classification criteria (73% accuracy). For acute dermal toxicity, the GHS additivity formula misclassified 16 formulations using the GHS classification criteria (48% accuracy) and 20 formulations using the USEPA classification criteria (36% accuracy). This data indicates the acute systemic toxicity of many formulations is not the sum of the ingredients' toxicity (additivity); but rather, ingredients in a formulation can interact to result in lower or higher toxicity than predicted by the GHS additivity formula. Copyright © 2018 Elsevier Inc. All rights reserved.
Classification of urban features using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Ganesh Babu, Bharath
Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) - were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
In view of the problem of the on-line measurement of algae classification, a method of algae classification and concentration determination based on the discrete three-dimensional fluorescence spectra was studied in this work. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, the discrete three-dimensional standard spectra of five categories were built, and the recognition, classification and concentration prediction of algae categories were realized by the discrete three-dimensional fluorescence spectra coupled with non-negative weighted least squares linear regression analysis. The results show that similarities between discrete three-dimensional standard spectra of different categories were reduced and the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. By comparing with that of the chlorophyll a fluorescence excitation spectra method, the recognition accuracy rate in pure samples by discrete three-dimensional fluorescence spectra is improved 1.38%, and the recovery rate and classification accuracy in pure diatom samples 34.1% and 46.8%, respectively; the recognition accuracy rate of mixed samples by discrete-three dimensional fluorescence spectra is enhanced by 26.1%, the recovery rate of mixed samples with Chlorophyta 37.8%, and the classification accuracy of mixed samples with diatoms 54.6%.
NASA Astrophysics Data System (ADS)
Zhang, Zhiming; de Wulf, Robert R.; van Coillie, Frieke M. B.; Verbeke, Lieven P. C.; de Clercq, Eva M.; Ou, Xiaokun
2011-01-01
Mapping of vegetation using remote sensing in mountainous areas is considerably hampered by topographic effects on the spectral response pattern. A variety of topographic normalization techniques have been proposed to correct these illumination effects due to topography. The purpose of this study was to compare six different topographic normalization methods (Cosine correction, Minnaert correction, C-correction, Sun-canopy-sensor correction, two-stage topographic normalization, and slope matching technique) for their effectiveness in enhancing vegetation classification in mountainous environments. Since most of the vegetation classes in the rugged terrain of the Lancang Watershed (China) did not feature a normal distribution, artificial neural networks (ANNs) were employed as a classifier. Comparing the ANN classifications, none of the topographic correction methods could significantly improve ETM+ image classification overall accuracy. Nevertheless, at the class level, the accuracy of pine forest could be increased by using topographically corrected images. On the contrary, oak forest and mixed forest accuracies were significantly decreased by using corrected images. The results also showed that none of the topographic normalization strategies was satisfactorily able to correct for the topographic effects in severely shadowed areas.
Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong
2018-01-01
The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. ?? 2009 American Society for Photogrammetry and Remote Sensing.
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis. PMID:25076868
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F; Joules, Richard; Catani, Marco; Williams, Steve C R; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no "magic bullet" for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis.
NASA Astrophysics Data System (ADS)
Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude
2010-02-01
Diabetic retinopathy (DR) is a chronic eye disease for which early detection is highly essential to avoid any fatal results. Image processing of retinal images emerge as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification which is a significant technique to detect the abnormality in the eye. Various automated classification systems have been developed in the recent years but most of them lack high classification accuracy. Artificial neural networks are the widely preferred artificial intelligence technique since it yields superior results in terms of classification accuracy. In this work, Radial Basis function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR Images and normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed with the results of the probabilistic classifier namely Bayesian classifier to show the superior nature of neural classifier. Experimental results show promising results for the neural classifier in terms of the performance measures.
Deshpande, Gopikrishna; Wang, Peng; Rangaprakash, D; Wilamowski, Bogdan
2015-12-01
Automated recognition and classification of brain diseases are of tremendous value to society. Attention deficit hyperactivity disorder (ADHD) is a diverse spectrum disorder whose diagnosis is based on behavior and hence will benefit from classification utilizing objective neuroimaging measures. Toward this end, an international competition was conducted for classifying ADHD using functional magnetic resonance imaging data acquired from multiple sites worldwide. Here, we consider the data from this competition as an example to illustrate the utility of fully connected cascade (FCC) artificial neural network (ANN) architecture for performing classification. We employed various directional and nondirectional brain connectivity-based methods to extract discriminative features which gave better classification accuracy compared to raw data. Our accuracy for distinguishing ADHD from healthy subjects was close to 90% and between the ADHD subtypes was close to 95%. Further, we show that, if properly used, FCC ANN performs very well compared to other classifiers such as support vector machines in terms of accuracy, irrespective of the feature used. Finally, the most discriminative connectivity features provided insights about the pathophysiology of ADHD and showed reduced and altered connectivity involving the left orbitofrontal cortex and various cerebellar regions in ADHD.
Landcover classification in MRF context using Dempster-Shafer fusion for multisensor imagery.
Sarkar, Anjan; Banerjee, Anjan; Banerjee, Nilanjan; Brahma, Siddhartha; Kartikeyan, B; Chakraborty, Manab; Majumder, K L
2005-05-01
This work deals with multisensor data fusion to obtain landcover classification. The role of feature-level fusion using the Dempster-Shafer rule and that of data-level fusion in the MRF context is studied in this paper to obtain an optimally segmented image. Subsequently, segments are validated and classification accuracy for the test data is evaluated. Two examples of data fusion of optical images and a synthetic aperture radar image are presented, each set having been acquired on different dates. Classification accuracies of the technique proposed are compared with those of some recent techniques in literature for the same image data.
NASA Astrophysics Data System (ADS)
Diesing, Markus; Green, Sophie L.; Stephens, David; Lark, R. Murray; Stewart, Heather A.; Dove, Dayton
2014-08-01
Marine spatial planning and conservation need underpinning with sufficiently detailed and accurate seabed substrate and habitat maps. Although multibeam echosounders enable us to map the seabed with high resolution and spatial accuracy, there is still a lack of fit-for-purpose seabed maps. This is due to the high costs involved in carrying out systematic seabed mapping programmes and the fact that the development of validated, repeatable, quantitative and objective methods of swath acoustic data interpretation is still in its infancy. We compared a wide spectrum of approaches including manual interpretation, geostatistics, object-based image analysis and machine-learning to gain further insights into the accuracy and comparability of acoustic data interpretation approaches based on multibeam echosounder data (bathymetry, backscatter and derivatives) and seabed samples with the aim to derive seabed substrate maps. Sample data were split into a training and validation data set to allow us to carry out an accuracy assessment. Overall thematic classification accuracy ranged from 67% to 76% and Cohen's kappa varied between 0.34 and 0.52. However, these differences were not statistically significant at the 5% level. Misclassifications were mainly associated with uncommon classes, which were rarely sampled. Map outputs were between 68% and 87% identical. To improve classification accuracy in seabed mapping, we suggest that more studies on the effects of factors affecting the classification performance as well as comparative studies testing the performance of different approaches need to be carried out with a view to developing guidelines for selecting an appropriate method for a given dataset. In the meantime, classification accuracy might be improved by combining different techniques to hybrid approaches and multi-method ensembles.
NASA Astrophysics Data System (ADS)
Hall-Brown, Mary
The heterogeneity of Arctic vegetation can make land cover classification vey difficult when using medium to small resolution imagery (Schneider et al., 2009; Muller et al., 1999). Using high radiometric and spatial resolution imagery, such as the SPOT 5 and IKONOS satellites, have helped arctic land cover classification accuracies rise into the 80 and 90 percentiles (Allard, 2003; Stine et al., 2010; Muller et al., 1999). However, those increases usually come at a high price. High resolution imagery is very expensive and can often add tens of thousands of dollars onto the cost of the research. The EO-1 satellite launched in 2002 carries two sensors that have high specral and/or high spatial resolutions and can be an acceptable compromise between the resolution versus cost issues. The Hyperion is a hyperspectral sensor with the capability of collecting 242 spectral bands of information. The Advanced Land Imager (ALI) is an advanced multispectral sensor whose spatial resolution can be sharpened to 10 meters. This dissertation compares the accuracies of arctic land cover classifications produced by the Hyperion and ALI sensors to the classification accuracies produced by the Systeme Pour l' Observation de le Terre (SPOT), the Landsat Thematic Mapper (TM) and the Landsat Enhanced Thematic Mapper Plus (ETM+) sensors. Hyperion and ALI images from August 2004 were collected over the Upper Kuparuk River Basin, Alaska. Image processing included the stepwise discriminant analysis of pixels that were positively classified from coinciding ground control points, geometric and radiometric correction, and principle component analysis. Finally, stratified random sampling was used to perform accuracy assessments on satellite derived land cover classifications. Accuracy was estimated from an error matrix (confusion matrix) that provided the overall, producer's and user's accuracies. This research found that while the Hyperion sensor produced classfication accuracies that were equivalent to the TM and ETM+ sensor (approximately 78%), the Hyperion could not obtain the accuracy of the SPOT 5 HRV sensor. However, the land cover classifications derived from the ALI sensor exceeded most classification accuracies derived from the TM and ETM+ senors and were even comparable to most SPOT 5 HRV classifications (87%). With the deactivation of the Landsat series satellites, the monitoring of remote locations such as in the Arctic on an uninterupted basis thoughout the world is in jeopardy. The utilization of the Hyperion and ALI sensors are a way to keep that endeavor operational. By keeping the ALI sensor active at all times, uninterupted observation of the entire Earth can be accomplished. Keeping the Hyperion sensor as a "tasked" sensor can provide scientists with additional imagery and options for their studies without overburdening storage issues.
NASA Technical Reports Server (NTRS)
Mulligan, P. J.; Gervin, J. C.; Lu, Y. C.
1985-01-01
An area bordering the Eastern Shore of the Chesapeake Bay was selected for study and classified using unsupervised techniques applied to LANDSAT-2 MSS data and several band combinations of LANDSAT-4 TM data. The accuracies of these Level I land cover classifications were verified using the Taylor's Island USGS 7.5 minute topographic map which was photointerpreted, digitized and rasterized. The the Taylor's Island map, comparing the MSS and TM three band (2 3 4) classifications, the increased resolution of TM produced a small improvement in overall accuracy of 1% correct due primarily to a small improvement, and 1% and 3%, in areas such as water and woodland. This was expected as the MSS data typically produce high accuracies for categories which cover large contiguous areas. However, in the categories covering smaller areas within the map there was generally an improvement of at least 10%. Classification of the important residential category improved 12%, and wetlands were mapped with 11% greater accuracy.
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech based data in the classification of Parkinson disease (PD) has been shown to provide an effect, non-invasive mode of classification in recent years. Thus, there has been an increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classifications is to reduce noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effect, the ability to invoke instance selection has been seldomly examined. In this study, a PD classification algorithm was proposed and examined that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied for selecting optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to generate trained samples from the collected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. This proposed method was examined using a more recently deposited public datasets and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the highest degree of improved classification accuracy (29.44%) compared with the other algorithm that was examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit a higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method could improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.
Rosenfield, George H.; Fitzpatrick-Lins, Katherine
1984-01-01
Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts corrected, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0. 51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
Item Selection Criteria with Practical Constraints for Computerized Classification Testing
ERIC Educational Resources Information Center
Lin, Chuan-Ju
2011-01-01
This study compares four item selection criteria for a two-category computerized classification testing: (1) Fisher information (FI), (2) Kullback-Leibler information (KLI), (3) weighted log-odds ratio (WLOR), and (4) mutual information (MI), with respect to the efficiency and accuracy of classification decision using the sequential probability…
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.
2018-02-01
In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of cervical healthy and cancerous tissues. In our study, the HMM methodology is more promising to produce higher accuracy in classification.
GMM-based speaker age and gender classification in Czech and Slovak
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Matoušek, Jindřich
2017-01-01
The paper describes an experiment with using the Gaussian mixture models (GMM) for automatic classification of the speaker age and gender. It analyses and compares the influence of different number of mixtures and different types of speech features used for GMM gender/age classification. Dependence of the computational complexity on the number of used mixtures is also analysed. Finally, the GMM classification accuracy is compared with the output of the conventional listening tests. The results of these objective and subjective evaluations are in correspondence.
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
Dumitru Salajanu; Dennis M. Jacobs
2007-01-01
The objective of this study was to determine how well forestfnon-forest and biomass classifications obtained from Landsat-TM and MODIS satellite data modeled with FIA plots, compare to each other and with forested area and biomass estimates from the national inventory data, as well as whether there is an increase in overall accuracy when pixel size (spatial resolution...
NASA Technical Reports Server (NTRS)
Spruce, J. P.; Smoot, James; Ellis, Jean; Hilbert, Kent; Swann, Roberta
2012-01-01
This paper discusses the development and implementation of a geospatial data processing method and multi-decadal Landsat time series for computing general coastal U.S. land-use and land-cover (LULC) classifications and change products consisting of seven classes (water, barren, upland herbaceous, non-woody wetland, woody upland, woody wetland, and urban). Use of this approach extends the observational period of the NOAA-generated Coastal Change and Analysis Program (C-CAP) products by almost two decades, assuming the availability of one cloud free Landsat scene from any season for each targeted year. The Mobile Bay region in Alabama was used as a study area to develop, demonstrate, and validate the method that was applied to derive LULC products for nine dates at approximate five year intervals across a 34-year time span, using single dates of data for each classification in which forests were either leaf-on, leaf-off, or mixed senescent conditions. Classifications were computed and refined using decision rules in conjunction with unsupervised classification of Landsat data and C-CAP value-added products. Each classification's overall accuracy was assessed by comparing stratified random locations to available reference data, including higher spatial resolution satellite and aerial imagery, field survey data, and raw Landsat RGBs. Overall classification accuracies ranged from 83 to 91% with overall Kappa statistics ranging from 0.78 to 0.89. The accuracies are comparable to those from similar, generalized LULC products derived from C-CAP data. The Landsat MSS-based LULC product accuracies are similar to those from Landsat TM or ETM+ data. Accurate classifications were computed for all nine dates, yielding effective results regardless of season. This classification method yielded products that were used to compute LULC change products via additive GIS overlay techniques.
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted into 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilk's Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All commonly occurring six gregarious tree species, viz., white oak, brown oak, chir pine, blue pine, cedar and fir in western Himalaya could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). It was noticed that classification accuracy achieved with Hyperion bands was significantly higher than Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). Study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
A Survey on Sentiment Classification in Face Recognition
NASA Astrophysics Data System (ADS)
Qian, Jingyu
2018-01-01
Face recognition has been an important topic for both industry and academia for a long time. K-means clustering, autoencoder, and convolutional neural network, each representing a design idea for face recognition method, are three popular algorithms to deal with face recognition problems. It is worthwhile to summarize and compare these three different algorithms. This paper will focus on one specific face recognition problem-sentiment classification from images. Three different algorithms for sentiment classification problems will be summarized, including k-means clustering, autoencoder, and convolutional neural network. An experiment with the application of these algorithms on a specific dataset of human faces will be conducted to illustrate how these algorithms are applied and their accuracy. Finally, the three algorithms are compared based on the accuracy result.
Word pair classification during imagined speech using direct brain recordings
NASA Astrophysics Data System (ADS)
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-05-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70-150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58% p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
Word pair classification during imagined speech using direct brain recordings
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-01-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications. PMID:27165452
Martin, Bryan D.; Wolfson, Julian; Adomavicius, Gediminas; Fan, Yingling
2017-01-01
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) while being computationally simple enough to run on a typical smartphone. Further, we use data that required no behavioral changes from the smartphone users to collect. Our best classification model uses the random forest algorithm to achieve 96.8% accuracy. PMID:28885550
Martin, Bryan D; Addona, Vittorio; Wolfson, Julian; Adomavicius, Gediminas; Fan, Yingling
2017-09-08
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) while being computationally simple enough to run on a typical smartphone. Further, we use data that required no behavioral changes from the smartphone users to collect. Our best classification model uses the random forest algorithm to achieve 96.8% accuracy.
Seeland, Marco; Rzanny, Michael; Alaqraa, Nedal; Wäldchen, Jana; Mäder, Patrick
2017-01-01
Steady improvements of image description methods induced a growing interest in image-based plant species classification, a task vital to the study of biodiversity and ecological sensitivity. Various techniques have been proposed for general object classification over the past years and several of them have already been studied for plant species classification. However, results of these studies are selective in the evaluated steps of a classification pipeline, in the utilized datasets for evaluation, and in the compared baseline methods. No study is available that evaluates the main competing methods for building an image representation on the same datasets allowing for generalized findings regarding flower-based plant species classification. The aim of this paper is to comparatively evaluate methods, method combinations, and their parameters towards classification accuracy. The investigated methods span from detection, extraction, fusion, pooling, to encoding of local features for quantifying shape and color information of flower images. We selected the flower image datasets Oxford Flower 17 and Oxford Flower 102 as well as our own Jena Flower 30 dataset for our experiments. Findings show large differences among the various studied techniques and that their wisely chosen orchestration allows for high accuracies in species classification. We further found that true local feature detectors in combination with advanced encoding methods yield higher classification results at lower computational costs compared to commonly used dense sampling and spatial pooling methods. Color was found to be an indispensable feature for high classification results, especially while preserving spatial correspondence to gray-level features. In result, our study provides a comprehensive overview of competing techniques and the implications of their main parameters for flower-based plant species classification. PMID:28234999
The impact of OCR accuracy on automated cancer classification of pathology reports.
Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle
2012-01-01
To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices
ERIC Educational Resources Information Center
Wyse, Adam E.; Hao, Shiqi
2012-01-01
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
NASA Astrophysics Data System (ADS)
Gibril, Mohamed Barakat A.; Idrees, Mohammed Oludare; Yao, Kouame; Shafri, Helmi Zulhaidi Mohd
2018-01-01
The growing use of optimization for geographic object-based image analysis and the possibility to derive a wide range of information about the image in textual form makes machine learning (data mining) a versatile tool for information extraction from multiple data sources. This paper presents application of data mining for land-cover classification by fusing SPOT-6, RADARSAT-2, and derived dataset. First, the images and other derived indices (normalized difference vegetation index, normalized difference water index, and soil adjusted vegetation index) were combined and subjected to segmentation process with optimal segmentation parameters obtained using combination of spatial and Taguchi statistical optimization. The image objects, which carry all the attributes of the input datasets, were extracted and related to the target land-cover classes through data mining algorithms (decision tree) for classification. To evaluate the performance, the result was compared with two nonparametric classifiers: support vector machine (SVM) and random forest (RF). Furthermore, the decision tree classification result was evaluated against six unoptimized trials segmented using arbitrary parameter combinations. The result shows that the optimized process produces better land-use land-cover classification with overall classification accuracy of 91.79%, 87.25%, and 88.69% for SVM and RF, respectively, while the results of the six unoptimized classifications yield overall accuracy between 84.44% and 88.08%. Higher accuracy of the optimized data mining classification approach compared to the unoptimized results indicates that the optimization process has significant impact on the classification quality.
Reese, H.M.; Lillesand, T.M.; Nagel, D.E.; Stewart, J.S.; Goldmann, R.A.; Simmons, T.E.; Chipman, J.W.; Tessar, P.A.
2002-01-01
Landsat Thematic Mapper (TM) data were the basis in production of a statewide land cover data set for Wisconsin, undertaken in partnership with U.S. Geological Survey's (USGS) Gap Analysis Program (GAP). The data set contained seven classes comparable to Anderson Level I and 24 classes comparable to Anderson Level II/III. Twelve scenes of dual-date TM data were processed with methods that included principal components analysis, stratification into spectrally consistent units, separate classification of upland, wetland, and urban areas, and a hybrid supervised/unsupervised classification called "guided clustering." The final data had overall accuracies of 94% for Anderson Level I upland classes, 77% for Level II/III upland classes, and 84% for Level II/III wetland classes. Classification accuracies for deciduous and coniferous forest were 95% and 93%, respectively, and forest species' overall accuracies ranged from 70% to 84%. Limited availability of acceptable imagery necessitated use of an early May date in a majority of scene pairs, perhaps contributing to lower accuracy for upland deciduous forest species. The mixed deciduous/coniferous forest class had the lowest accuracy, most likely due to distinctly classifying a purely mixed class. Mixed forest signatures containing oak were often confused with pure oak. Guided clustering was seen as an efficient classification method, especially at the tree species level, although its success relied in part on image dates, accurate ground troth, and some analyst intervention. ?? 2002 Elsevier Science Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Fagan, Matthew E.; Defries, Ruth S.; Sesnie, Steven E.; Arroyo-Mora, J. Pablo; Soto, Carlomagno; Singh, Aditya; Townsend, Philip A.; Chazdon, Robin L.
2015-01-01
An efficient means to map tree plantations is needed to detect tropical land use change and evaluate reforestation projects. To analyze recent tree plantation expansion in northeastern Costa Rica, we examined the potential of combining moderate-resolution hyperspectral imagery (2005 HyMap mosaic) with multitemporal, multispectral data (Landsat) to accurately classify (1) general forest types and (2) tree plantations by species composition. Following a linear discriminant analysis to reduce data dimensionality, we compared four Random Forest classification models: hyperspectral data (HD) alone; HD plus interannual spectral metrics; HD plus a multitemporal forest regrowth classification; and all three models combined. The fourth, combined model achieved overall accuracy of 88.5%. Adding multitemporal data significantly improved classification accuracy (p less than 0.0001) of all forest types, although the effect on tree plantation accuracy was modest. The hyperspectral data alone classified six species of tree plantations with 75% to 93% producer's accuracy; adding multitemporal spectral data increased accuracy only for two species with dense canopies. Non-native tree species had higher classification accuracy overall and made up the majority of tree plantations in this landscape. Our results indicate that combining occasionally acquired hyperspectral data with widely available multitemporal satellite imagery enhances mapping and monitoring of reforestation in tropical landscapes.
Quantitative falls risk estimation through multi-sensor assessment of standing balance.
Greene, Barry R; McGrath, Denise; Walsh, Lorcan; Doheny, Emer P; McKeown, David; Garattini, Chiara; Cunningham, Clodagh; Crosby, Lisa; Caulfield, Brian; Kenny, Rose A
2012-12-01
Falls are the most common cause of injury and hospitalization and one of the principal causes of death and disability in older adults worldwide. Measures of postural stability have been associated with the incidence of falls in older adults. The aim of this study was to develop a model that accurately classifies fallers and non-fallers using novel multi-sensor quantitative balance metrics that can be easily deployed into a home or clinic setting. We compared the classification accuracy of our model with an established method for falls risk assessment, the Berg balance scale. Data were acquired using two sensor modalities--a pressure sensitive platform sensor and a body-worn inertial sensor, mounted on the lower back--from 120 community dwelling older adults (65 with a history of falls, 55 without, mean age 73.7 ± 5.8 years, 63 female) while performing a number of standing balance tasks in a geriatric research clinic. Results obtained using a support vector machine yielded a mean classification accuracy of 71.52% (95% CI: 68.82-74.28) in classifying falls history, obtained using one model classifying all data points. Considering male and female participant data separately yielded classification accuracies of 72.80% (95% CI: 68.85-77.17) and 73.33% (95% CI: 69.88-76.81) respectively, leading to a mean classification accuracy of 73.07% in identifying participants with a history of falls. Results compare favourably to those obtained using the Berg balance scale (mean classification accuracy: 59.42% (95% CI: 56.96-61.88)). Results from the present study could lead to a robust method for assessing falls risk in both supervised and unsupervised environments.
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
Research on Remote Sensing Image Classification Based on Feature Level Fusion
NASA Astrophysics Data System (ADS)
Yuan, L.; Zhu, G.
2018-04-01
Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, in the process of existing classification algorithms, there still exists the phenomenon of misclassification and missing points, which leads to the final classification accuracy is not high. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources, and propose a classification method based on feature level fusion. Compare three kind of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform), and then select the best fused image for the classification experimental. In the classification process, we choose four kinds of image classification algorithms (i.e. Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) to do contrast experiment. We use overall classification precision and Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than other methods. In four kinds of classification algorithms, the fused image has the best applicability to Support Vector Machine classification, the overall classification precision is 94.01 % and the Kappa coefficients is 0.91. The fused image with Sentinel-1A and Landsat8 OLI is not only have more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial to improve the accuracy and stability of remote sensing image classification.
NASA Astrophysics Data System (ADS)
Majumder, S. K.; Krishna, H.; Sidramesh, M.; Chaturvedi, P.; Gupta, P. K.
2011-08-01
We report the results of a comparative evaluation of in vivo fluorescence and Raman spectroscopy for diagnosis of oral neoplasia. The study carried out at Tata Memorial Hospital, Mumbai, involved 26 healthy volunteers and 138 patients being screened for neoplasm of oral cavity. Spectral measurements were taken from multiple sites of abnormal as well as apparently uninvolved contra-lateral regions of the oral cavity in each patient. The different tissue sites investigated belonged to one of the four histopathology categories: 1) squamous cell carcinoma (SCC), 2) oral sub-mucous fibrosis (OSMF), 3) leukoplakia (LP) and 4) normal squamous tissue. A probability based multivariate statistical algorithm utilizing nonlinear Maximum Representation and Discrimination Feature for feature extraction and Sparse Multinomial Logistic Regression for classification was developed for direct multi-class classification in a leave-one-patient-out cross validation mode. The results reveal that the performance of Raman spectroscopy is considerably superior to that of fluorescence in stratifying the oral tissues into respective histopathologic categories. The best classification accuracy was observed to be 90%, 93%, 94%, and 89% for SCC, SMF, leukoplakia, and normal oral tissues, respectively, on the basis of leave-one-patient-out cross-validation, with an overall accuracy of 91%. However, when a binary classification was employed to distinguish spectra from all the SCC, SMF and leukoplakik tissue sites together from normal, fluorescence and Raman spectroscopy were seen to have almost comparable performances with Raman yielding marginally better classification accuracy of 98.5% as compared to 94% of fluorescence.
Multiple confidence estimates as indices of eyewitness memory.
Sauer, James D; Brewer, Neil; Weber, Nathan
2008-08-01
Eyewitness identification decisions are vulnerable to various influences on witnesses' decision criteria that contribute to false identifications of innocent suspects and failures to choose perpetrators. An alternative procedure using confidence estimates to assess the degree of match between novel and previously viewed faces was investigated. Classification algorithms were applied to participants' confidence data to determine when a confidence value or pattern of confidence values indicated a positive response. Experiment 1 compared confidence group classification accuracy with a binary decision control group's accuracy on a standard old-new face recognition task and found superior accuracy for the confidence group for target-absent trials but not for target-present trials. Experiment 2 used a face mini-lineup task and found reduced target-present accuracy offset by large gains in target-absent accuracy. Using a standard lineup paradigm, Experiments 3 and 4 also found improved classification accuracy for target-absent lineups and, with a more sophisticated algorithm, for target-present lineups. This demonstrates the accessibility of evidence for recognition memory decisions and points to a more sensitive index of memory quality than is afforded by binary decisions.
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land Use and Land Cover (LULC) information are significant to observe and evaluate environmental change. LULC classification applying remotely sensed data is a technique popularly employed on a global and local dimension particularly, in urban areas which have diverse land cover types. These are essential components of the urban terrain and ecosystem. In the present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using the high-resolution image. COSMO-SkyMed SAR data was fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for land cover classification using object-based. This paper indicates a comparison between object-based and pixel-based approaches in image fusion. The per-pixel method, support vector machines (SVM) was implemented to the fused image based on Principal Component Analysis (PCA). For the objectbased classification was applied to the fused images to separate land cover classes by using nearest neighbor (NN) classifier. Finally, the accuracy assessment was employed by comparing with the classification of land cover mapping generated from fused image dataset and THAICHOTE image. The object-based data fused COSMO-SkyMed with THAICHOTE images demonstrated the best classification accuracies, well over 85%. As the results, an object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.
Marciano, Michael A; Adelman, Jonathan D
2017-03-01
The deconvolution of DNA mixtures remains one of the most critical challenges in the field of forensic DNA analysis. In addition, of all the data features required to perform such deconvolution, the number of contributors in the sample is widely considered the most important, and, if incorrectly chosen, the most likely to negatively influence the mixture interpretation of a DNA profile. Unfortunately, most current approaches to mixture deconvolution require the assumption that the number of contributors is known by the analyst, an assumption that can prove to be especially faulty when faced with increasingly complex mixtures of 3 or more contributors. In this study, we propose a probabilistic approach for estimating the number of contributors in a DNA mixture that leverages the strengths of machine learning. To assess this approach, we compare classification performances of six machine learning algorithms and evaluate the model from the top-performing algorithm against the current state of the art in the field of contributor number classification. Overall results show over 98% accuracy in identifying the number of contributors in a DNA mixture of up to 4 contributors. Comparative results showed 3-person mixtures had a classification accuracy improvement of over 6% compared to the current best-in-field methodology, and that 4-person mixtures had a classification accuracy improvement of over 20%. The Probabilistic Assessment for Contributor Estimation (PACE) also accomplishes classification of mixtures of up to 4 contributors in less than 1s using a standard laptop or desktop computer. Considering the high classification accuracy rates, as well as the significant time commitment required by the current state of the art model versus seconds required by a machine learning-derived model, the approach described herein provides a promising means of estimating the number of contributors and, subsequently, will lead to improved DNA mixture interpretation. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Delavarian, Mona; Towhidkhah, Farzad; Gharibzadeh, Shahriar; Dibajnia, Parvin
2011-07-12
Automatic classification of different behavioral disorders with many similarities (e.g. in symptoms) by using an automated approach will help psychiatrists to concentrate on correct disorder and its treatment as soon as possible, to avoid wasting time on diagnosis, and to increase the accuracy of diagnosis. In this study, we tried to differentiate and classify (diagnose) 306 children with many similar symptoms and different behavioral disorders such as ADHD, depression, anxiety, comorbid depression and anxiety and conduct disorder with high accuracy. Classification was based on the symptoms and their severity. With examining 16 different available classifiers, by using "Prtools", we have proposed nearest mean classifier as the most accurate classifier with 96.92% accuracy in this research. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Lau, C. Allen; Wang, Tianyou
The purposes of this study were to: (1) extend the sequential probability ratio testing (SPRT) procedure to polytomous item response theory (IRT) models in computerized classification testing (CCT); (2) compare polytomous items with dichotomous items using the SPRT procedure for their accuracy and efficiency; (3) study a direct approach in…
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K- Nearest Neighbor (KNN) is a good classifier, but from several studies, the result performance accuracy of KNN still lower than other methods. One of the causes of the low accuracy produced, because each attribute has the same effect on the classification process, while some less relevant characteristics lead to miss-classification of the class assignment for new data. In this research, we proposed Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio as a parameter to see the correlation between each attribute in the data and the Gain Ratio also will be used as the basis for weighting each attribute of the dataset. The accuracy of results is compared to the accuracy acquired from the original KNN method using 10-fold Cross-Validation with several datasets from the UCI Machine Learning repository and KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the result of the test, the proposed method was able to increase the classification accuracy of KNN, where the highest difference of accuracy obtained hayes-roth dataset is worth 12.73%, and the lowest difference of accuracy obtained in the abalone dataset of 0.07%. The average result of the accuracy of all dataset increases the accuracy by 5.33%.
NASA Astrophysics Data System (ADS)
Suiter, Ashley Elizabeth
Multi-spectral imagery provides a robust and low-cost dataset for assessing wetland extent and quality over broad regions and is frequently used for wetland inventories. However in forested wetlands, hydrology is obscured by tree canopy making it difficult to detect with multi-spectral imagery alone. Because of this, classification of forested wetlands often includes greater errors than that of other wetlands types. Elevation and terrain derivatives have been shown to be useful for modelling wetland hydrology. But, few studies have addressed the use of LiDAR intensity data detecting hydrology in forested wetlands. Due the tendency of LiDAR signal to be attenuated by water, this research proposed the fusion of LiDAR intensity data with LiDAR elevation, terrain data, and aerial imagery, for the detection of forested wetland hydrology. We examined the utility of LiDAR intensity data and determined whether the fusion of Lidar derived data with multispectral imagery increased the accuracy of forested wetland classification compared with a classification performed with only multi-spectral image. Four classifications were performed: Classification A -- All Imagery, Classification B -- All LiDAR, Classification C -- LiDAR without Intensity, and Classification D -- Fusion of All Data. These classifications were performed using random forest and each resulted in a 3-foot resolution thematic raster of forested upland and forested wetland locations in Vermilion County, Illinois. The accuracies of these classifications were compared using Kappa Coefficient of Agreement. Importance statistics produced within the random forest classifier were evaluated in order to understand the contribution of individual datasets. Classification D, which used the fusion of LiDAR and multi-spectral imagery as input variables, had moderate to strong agreement between reference data and classification results. It was found that Classification A performed using all the LiDAR data and its derivatives (intensity, elevation, slope, aspect, curvatures, and Topographic Wetness Index) was the most accurate classification with Kappa: 78.04%, indicating moderate to strong agreement. However, Classification C, performed with LiDAR derivative without intensity data had less agreement than would be expected by chance, indicating that LiDAR contributed significantly to the accuracy of Classification B.
Tabu search and binary particle swarm optimization for feature selection using microarray data.
Chuang, Li-Yeh; Yang, Cheng-Huei; Yang, Cheng-Hong
2009-12-01
Gene expression profiles have great potential as a medical diagnosis tool because they represent the state of a cell at the molecular level. In the classification of cancer type research, available training datasets generally have a fairly small sample size compared to the number of genes involved. This fact poses an unprecedented challenge to some classification methodologies due to training data limitations. Therefore, a good selection method for genes relevant for sample classification is needed to improve the predictive accuracy, and to avoid incomprehensibility due to the large number of genes investigated. In this article, we propose to combine tabu search (TS) and binary particle swarm optimization (BPSO) for feature selection. BPSO acts as a local optimizer each time the TS has been run for a single generation. The K-nearest neighbor method with leave-one-out cross-validation and support vector machine with one-versus-rest serve as evaluators of the TS and BPSO. The proposed method is applied and compared to the 11 classification problems taken from the literature. Experimental results show that our method simplifies features effectively and either obtains higher classification accuracy or uses fewer features compared to other feature selection methods.
NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization could achieve a classification accuracy of 88.16 % in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performs better than SVM and ANN, it is more capable of handling multidimensional feature variables. The RF method combined with object-based analysis approach could highlight the classification accuracy further. The multiresolution segmentation approach on the basis of ESP scale parameter optimization was used for obtaining six scales to execute image segmentation, when the segmentation scale was 49, the classification accuracy reached the highest value of 89.58 %. The classification accuracy of object-based RF classification was 1.42 % higher than that of pixel-based classification (88.16 %), and the classification accuracy was further improved. Therefore, the RF classification method combined with object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and theoretical reference for remotely sensed monitoring land reclamation.
A Unified Classification Framework for FP, DP and CP Data at X-Band in Southern China
NASA Astrophysics Data System (ADS)
Xie, Lei; Zhang, Hong; Li, Hhongzhong; Wang, Chao
2015-04-01
The main objective of this paper is to introduce an unified framework for crop classification in Southern China using data in fully polarimetric (FP), dual-pol (DP) and compact polarimetric (CP) modes. The TerraSAR-X data acquired over the Leizhou Peninsula, South China are used in our experiments. The study site involves four main crops (rice, banana, sugarcane eucalyptus). Through exploring the similarities between data in these three modes, a knowledge-based characteristic space is created and the unified framework is presented. The overall classification accuracies for data in the FP, coherent HH/VV are about 95%, and is about 91% in CP modes, which suggests that the proposed classification scheme is effective and promising. Compared with the Wishart Maximum Likelihood (ML) classifier, the proposed method exhibits higher classification accuracy.
Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.
Subasi, Abdulhamit
2013-06-01
Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on the basis of EMG signal to classify into normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into the frequency sub-bands using discrete wavelet transform (DWT) and a set of statistical features were extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results obviously validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.
Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System.
de Moura, Karina de O A; Balbinot, Alexandre
2018-05-01
A few prosthetic control systems in the scientific literature obtain pattern recognition algorithms adapted to changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, which has as its fundamental objective to estimate unavailable measures based on other available measures, is being used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of the movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system to maintain the classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Results of movement classification were presented comparing the usual classification techniques with the method of the degraded signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. The proposed system without using classifier retraining techniques recovered of mean classification accuracy was of 4% to 38% for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was the classification using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method recovered the classification accuracy after the degradations, reaching an average of 5.7% below the classification of the clean signal, that is the signal without the contaminants or the original signal. Moreover, the proposed intelligent technique minimizes the impact of the motion classification caused by signal contamination related to degrading events over time. There are improvements in the virtual sensor model and in the algorithm optimization that need further development to provide an increase the clinical application of myoelectric prostheses but already presents robust results to enable research with virtual sensors on biological signs with stochastic behavior.
Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System
Balbinot, Alexandre
2018-01-01
A few prosthetic control systems in the scientific literature obtain pattern recognition algorithms adapted to changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, which has as its fundamental objective to estimate unavailable measures based on other available measures, is being used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of the movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system to maintain the classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Results of movement classification were presented comparing the usual classification techniques with the method of the degraded signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. The proposed system without using classifier retraining techniques recovered of mean classification accuracy was of 4% to 38% for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was the classification using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method recovered the classification accuracy after the degradations, reaching an average of 5.7% below the classification of the clean signal, that is the signal without the contaminants or the original signal. Moreover, the proposed intelligent technique minimizes the impact of the motion classification caused by signal contamination related to degrading events over time. There are improvements in the virtual sensor model and in the algorithm optimization that need further development to provide an increase the clinical application of myoelectric prostheses but already presents robust results to enable research with virtual sensors on biological signs with stochastic behavior. PMID:29723994
EXhype: A tool for mineral classification using hyperspectral data
NASA Astrophysics Data System (ADS)
Adep, Ramesh Nityanand; shetty, Amba; Ramesh, H.
2017-02-01
Various supervised classification algorithms have been developed to classify earth surface features using hyperspectral data. Each algorithm is modelled based on different human expertises. However, the performance of conventional algorithms is not satisfactory to map especially the minerals in view of their typical spectral responses. This study introduces a new expert system named 'EXhype (Expert system for hyperspectral data classification)' to map minerals. The system incorporates human expertise at several stages of it's implementation: (i) to deal with intra-class variation; (ii) to identify absorption features; (iii) to discriminate spectra by considering absorption features, non-absorption features and by full spectra comparison; and (iv) finally takes a decision based on learning and by emphasizing most important features. It is developed using a knowledge base consisting of an Optimal Spectral Library, Segmented Upper Hull method, Spectral Angle Mapper (SAM) and Artificial Neural Network. The performance of the EXhype is compared with a traditional, most commonly used SAM algorithm using Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data acquired over Cuprite, Nevada, USA. A virtual verification method is used to collect samples information for accuracy assessment. Further, a modified accuracy assessment method is used to get a real users accuracies in cases where only limited or desired classes are considered for classification. With the modified accuracy assessment method, SAM and EXhype yields an overall accuracy of 60.35% and 90.75% and the kappa coefficient of 0.51 and 0.89 respectively. It was also found that the virtual verification method allows to use most desired stratified random sampling method and eliminates all the difficulties associated with it. The experimental results show that EXhype is not only producing better accuracy compared to traditional SAM but, can also rightly classify the minerals. It is proficient in avoiding misclassification between target classes when applied on minerals.
Erdodi, Laszlo A; Tyson, Bradley T; Shahein, Ayman G; Lichtenstein, Jonathan D; Abeare, Christopher A; Pelletier, Chantalle L; Zuccato, Brandon G; Kucharski, Brittany; Roth, Robert M
2017-05-01
The Recognition Memory Test (RMT) and Word Choice Test (WCT) are structurally similar, but psychometrically different. Previous research demonstrated that adding a time-to-completion cutoff improved the classification accuracy of the RMT. However, the contribution of WCT time-cutoffs to improve the detection of invalid responding has not been investigated. The present study was designed to evaluate the classification accuracy of time-to-completion on the WCT compared to the accuracy score and the RMT. Both tests were administered to 202 adults (M age = 45.3 years, SD = 16.8; 54.5% female) clinically referred for neuropsychological assessment in counterbalanced order as part of a larger battery of cognitive tests. Participants obtained lower and more variable scores on the RMT (M = 44.1, SD = 7.6) than on the WCT (M = 46.9, SD = 5.7). Similarly, they took longer to complete the recognition trial on the RMT (M = 157.2 s,SD = 71.8) than the WCT (M = 137.2 s, SD = 75.7). The optimal cutoff on the RMT (≤43) produced .60 sensitivity at .87 specificity. The optimal cutoff on the WCT (≤47) produced .57 sensitivity at .87 specificity. Time-cutoffs produced comparable classification accuracies for both RMT (≥192 s; .48 sensitivity at .88 specificity) and WCT (≥171 s; .49 sensitivity at .91 specificity). They also identified an additional 6-10% of the invalid profiles missed by accuracy score cutoffs, while maintaining good specificity (.93-.95). Functional equivalence was reached at accuracy scores ≤43 (RMT) and ≤47 (WCT) or time-to-completion ≥192 s (RMT) and ≥171 s (WCT). Time-to-completion cutoffs are valuable additions to both tests. They can function as independent validity indicators or enhance the sensitivity of accuracy scores without requiring additional measures or extending standard administration time.
Convolutional Neural Network for Histopathological Analysis of Osteosarcoma.
Mishra, Rashika; Daescu, Ovidiu; Leavey, Patrick; Rakheja, Dinesh; Sengupta, Anita
2018-03-01
Pathologists often deal with high complexity and sometimes disagreement over osteosarcoma tumor classification due to cellular heterogeneity in the dataset. Segmentation and classification of histology tissue in H&E stained tumor image datasets is a challenging task because of intra-class variations, inter-class similarity, crowded context, and noisy data. In recent years, deep learning approaches have led to encouraging results in breast cancer and prostate cancer analysis. In this article, we propose convolutional neural network (CNN) as a tool to improve efficiency and accuracy of osteosarcoma tumor classification into tumor classes (viable tumor, necrosis) versus nontumor. The proposed CNN architecture contains eight learned layers: three sets of stacked two convolutional layers interspersed with max pooling layers for feature extraction and two fully connected layers with data augmentation strategies to boost performance. The use of a neural network results in higher accuracy of average 92% for the classification. We compare the proposed architecture with three existing and proven CNN architectures for image classification: AlexNet, LeNet, and VGGNet. We also provide a pipeline to calculate percentage necrosis in a given whole slide image. We conclude that the use of neural networks can assure both high accuracy and efficiency in osteosarcoma classification.
Retinal vasculature classification using novel multifractal features
NASA Astrophysics Data System (ADS)
Ding, Y.; Ward, W. O. C.; Duan, Jinming; Auer, D. P.; Gowland, Penny; Bai, L.
2015-11-01
Retinal blood vessels have been implicated in a large number of diseases including diabetic retinopathy and cardiovascular diseases, which cause damages to retinal blood vessels. The availability of retinal vessel imaging provides an excellent opportunity for monitoring and diagnosis of retinal diseases, and automatic analysis of retinal vessels will help with the processes. However, state of the art vascular analysis methods such as counting the number of branches or measuring the curvature and diameter of individual vessels are unsuitable for the microvasculature. There has been published research using fractal analysis to calculate fractal dimensions of retinal blood vessels, but so far there has been no systematic research extracting discriminant features from retinal vessels for classifications. This paper introduces new methods for feature extraction from multifractal spectra of retinal vessels for classification. Two publicly available retinal vascular image databases are used for the experiments, and the proposed methods have produced accuracies of 85.5% and 77% for classification of healthy and diabetic retinal vasculatures. Experiments show that classification with multiple fractal features produces better rates compared with methods using a single fractal dimension value. In addition to this, experiments also show that classification accuracy can be affected by the accuracy of vessel segmentation algorithms.
Comparative analysis of Worldview-2 and Landsat 8 for coastal saltmarsh mapping accuracy assessment
NASA Astrophysics Data System (ADS)
Rasel, Sikdar M. M.; Chang, Hsing-Chung; Diti, Israt Jahan; Ralph, Tim; Saintilan, Neil
2016-05-01
Coastal saltmarsh and their constituent components and processes are of an interest scientifically due to their ecological function and services. However, heterogeneity and seasonal dynamic of the coastal wetland system makes it challenging to map saltmarshes with remotely sensed data. This study selected four important saltmarsh species Pragmitis australis, Sporobolus virginicus, Ficiona nodosa and Schoeloplectus sp. as well as a Mangrove and Pine tree species, Avecinia and Casuarina sp respectively. High Spatial Resolution Worldview-2 data and Coarse Spatial resolution Landsat 8 imagery were selected in this study. Among the selected vegetation types some patches ware fragmented and close to the spatial resolution of Worldview-2 data while and some patch were larger than the 30 meter resolution of Landsat 8 data. This study aims to test the effectiveness of different classifier for the imagery with various spatial and spectral resolutions. Three different classification algorithm, Maximum Likelihood Classifier (MLC), Support Vector Machine (SVM) and Artificial Neural Network (ANN) were tested and compared with their mapping accuracy of the results derived from both satellite imagery. For Worldview-2 data SVM was giving the higher overall accuracy (92.12%, kappa =0.90) followed by ANN (90.82%, Kappa 0.89) and MLC (90.55%, kappa = 0.88). For Landsat 8 data, MLC (82.04%) showed the highest classification accuracy comparing to SVM (77.31%) and ANN (75.23%). The producer accuracy of the classification results were also presented in the paper.
NASA Astrophysics Data System (ADS)
Roychowdhury, K.
2016-06-01
Landcover is the easiest detectable indicator of human interventions on land. Urban and peri-urban areas present a complex combination of landcover, which makes classification challenging. This paper assesses the different methods of classifying landcover using dual polarimetric Sentinel-1 data collected during monsoon (July) and winter (December) months of 2015. Four broad landcover classes such as built up areas, water bodies and wetlands, vegetation and open spaces of Kolkata and its surrounding regions were identified. Polarimetric analyses were conducted on Single Look Complex (SLC) data of the region while ground range detected (GRD) data were used for spectral and spatial classification. Unsupervised classification by means of K-Means clustering used backscatter values and was able to identify homogenous landcovers over the study area. The results produced an overall accuracy of less than 50% for both the seasons. Higher classification accuracy (around 70%) was achieved by adding texture variables as inputs along with the backscatter values. However, the accuracy of classification increased significantly with polarimetric analyses. The overall accuracy was around 80% in Wishart H-A-Alpha unsupervised classification. The method was useful in identifying urban areas due to their double-bounce scattering and vegetated areas, which have more random scattering. Normalized Difference Built-up index (NDBI) and Normalized Difference Vegetation Index (NDVI) obtained from Landsat 8 data over the study area were used to verify vegetation and urban classes. The study compares the accuracies of different methods of classifying landcover using medium resolution SAR data in a complex urban area and suggests that polarimetric analyses present the most accurate results for urban and suburban areas.
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too.
Best Merge Region Growing with Integrated Probabilistic Classification for Hyperspectral Imagery
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new method for spectral-spatial classification of hyperspectral images is proposed. The method is based on the integration of probabilistic classification within the hierarchical best merge region growing algorithm. For this purpose, preliminary probabilistic support vector machines classification is performed. Then, hierarchical step-wise optimization algorithm is applied, by iteratively merging regions with the smallest Dissimilarity Criterion (DC). The main novelty of this method consists in defining a DC between regions as a function of region statistical and geometrical features along with classification probabilities. Experimental results are presented on a 200-band AVIRIS image of the Northwestern Indiana s vegetation area and compared with those obtained by recently proposed spectral-spatial classification techniques. The proposed method improves classification accuracies when compared to other classification approaches.
Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi
2015-01-01
The main purpose of traditional classification algorithms on bioinformatics application is to acquire better classification accuracy. However, these algorithms cannot meet the requirement that minimises the average misclassification cost. In this paper, a new algorithm of cost-sensitive regularised extreme learning machine (CS-RELM) was proposed by using probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy of a group of small sample which higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into CS-RELM algorithm to further reduce the average misclassification cost. By using Colon Tumour dataset and SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, cost-sensitive support vector machine (SVM). The results of experiments show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and made more credible classification decision than others.
Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification
Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander
2013-01-01
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays (”biological” similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
Devos, Olivier; Downey, Gerard; Duponchel, Ludovic
2014-04-01
Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
Engelken, Florian; Wassilew, Georgi I; Köhlitz, Torsten; Brockhaus, Sebastian; Hamm, Bernd; Perka, Carsten; Diederichs, und Gerd
2014-01-01
The purpose of this study was to quantify the performance of the Goutallier classification for assessing fatty degeneration of the gluteus muscles from magnetic resonance (MR) images and to compare its performance to a newly proposed system. Eighty-four hips with clinical signs of gluteal insufficiency and 50 hips from asymptomatic controls were analyzed using a standard classification system (Goutallier) and a new scoring system (Quartile). Interobserver reliability and intraobserver repeatability were determined, and accuracy was assessed by comparing readers' scores with quantitative estimates of the proportion of intramuscular fat based on MR signal intensities (gold standard). The existing Goutallier classification system and the new Quartile system performed equally well in assessing fatty degeneration of the gluteus muscles, both showing excellent levels of interrater and intrarater agreement. While the Goutallier classification system has the advantage of being widely known, the benefit of the Quartile system is that it is based on more clearly defined grades of fatty degeneration. Copyright © 2014 Elsevier Inc. All rights reserved.
Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin
2015-08-01
Improving accuracy of supervised classification algorithms in biomedical applications is one of active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation method was used to justify the performance of classifiers. Experimental results show that our proposed method not only improve the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.
NASA Astrophysics Data System (ADS)
Dementev, A. O.; Dmitriev, E. V.; Kozoderov, V. V.; Egorov, V. D.
2017-10-01
Hyperspectral imaging is up-to-date promising technology widely applied for the accurate thematic mapping. The presence of a large number of narrow survey channels allows us to use subtle differences in spectral characteristics of objects and to make a more detailed classification than in the case of using standard multispectral data. The difficulties encountered in the processing of hyperspectral images are usually associated with the redundancy of spectral information which leads to the problem of the curse of dimensionality. Methods currently used for recognizing objects on multispectral and hyperspectral images are usually based on standard base supervised classification algorithms of various complexity. Accuracy of these algorithms can be significantly different depending on considered classification tasks. In this paper we study the performance of ensemble classification methods for the problem of classification of the forest vegetation. Error correcting output codes and boosting are tested on artificial data and real hyperspectral images. It is demonstrates, that boosting gives more significant improvement when used with simple base classifiers. The accuracy in this case in comparable the error correcting output code (ECOC) classifier with Gaussian kernel SVM base algorithm. However the necessity of boosting ECOC with Gaussian kernel SVM is questionable. It is demonstrated, that selected ensemble classifiers allow us to recognize forest species with high enough accuracy which can be compared with ground-based forest inventory data.
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Yong Wang; Shanta Parajuli; Callie Schweitzer; Glendon Smalley; Dawn Lemke; Wubishet Tadesse; Xiongwen Chen
2010-01-01
Forest cover classifications focus on the overall growth form (physiognomy) of the community, dominant vegetation, and species composition of the existing forest. Accurately classifying the forest cover type is important for forest inventory and silviculture. We compared classification accuracy based on Landsat Enhanced Thematic Mapper Plus (Landsat ETM+) and Satellite...
Iliyasu, Abdullah M; Fatichah, Chastine
2017-12-19
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of traditional fuzzy k -nearest neighbours (Fuzzy k -NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles) that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k -NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regards to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k -NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in number cell features, which is crucial for effective cervical cancer detection and diagnosis.
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P.; Parsey, Ramin V.; Laine, Andrew F.
2013-01-01
Multimodality classification of Alzheimer’s disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%). PMID:24576927
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P; Parsey, Ramin V; Laine, Andrew F
2012-01-01
Multimodality classification of Alzheimer's disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%).
Determining successional stage of temperate coniferous forests with Landsat satellite data
NASA Technical Reports Server (NTRS)
Fiorella, Maria; Ripple, William J.
1993-01-01
Thematic Mapper (TM) digital imagery was used to map forest successional stages and to evaluate spectral differences between old-growth and mature forests in the central Cascade Range of Oregon. Relative sun incidence values were incorporated into the successional stage classification to compensate for topographic induced variation. Relative sun incidence improved the classification accuracy of young successional stages, but did not improve the classification accuracy of older, closed canopy forest classes or overall accuracy. TM bands 1, 2, and 4; the normalized difference vegetation index; and TM 4/3, 4/5, and 4/7 band ratio values for o|d-growth forests were found to be significantly lower than the values of mature forests. The Tasseled Cap features of brightness, greenness, and wetness also had significantly lower old-growth values as compared to mature forest values .
[Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].
Zhou, Jinzhi; Tang, Xiaofang
2015-08-01
In order to improve the accuracy of classification with small amount of motor imagery training data on the development of brain-computer interface (BCD systems, we proposed an analyzing method to automatically select the characteristic parameters based on correlation coefficient analysis. Throughout the five sample data of dataset IV a from 2005 BCI Competition, we utilized short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the number of primitive electroencephalogram dimension, then introduced feature extraction based on common spatial pattern (CSP) and classified by linear discriminant analysis (LDA). Simulation results showed that the average rate of classification accuracy could be improved by using correlation coefficient feature selection method than those without using this algorithm. Comparing with support vector machine (SVM) optimization features algorithm, the correlation coefficient analysis can lead better selection parameters to improve the accuracy of classification.
Variability of wetland reflectance and its effect on automatic categorization of satellite imagery
NASA Technical Reports Server (NTRS)
Klemas, V. (Principal Investigator); Bartlett, D.
1977-01-01
The author has identified the following significant results. Land cover categorization of data from the same overpass in four test wetland areas was carried out using a four category classification system. The tests indicate that training data based on in situ reflectance measurements and atmospheric correction of LANDSAT data can produce comparable accuracy of categorization to that achieved using more than four wetlands cover categories (salt marsh cordgrass, salt hay, unvegetated, and water tidal flat) produced overall classification accuracies of 85% by conventional and relative radiance training and 81% by use of in situ measurements. Overall mapping accuracies were 76% and 72% respectively.
The ERTS-1 investigation (ER-600). Volume 5: ERTS-1 urban land use analysis
NASA Technical Reports Server (NTRS)
Erb, R. B.
1974-01-01
The Urban Land Use Team conducted a year's investigation of ERTS-1 MSS data to determine the number of Land Use categories in the Houston, Texas, area. They discovered unusually low classification accuracies occurred when a spectrally complex urban scene was classified with extensive rural areas containing spectrally homogeneous features. Separate computer processing of only data in the urbanized area increased classification accuracies of certain urban land use categories. Even so, accuracies of urban landscape were in the 40-70 percent range compared to 70-90 percent for the land use categories containing more homogeneous features (agriculture, forest, water, etc.) in the nonurban areas.
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is noticeably practiced in Mediterranean regions and imposes substantial complications in several satellite image classification methods. To some extent, high spatial resolution data were able to overcome such complications. For better classification performances in Land Use Land Cover (LULC) mapping, the current research adopts different classification methods comparison for LULC mapping using Sentinel-2 satellite as a source of high spatial resolution. Both of pixel-based and an object-based classification algorithms were assessed; the pixel-based approach employs Maximum Likelihood (ML), Artificial Neural Network (ANN) algorithms, Support Vector Machine (SVM), and, the object-based classification uses the Nearest Neighbour (NN) classifier. Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall by distinguishing most of the classes with the highest accuracy. NN succeeded to deal with artificial surface classes in general while agriculture area classes, and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.
NASA Technical Reports Server (NTRS)
Rignot, Eric; Williams, Cynthia; Way, Jobea; Viereck, Leslie
1993-01-01
A maximum a posteriori Bayesian classifier for multifrequency polarimetric SAR data is used to perform a supervised classification of forest types in the floodplains of Alaska. The image classes include white spruce, balsam poplar, black spruce, alder, non-forests, and open water. The authors investigate the effect on classification accuracy of changing environmental conditions, and of frequency and polarization of the signal. The highest classification accuracy (86 percent correctly classified forest pixels, and 91 percent overall) is obtained combining L- and C-band frequencies fully polarimetric on a date where the forest is just recovering from flooding. The forest map compares favorably with a vegetation map assembled from digitized aerial photos which took five years for completion, and address the state of the forest in 1978, ignoring subsequent fires, changes in the course of the river, clear-cutting of trees, and tree growth. HV-polarization is the most useful polarization at L- and C-band for classification. C-band VV (ERS-1 mode) and L-band HH (J-ERS-1 mode) alone or combined yield unsatisfactory classification accuracies. Additional data acquired in the winter season during thawed and frozen days yield classification accuracies respectively 20 percent and 30 percent lower due to a greater confusion between conifers and deciduous trees. Data acquired at the peak of flooding in May 1991 also yield classification accuracies 10 percent lower because of dominant trunk-ground interactions which mask out finer differences in radar backscatter between tree species. Combination of several of these dates does not improve classification accuracy. For comparison, panchromatic optical data acquired by SPOT in the summer season of 1991 are used to classify the same area. The classification accuracy (78 percent for the forest types and 90 percent if open water is included) is lower than that obtained with AIRSAR although conifers and deciduous trees are better separated due to the presence of leaves on the deciduous trees. Optical data do not separate black spruce and white spruce as well as SAR data, cannot separate alder from balsam poplar, and are of course limited by the frequent cloud cover in the polar regions. Yet, combining SPOT and AIRSAR offers better chances to identify vegetation types independent of ground truth information using a combination of NDVI indexes from SPOT, biomass numbers from AIRSAR, and a segmentation map from either one.
Pilania, G.; Gubernatis, J. E.; Lookman, T.
2015-12-03
The role of dynamical (or Born effective) charges in classification of octet AB-type binary compounds between four-fold (zincblende/wurtzite crystal structures) and six-fold (rocksalt crystal structure) coordinated systems is discussed. We show that the difference in the dynamical charges of the fourfold and sixfold coordinated structures, in combination with Harrison’s polarity, serves as an excellent feature to classify the coordination of 82 sp–bonded binary octet compounds. We use a support vector machine classifier to estimate the average classification accuracy and the associated variance in our model where a decision boundary is learned in a supervised manner. Lastly, we compare the out-of-samplemore » classification accuracy achieved by our feature pair with those reported previously.« less
NASA Astrophysics Data System (ADS)
Rahmadani, S.; Dongoran, A.; Zarlis, M.; Zakarias
2018-03-01
This paper discusses the problem of feature selection using genetic algorithms on a dataset for classification problems. The classification model used is the decicion tree (DT), and Naive Bayes. In this paper we will discuss how the Naive Bayes and Decision Tree models to overcome the classification problem in the dataset, where the dataset feature is selectively selected using GA. Then both models compared their performance, whether there is an increase in accuracy or not. From the results obtained shows an increase in accuracy if the feature selection using GA. The proposed model is referred to as GADT (GA-Decision Tree) and GANB (GA-Naive Bayes). The data sets tested in this paper are taken from the UCI Machine Learning repository.
Mexican Hat Wavelet Kernel ELM for Multiclass Classification.
Wang, Jie; Song, Yi-Fan; Ma, Tian-Lei
2017-01-01
Kernel extreme learning machine (KELM) is a novel feedforward neural network, which is widely used in classification problems. To some extent, it solves the existing problems of the invalid nodes and the large computational complexity in ELM. However, the traditional KELM classifier usually has a low test accuracy when it faces multiclass classification problems. In order to solve the above problem, a new classifier, Mexican Hat wavelet KELM classifier, is proposed in this paper. The proposed classifier successfully improves the training accuracy and reduces the training time in the multiclass classification problems. Moreover, the validity of the Mexican Hat wavelet as a kernel function of ELM is rigorously proved. Experimental results on different data sets show that the performance of the proposed classifier is significantly superior to the compared classifiers.
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide although a significant proportion of it can be cured if it is detected early. In recent decades, technology called microarray takes an important role in the diagnosis of cancer. By using data mining technique, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. The characteristic of microarray data is small sample but it has huge dimension. Since that, there is a challenge for researcher to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the usage of Principal Component Analysis (PCA) as a dimension reduction method along with Support Vector Method (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation and then evaluation and analysis conducted on term of both accuracy and running time. The result showed that the scheme can obtained 100% accuracy for Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In term of running time, PCA greatly reduced the running time for every data sets.
Guthridge, Joel M.; Bean, Krista M.; Fife, Dustin A.; Chen, Hua; Slight-Webb, Samantha R.; Keith, Michael P.; Harley, John B.; James, Judith A.
2016-01-01
Systemic lupus erythematosus (SLE) is a complex autoimmune disease with a poorly understood preclinical stage of immune dysregulation and symptom accrual. Accumulation of antinuclear autoantibody (ANA) specificities is a hallmark of impending clinical disease. Yet, many ANA-positive individuals remain healthy, suggesting that additional immune dysregulation underlies SLE pathogenesis. Indeed, we have recently demonstrated that interferon (IFN) pathways are dysregulated in preclinical SLE. To determine if other forms of immune dysregulation contribute to preclinical SLE pathogenesis, we measured SLE-associated autoantibodies and soluble mediators in samples from 84 individuals collected prior to SLE classification (average timespan = 5.98 years), compared to unaffected, healthy control samples matched by race, gender, age (± 5 years), and time of sample procurement. We found that multiple soluble mediators, including interleukin (IL)-5, IL-6, and IFN-γ, were significantly elevated in cases compared to controls more than 3.5 years pre-classification, prior to or concurrent with autoantibody positivity. Additional mediators, including innate cytokines, IFN-associated chemokines, and soluble tumor necrosis factor (TNF) superfamily mediators increased longitudinally in cases approaching SLE classification, but not in controls. In particular, levels of B lymphocyte stimulator (BLyS) and a proliferation-inducing ligand (APRIL) were comparable in cases and controls until less than 10 months pre-classification. Over the entire pre-classification period, random forest models incorporating ANA and anti-Ro/SSA positivity with levels of IL-5, IL-6, and the IFN-γ-induced chemokine, MIG, distinguished future SLE patients with 92% (± 1.8%) accuracy, compared to 78% accuracy utilizing ANA positivity alone. These data suggest that immune dysregulation involving multiple pathways contributes to SLE pathogenesis. Importantly, distinct immunological profiles are predictive for individuals who will develop clinical SLE and may be useful for delineating early pathogenesis, discovering therapeutic targets, and designing prevention trials. PMID:27338520
Hotz, Christine S; Templeton, Steven J; Christopher, Mary M
2005-03-01
A rule-based expert system using CLIPS programming language was created to classify body cavity effusions as transudates, modified transudates, exudates, chylous, and hemorrhagic effusions. The diagnostic accuracy of the rule-based system was compared with that produced by 2 machine-learning methods: Rosetta, a rough sets algorithm and RIPPER, a rule-induction method. Results of 508 body cavity fluid analyses (canine, feline, equine) obtained from the University of California-Davis Veterinary Medical Teaching Hospital computerized patient database were used to test CLIPS and to test and train RIPPER and Rosetta. The CLIPS system, using 17 rules, achieved an accuracy of 93.5% compared with pathologist consensus diagnoses. Rosetta accurately classified 91% of effusions by using 5,479 rules. RIPPER achieved the greatest accuracy (95.5%) using only 10 rules. When the original rules of the CLIPS application were replaced with those of RIPPER, the accuracy rates were identical. These results suggest that both rule-based expert systems and machine-learning methods hold promise for the preliminary classification of body fluids in the clinical laboratory.
NASA Astrophysics Data System (ADS)
Liu, Wanjun; Liang, Xuejian; Qu, Haicheng
2017-11-01
Hyperspectral image (HSI) classification is one of the most popular topics in remote sensing community. Traditional and deep learning-based classification methods were proposed constantly in recent years. In order to improve the classification accuracy and robustness, a dimensionality-varied convolutional neural network (DVCNN) was proposed in this paper. DVCNN was a novel deep architecture based on convolutional neural network (CNN). The input of DVCNN was a set of 3D patches selected from HSI which contained spectral-spatial joint information. In the following feature extraction process, each patch was transformed into some different 1D vectors by 3D convolution kernels, which were able to extract features from spectral-spatial data. The rest of DVCNN was about the same as general CNN and processed 2D matrix which was constituted by by all 1D data. So that the DVCNN could not only extract more accurate and rich features than CNN, but also fused spectral-spatial information to improve classification accuracy. Moreover, the robustness of network on water-absorption bands was enhanced in the process of spectral-spatial fusion by 3D convolution, and the calculation was simplified by dimensionality varied convolution. Experiments were performed on both Indian Pines and Pavia University scene datasets, and the results showed that the classification accuracy of DVCNN improved by 32.87% on Indian Pines and 19.63% on Pavia University scene than spectral-only CNN. The maximum accuracy improvement of DVCNN achievement was 13.72% compared with other state-of-the-art HSI classification methods, and the robustness of DVCNN on water-absorption bands noise was demonstrated.
Burlina, Philippe; Billings, Seth; Joshi, Neil
2017-01-01
Objective To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Methods Eighty subjects comprised of 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects were included in this study, where 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and “engineered” features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. Results The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). Conclusions This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification. PMID:28854220
Burlina, Philippe; Billings, Seth; Joshi, Neil; Albayda, Jemima
2017-01-01
To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Eighty subjects comprised of 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects were included in this study, where 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and "engineered" features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification.
NASA Technical Reports Server (NTRS)
Hill, C. L.
1984-01-01
A computer-implemented classification has been derived from Landsat-4 Thematic Mapper data acquired over Baldwin County, Alabama on January 15, 1983. One set of spectral signatures was developed from the data by utilizing a 3x3 pixel sliding window approach. An analysis of the classification produced from this technique identified forested areas. Additional information regarding only the forested areas. Additional information regarding only the forested areas was extracted by employing a pixel-by-pixel signature development program which derived spectral statistics only for pixels within the forested land covers. The spectral statistics from both approaches were integrated and the data classified. This classification was evaluated by comparing the spectral classes produced from the data against corresponding ground verification polygons. This iterative data analysis technique resulted in an overall classification accuracy of 88.4 percent correct for slash pine, young pine, loblolly pine, natural pine, and mixed hardwood-pine. An accuracy assessment matrix has been produced for the classification.
NASA Astrophysics Data System (ADS)
Amit, Guy; Ben-Ari, Rami; Hadad, Omer; Monovich, Einat; Granot, Noa; Hashoul, Sharbell
2017-03-01
Diagnostic interpretation of breast MRI studies requires meticulous work and a high level of expertise. Computerized algorithms can assist radiologists by automatically characterizing the detected lesions. Deep learning approaches have shown promising results in natural image classification, but their applicability to medical imaging is limited by the shortage of large annotated training sets. In this work, we address automatic classification of breast MRI lesions using two different deep learning approaches. We propose a novel image representation for dynamic contrast enhanced (DCE) breast MRI lesions, which combines the morphological and kinetics information in a single multi-channel image. We compare two classification approaches for discriminating between benign and malignant lesions: training a designated convolutional neural network and using a pre-trained deep network to extract features for a shallow classifier. The domain-specific trained network provided higher classification accuracy, compared to the pre-trained model, with an area under the ROC curve of 0.91 versus 0.81, and an accuracy of 0.83 versus 0.71. Similar accuracy was achieved in classifying benign lesions, malignant lesions, and normal tissue images. The trained network was able to improve accuracy by using the multi-channel image representation, and was more robust to reductions in the size of the training set. A small-size convolutional neural network can learn to accurately classify findings in medical images using only a few hundred images from a few dozen patients. With sufficient data augmentation, such a network can be trained to outperform a pre-trained out-of-domain classifier. Developing domain-specific deep-learning models for medical imaging can facilitate technological advancements in computer-aided diagnosis.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
Characterization and classification of lupus patients based on plasma thermograms
Chaires, Jonathan B.; Mekmaysy, Chongkham S.; DeLeeuw, Lynn; Sivils, Kathy L.; Harley, John B.; Rovin, Brad H.; Kulasekera, K. B.; Jarjour, Wael N.
2017-01-01
Objective Plasma thermograms (thermal stability profiles of blood plasma) are being utilized as a new diagnostic approach for clinical assessment. In this study, we investigated the ability of plasma thermograms to classify systemic lupus erythematosus (SLE) patients versus non SLE controls using a sample of 300 SLE and 300 control subjects from the Lupus Family Registry and Repository. Additionally, we evaluated the heterogeneity of thermograms along age, sex, ethnicity, concurrent health conditions and SLE diagnostic criteria. Methods Thermograms were visualized graphically for important differences between covariates and summarized using various measures. A modified linear discriminant analysis was used to segregate SLE versus control subjects on the basis of the thermograms. Classification accuracy was measured based on multiple training/test splits of the data and compared to classification based on SLE serological markers. Results Median sensitivity, specificity, and overall accuracy based on classification using plasma thermograms was 86%, 83%, and 84% compared to 78%, 95%, and 86% based on a combination of five antibody tests. Combining thermogram and serology information together improved sensitivity from 78% to 86% and overall accuracy from 86% to 89% relative to serology alone. Predictive accuracy of thermograms for distinguishing SLE and osteoarthritis / rheumatoid arthritis patients was comparable. Both gender and anemia significantly interacted with disease status for plasma thermograms (p<0.001), with greater separation between SLE and control thermograms for females relative to males and for patients with anemia relative to patients without anemia. Conclusion Plasma thermograms constitute an additional biomarker which may help improve diagnosis of SLE patients, particularly when coupled with standard diagnostic testing. Differences in thermograms according to patient sex, ethnicity, clinical and environmental factors are important considerations for application of thermograms in a clinical setting. PMID:29149219
Fault detection and diagnosis of diesel engine valve trains
NASA Astrophysics Data System (ADS)
Flett, Justin; Bone, Gary M.
2016-05-01
This paper presents the development of a fault detection and diagnosis (FDD) system for use with a diesel internal combustion engine (ICE) valve train. A novel feature is generated for each of the valve closing and combustion impacts. Deformed valve spring faults and abnormal valve clearance faults were seeded on a diesel engine instrumented with one accelerometer. Five classification methods were implemented experimentally and compared. The FDD system using the Naïve-Bayes classification method produced the best overall performance, with a lowest detection accuracy (DA) of 99.95% and a lowest classification accuracy (CA) of 99.95% for the spring faults occurring on individual valves. The lowest DA and CA values for multiple faults occurring simultaneously were 99.95% and 92.45%, respectively. The DA and CA results demonstrate the accuracy of our FDD system for diesel ICE valve train fault scenarios not previously addressed in the literature.
Tuberculosis disease diagnosis using artificial immune recognition system.
Shamshirband, Shahaboddin; Hessam, Somayeh; Javidnia, Hossein; Amiribesheli, Mohsen; Vahdat, Shaghayegh; Petković, Dalibor; Gani, Abdullah; Kiah, Miss Laiha Mat
2014-01-01
There is a high risk of tuberculosis (TB) disease diagnosis among conventional methods. This study is aimed at diagnosing TB using hybrid machine learning approaches. Patient epicrisis reports obtained from the Pasteur Laboratory in the north of Iran were used. All 175 samples have twenty features. The features are classified based on incorporating a fuzzy logic controller and artificial immune recognition system. The features are normalized through a fuzzy rule based on a labeling system. The labeled features are categorized into normal and tuberculosis classes using the Artificial Immune Recognition Algorithm. Overall, the highest classification accuracy reached was for the 0.8 learning rate (α) values. The artificial immune recognition system (AIRS) classification approaches using fuzzy logic also yielded better diagnosis results in terms of detection accuracy compared to other empirical methods. Classification accuracy was 99.14%, sensitivity 87.00%, and specificity 86.12%.
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer assisted malaria parasite characterization and classification using machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker controlled watershed transformation and subsequently total ninety six features describing shape-size and texture of erythrocytes are extracted in respect to the parasitemia infected versus non-infected cells. Ninety four features are found to be statistically significant in discriminating six classes. Here a feature selection-cum-classification scheme has been devised by combining F-statistic, statistical learning techniques i.e., Bayesian learning and support vector machine (SVM) in order to provide the higher classification accuracy using best set of discriminating features. Results show that Bayesian approach provides the highest accuracy i.e., 84% for malaria classification by selecting 19 most significant features while SVM provides highest accuracy i.e., 83.5% with 9 most significant features. Finally, the performance of these two classifiers under feature selection framework has been compared toward malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
Accurate crop classification using hierarchical genetic fuzzy rule-based systems
NASA Astrophysics Data System (ADS)
Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.
2014-10-01
This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications that the competitors. Moreover, the runtime requirements for producing the thematic map was orders of magnitude lower than the respective for the competitors.
[Object-oriented aquatic vegetation extracting approach based on visible vegetation indices.
Jing, Ran; Deng, Lei; Zhao, Wen Ji; Gong, Zhao Ning
2016-05-01
Using the estimation of scale parameters (ESP) image segmentation tool to determine the ideal image segmentation scale, the optimal segmented image was created by the multi-scale segmentation method. Based on the visible vegetation indices derived from mini-UAV imaging data, we chose a set of optimal vegetation indices from a series of visible vegetation indices, and built up a decision tree rule. A membership function was used to automatically classify the study area and an aquatic vegetation map was generated. The results showed the overall accuracy of image classification using the supervised classification was 53.7%, and the overall accuracy of object-oriented image analysis (OBIA) was 91.7%. Compared with pixel-based supervised classification method, the OBIA method improved significantly the image classification result and further increased the accuracy of extracting the aquatic vegetation. The Kappa value of supervised classification was 0.4, and the Kappa value based OBIA was 0.9. The experimental results demonstrated that using visible vegetation indices derived from the mini-UAV data and OBIA method extracting the aquatic vegetation developed in this study was feasible and could be applied in other physically similar areas.
NASA Astrophysics Data System (ADS)
Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.
2011-12-01
Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and Ramsar wetland conservation site, from 30-m Landsat TM scene at the peak of spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes 900-9000 m2); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines) and c) two hierarchical levels of vegetation classification, a generalized 3-class set and more detailed 6-class set. We found that classification benefited from object-based approach which allowed including object shape, texture and context descriptors in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy. The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because individual classes differed in scales at which they were best discriminated from others. Main classification challenges included a) presence of C3 grasses in C4-grass areas, particularly following harvesting of C4 reeds and b) mixtures of emergent, floating and submerged aquatic plants at sub-object and sub-pixel scales. We conclude that OBIA with advanced statistical classifiers offers useful instruments for landscape vegetation analyses, and that spatial scale considerations are critical in mapping PFTs, while multi-scale comparisons can be used to guide class selection. Future work will further apply fuzzy classification and field-collected spectral data for PFT analysis and compare results with MODIS PFT products.
Bhaduri, Aritra; Banerjee, Amitava; Roy, Subhrajit; Kar, Sougata; Basu, Arindam
2018-03-01
We present a neuromorphic current mode implementation of a spiking neural classifier with lumped square law dendritic nonlinearity. It has been shown previously in software simulations that such a system with binary synapses can be trained with structural plasticity algorithms to achieve comparable classification accuracy with fewer synaptic resources than conventional algorithms. We show that even in real analog systems with manufacturing imperfections (CV of 23.5% and 14.4% for dendritic branch gains and leaks respectively), this network is able to produce comparable results with fewer synaptic resources. The chip fabricated in [Formula: see text]m complementary metal oxide semiconductor has eight dendrites per cell and uses two opposing cells per class to cancel common-mode inputs. The chip can operate down to a [Formula: see text] V and dissipates 19 nW of static power per neuronal cell and [Formula: see text] 125 pJ/spike. For two-class classification problems of high-dimensional rate encoded binary patterns, the hardware achieves comparable performance as software implementation of the same with only about a 0.5% reduction in accuracy. On two UCI data sets, the IC integrated circuit has classification accuracy comparable to standard machine learners like support vector machines and extreme learning machines while using two to five times binary synapses. We also show that the system can operate on mean rate encoded spike patterns, as well as short bursts of spikes. To the best of our knowledge, this is the first attempt in hardware to perform classification exploiting dendritic properties and binary synapses.
NASA Astrophysics Data System (ADS)
Park, M.; Stenstrom, M. K.
2004-12-01
Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did not much improve the overall accuracy (72%) and k coefficient (65%). Including Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not useful as locational ancillary data.
High-accuracy user identification using EEG biometrics.
Koike-Akino, Toshiaki; Mahajan, Ruhi; Marks, Tim K; Ye Wang; Watanabe, Shinji; Tuzel, Oncel; Orlik, Philip
2016-08-01
We analyze brain waves acquired through a consumer-grade EEG device to investigate its capabilities for user identification and authentication. First, we show the statistical significance of the P300 component in event-related potential (ERP) data from 14-channel EEGs across 25 subjects. We then apply a variety of machine learning techniques, comparing the user identification performance of various different combinations of a dimensionality reduction technique followed by a classification algorithm. Experimental results show that an identification accuracy of 72% can be achieved using only a single 800 ms ERP epoch. In addition, we demonstrate that the user identification accuracy can be significantly improved to more than 96.7% by joint classification of multiple epochs.
A Spiking Neural Network in sEMG Feature Extraction.
Lobov, Sergey; Mironov, Vasiliy; Kastalskiy, Innokentiy; Kazantsev, Victor
2015-11-03
We have developed a novel algorithm for sEMG feature extraction and classification. It is based on a hybrid network composed of spiking and artificial neurons. The spiking neuron layer with mutual inhibition was assigned as feature extractor. We demonstrate that the classification accuracy of the proposed model could reach high values comparable with existing sEMG interface systems. Moreover, the algorithm sensibility for different sEMG collecting systems characteristics was estimated. Results showed rather equal accuracy, despite a significant sampling rate difference. The proposed algorithm was successfully tested for mobile robot control.
Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A
2017-09-15
Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations p≫n , these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. Code available at http://insilico.utulsa.edu/software/privateEC . brett-mckinney@utulsa.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Landcover Classification Using Deep Fully Convolutional Neural Networks
NASA Astrophysics Data System (ADS)
Wang, J.; Li, X.; Zhou, S.; Tang, J.
2017-12-01
Land cover classification has always been an essential application in remote sensing. Certain image features are needed for land cover classification whether it is based on pixel or object-based methods. Different from other machine learning methods, deep learning model not only extracts useful information from multiple bands/attributes, but also learns spatial characteristics. In recent years, deep learning methods have been developed rapidly and widely applied in image recognition, semantic understanding, and other application domains. However, there are limited studies applying deep learning methods in land cover classification. In this research, we used fully convolutional networks (FCN) as the deep learning model to classify land covers. The National Land Cover Database (NLCD) within the state of Kansas was used as training dataset and Landsat images were classified using the trained FCN model. We also applied an image segmentation method to improve the original results from the FCN model. In addition, the pros and cons between deep learning and several machine learning methods were compared and explored. Our research indicates: (1) FCN is an effective classification model with an overall accuracy of 75%; (2) image segmentation improves the classification results with better match of spatial patterns; (3) FCN has an excellent ability of learning which can attains higher accuracy and better spatial patterns compared with several machine learning methods.
Optimal two-phase sampling design for comparing accuracies of two binary classification rules.
Xu, Huiping; Hui, Siu L; Grannis, Shaun
2014-02-10
In this paper, we consider the design for comparing the performance of two binary classification rules, for example, two record linkage algorithms or two screening tests. Statistical methods are well developed for comparing these accuracy measures when the gold standard is available for every unit in the sample, or in a two-phase study when the gold standard is ascertained only in the second phase in a subsample using a fixed sampling scheme. However, these methods do not attempt to optimize the sampling scheme to minimize the variance of the estimators of interest. In comparing the performance of two classification rules, the parameters of primary interest are the difference in sensitivities, specificities, and positive predictive values. We derived the analytic variance formulas for these parameter estimates and used them to obtain the optimal sampling design. The efficiency of the optimal sampling design is evaluated through an empirical investigation that compares the optimal sampling with simple random sampling and with proportional allocation. Results of the empirical study show that the optimal sampling design is similar for estimating the difference in sensitivities and in specificities, and both achieve a substantial amount of variance reduction with an over-sample of subjects with discordant results and under-sample of subjects with concordant results. A heuristic rule is recommended when there is no prior knowledge of individual sensitivities and specificities, or the prevalence of the true positive findings in the study population. The optimal sampling is applied to a real-world example in record linkage to evaluate the difference in classification accuracy of two matching algorithms. Copyright © 2013 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang
2015-01-01
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
AVNM: A Voting based Novel Mathematical Rule for Image Classification.
Vidyarthi, Ankit; Mittal, Namita
2016-12-01
In machine learning, the accuracy of the system depends upon classification result. Classification accuracy plays an imperative role in various domains. Non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Besides its easiness, simplicity and effectiveness characteristics, the main problem associated with KNN classifier is the selection of a number of nearest neighbors i.e. "k" for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by the prescribed problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for unidentified sample. AVNM is free from any initial selection of predefined variable and neighbor selection as found in KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from UCI database and one manually created dataset. The experimental result shows that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimentation results based on confusion matrix accuracy parameter proves higher accuracy value with AVNM rule. The proposed AVNM rule is based on sample space reduction mechanism for identification of an optimal number of nearest neighbor selections. AVNM results in better classification accuracy and minimum error rate as compared with the state-of-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbor selection and improves classification rate for UCI dataset and manually created dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Adi Putra, Januar
2018-04-01
In this paper, we propose a new mammogram classification scheme to classify the breast tissues as normal or abnormal. Feature matrix is generated using Local Binary Pattern to all the detailed coefficients from 2D-DWT of the region of interest (ROI) of a mammogram. Feature selection is done by selecting the relevant features that affect the classification. Feature selection is used to reduce the dimensionality of data and features that are not relevant, in this paper the F-test and Ttest will be performed to the results of the feature extraction dataset to reduce and select the relevant feature. The best features are used in a Neural Network classifier for classification. In this research we use MIAS and DDSM database. In addition to the suggested scheme, the competent schemes are also simulated for comparative analysis. It is observed that the proposed scheme has a better say with respect to accuracy, specificity and sensitivity. Based on experiments, the performance of the proposed scheme can produce high accuracy that is 92.71%, while the lowest accuracy obtained is 77.08%.
Multiclass cancer diagnosis using tumor gene expression signatures
Ramaswamy, S.; Tamayo, P.; Rifkin, R.; ...
2001-12-11
The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances, this task is difficult or impossible because of atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multiclass classifier based on a supportmore » vector machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared with their well differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multiclass molecular cancer classification and suggest a strategy for future clinical implementation of molecular cancer diagnostics.« less
Characterization and delineation of caribou habitat on Unimak Island using remote sensing techniques
NASA Astrophysics Data System (ADS)
Atkinson, Brain M.
The assessment of herbivore habitat quality is traditionally based on quantifying the forages available to the animal across their home range through ground-based techniques. While these methods are highly accurate, they can be time-consuming and highly expensive, especially for herbivores that occupy vast spatial landscapes. The Unimak Island caribou herd has been decreasing in the last decade at rates that have prompted discussion of management intervention. Frequent inclement weather in this region of Alaska has provided for little opportunity to study the caribou forage habitat on Unimak Island. The overall objectives of this study were two-fold 1) to assess the feasibility of using high-resolution color and near-infrared aerial imagery to map the forage distribution of caribou habitat on Unimak Island and 2) to assess the use of a new high-resolution multispectral satellite imagery platform, RapidEye, and use of the "red-edge" spectral band on vegetation classification accuracy. Maximum likelihood classification algorithms were used to create land cover maps in aerial and satellite imagery. Accuracy assessments and transformed divergence values were produced to assess vegetative spectral information and classification accuracy. By using RapidEye and aerial digital imagery in a hierarchical supervised classification technique, we were able to produce a high resolution land cover map of Unimak Island. We obtained overall accuracy rates of 71.4 percent which are comparable to other land cover maps using RapidEye imagery. The "red-edge" spectral band included in the RapidEye imagery provides additional spectral information that allows for a more accurate overall classification, raising overall accuracy 5.2 percent.
Spectral-Spatial Classification of Hyperspectral Images Using Hierarchical Optimization
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new spectral-spatial method for hyperspectral data classification is proposed. For a given hyperspectral image, probabilistic pixelwise classification is first applied. Then, hierarchical step-wise optimization algorithm is performed, by iteratively merging neighboring regions with the smallest Dissimilarity Criterion (DC) and recomputing class labels for new regions. The DC is computed by comparing region mean vectors, class labels and a number of pixels in the two regions under consideration. The algorithm is converged when all the pixels get involved in the region merging procedure. Experimental results are presented on two remote sensing hyperspectral images acquired by the AVIRIS and ROSIS sensors. The proposed approach improves classification accuracies and provides maps with more homogeneous regions, when compared to previously proposed classification techniques.
Activity classification using the GENEA: optimum sampling frequency and number of axes.
Zhang, Shaoyan; Murray, Peter; Zillmer, Ruediger; Eston, Roger G; Catt, Michael; Rowlands, Alex V
2012-11-01
The GENEA shows high accuracy for classification of sedentary, household, walking, and running activities when sampling at 80 Hz on three axes. It is not known whether it is possible to decrease this sampling frequency and/or the number of axes without detriment to classification accuracy. The purpose of this study was to compare the classification rate of activities on the basis of data from a single axis, two axes, and three axes, with sampling rates ranging from 5 to 80 Hz. Sixty participants (age, 49.4 yr (6.5 yr); BMI, 24.6 kg·m (3.4 kg·m)) completed 10-12 semistructured activities in the laboratory and outdoor environment while wearing a GENEA accelerometer on the right wrist. We analyzed data from single axis, dual axes, and three axes at sampling rates of 5, 10, 20, 40, and 80 Hz. Mathematical models based on features extracted from mean, SD, fast Fourier transform, and wavelet decomposition were built, which combined one of the numbers of axes with one of the sampling rates to classify activities into sedentary, household, walking, and running. Classification accuracy was high irrespective of the number of axes for data collected at 80 Hz (96.93% ± 0.97%), 40 Hz (97.4% ± 0.73%), 20 Hz (96.86% ± 1.12%), and 10 Hz (97.01% ± 1.01%) but dropped for data collected at 5 Hz (94.98% ± 1.36%). Sampling frequencies >10 Hz and/or more than one axis of measurement were not associated with greater classification accuracy. Lower sampling rates and measurement of a single axis would result in a lower data load, longer battery life, and higher efficiency of data processing. Further research should investigate whether a lower sampling rate and a single axis affects classification accuracy when considering a wider range of activities.
Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.
2013-01-01
Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of over fitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.
ERIC Educational Resources Information Center
Madison, Matthew J.; Bradshaw, Laine P.
2015-01-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other…
An alternative respiratory sounds classification system utilizing artificial neural networks.
Oweis, Rami J; Abdulhay, Enas W; Khayal, Amer; Awad, Areen
2015-01-01
Computerized lung sound analysis involves recording lung sound via an electronic device, followed by computer analysis and classification based on specific signal characteristics as non-linearity and nonstationarity caused by air turbulence. An automatic analysis is necessary to avoid dependence on expert skills. This work revolves around exploiting autocorrelation in the feature extraction stage. All process stages were implemented in MATLAB. The classification process was performed comparatively using both artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFIS) toolboxes. The methods have been applied to 10 different respiratory sounds for classification. The ANN was superior to the ANFIS system and returned superior performance parameters. Its accuracy, specificity, and sensitivity were 98.6%, 100%, and 97.8%, respectively. The obtained parameters showed superiority to many recent approaches. The promising proposed method is an efficient fast tool for the intended purpose as manifested in the performance parameters, specifically, accuracy, specificity, and sensitivity. Furthermore, it may be added that utilizing the autocorrelation function in the feature extraction in such applications results in enhanced performance and avoids undesired computation complexities compared to other techniques.
Detailed analysis of CAMS procedures for phase 3 using ground truth inventories
NASA Technical Reports Server (NTRS)
Carnes, J. G.
1979-01-01
The results of a study of Procedure 1 as used during LACIE Phase 3 are presented. The study was performed by comparing the Procedure 1 classification results with digitized ground-truth inventories. The proportion estimation accuracy, dot labeling accuracy, and clustering effectiveness are discussed.
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing
Wen, Tailai; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-01
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors’ responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose’s classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods. PMID:29382146
Feature Extraction of Electronic Nose Signals Using QPSO-Based Multiple KFDA Signal Processing.
Wen, Tailai; Yan, Jia; Huang, Daoyu; Lu, Kun; Deng, Changjian; Zeng, Tanyue; Yu, Song; He, Zhiyi
2018-01-29
The aim of this research was to enhance the classification accuracy of an electronic nose (E-nose) in different detecting applications. During the learning process of the E-nose to predict the types of different odors, the prediction accuracy was not quite satisfying because the raw features extracted from sensors' responses were regarded as the input of a classifier without any feature extraction processing. Therefore, in order to obtain more useful information and improve the E-nose's classification accuracy, in this paper, a Weighted Kernels Fisher Discriminant Analysis (WKFDA) combined with Quantum-behaved Particle Swarm Optimization (QPSO), i.e., QWKFDA, was presented to reprocess the original feature matrix. In addition, we have also compared the proposed method with quite a few previously existing ones including Principal Component Analysis (PCA), Locality Preserving Projections (LPP), Fisher Discriminant Analysis (FDA) and Kernels Fisher Discriminant Analysis (KFDA). Experimental results proved that QWKFDA is an effective feature extraction method for E-nose in predicting the types of wound infection and inflammable gases, which shared much higher classification accuracy than those of the contrast methods.
Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes
Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.
2012-01-01
Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
Crabtree, Nathaniel M; Moore, Jason H; Bowyer, John F; George, Nysia I
2017-01-01
A computational evolution system (CES) is a knowledge discovery engine that can identify subtle, synergistic relationships in large datasets. Pareto optimization allows CESs to balance accuracy with model complexity when evolving classifiers. Using Pareto optimization, a CES is able to identify a very small number of features while maintaining high classification accuracy. A CES can be designed for various types of data, and the user can exploit expert knowledge about the classification problem in order to improve discrimination between classes. These characteristics give CES an advantage over other classification and feature selection algorithms, particularly when the goal is to identify a small number of highly relevant, non-redundant biomarkers. Previously, CESs have been developed only for binary class datasets. In this study, we developed a multi-class CES. The multi-class CES was compared to three common feature selection and classification algorithms: support vector machine (SVM), random k-nearest neighbor (RKNN), and random forest (RF). The algorithms were evaluated on three distinct multi-class RNA sequencing datasets. The comparison criteria were run-time, classification accuracy, number of selected features, and stability of selected feature set (as measured by the Tanimoto distance). The performance of each algorithm was data-dependent. CES performed best on the dataset with the smallest sample size, indicating that CES has a unique advantage since the accuracy of most classification methods suffer when sample size is small. The multi-class extension of CES increases the appeal of its application to complex, multi-class datasets in order to identify important biomarkers and features.
Armutlu, Pelin; Ozdemir, Muhittin E; Uney-Yuksektepe, Fadime; Kavakli, I Halil; Turkay, Metin
2008-10-03
A priori analysis of the activity of drugs on the target protein by computational approaches can be useful in narrowing down drug candidates for further experimental tests. Currently, there are a large number of computational methods that predict the activity of drugs on proteins. In this study, we approach the activity prediction problem as a classification problem and, we aim to improve the classification accuracy by introducing an algorithm that combines partial least squares regression with mixed-integer programming based hyper-boxes classification method, where drug molecules are classified as low active or high active regarding their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules. We first apply our approach by analyzing the activities of widely known inhibitor datasets including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR), Cyclooxygenase-2 (COX-2) with known IC50 values. The results at this stage proved that our approach consistently gives better classification accuracies compared to 63 other reported classification methods such as SVM, Naïve Bayes, where we were able to predict the experimentally determined IC50 values with a worst case accuracy of 96%. To further test applicability of this approach we first created dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy. Our results indicate that this approach can be utilized to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance drug discovery process, but also save time and resources committed.
Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy
2014-01-01
Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered as the most common and effective methods in classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is benefitting from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model was benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive to the existing state-of-the-art classification models. PMID:25419659
NASA Technical Reports Server (NTRS)
Justice, C.; Townshend, J. (Principal Investigator)
1981-01-01
Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. By on procedure the entire area is clustered and by the other a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation is provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data and the monocluster block approach gave higher accuracies than clustering the entire area. The moncluster block approach was additionally the most economical in terms of computing time.
Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R
2007-01-01
Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) 1H, projections of 2D 1H, 1H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. PMID:17605789
NASA Technical Reports Server (NTRS)
Craig, R. G. (Principal Investigator)
1983-01-01
Richmond, Virginia and Denver, Colorado were study sites in an effort to determine the effect of autocorrelation on the accuracy of a parallelopiped classifier of LANDSAT digital data. The autocorrelation was assumed to decay to insignificant levels when sampled at distances of at least ten pixels. Spectral themes developed using blocks of adjacent pixels, and using groups of pixels spaced at least 10 pixels apart were used. Effects of geometric distortions were minimized by using only pixels from the interiors of land cover sections. Accuracy was evaluated for three classes; agriculture, residential and "all other"; both type 1 and type 2 errors were evaluated by means of overall classification accuracy. All classes give comparable results. Accuracy is approximately the same in both techniques; however, the variance in accuracy is significantly higher using the themes developed from autocorrelated data. The vectors of mean spectral response were nearly identical regardless of sampling method used. The estimated variances were much larger when using autocorrelated pixels.
Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso
2015-07-01
In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification is performed task with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach the accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches the accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.
2012-01-01
The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimensions with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive in terms of both classification accuracy and computational costs/times with classical LDA. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. From a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, consisting of uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principle component analysis (PCA), and classical LDA. Based on a 7-dimension time domain and time-scale feature vectors, these methods achieved respectively 95.2% and 93.2% classification accuracy by using a linear discriminant classifier.
NASA Astrophysics Data System (ADS)
Dondurur, Mehmet
The primary objective of this study was to determine the degree to which modern SAR systems can be used to obtain information about the Earth's vegetative resources. Information obtainable from microwave synthetic aperture radar (SAR) data was compared with that obtainable from LANDSAT-TM and SPOT data. Three hypotheses were tested: (a) Classification of land cover/use from SAR data can be accomplished on a pixel-by-pixel basis with the same overall accuracy as from LANDSAT-TM and SPOT data. (b) Classification accuracy for individual land cover/use classes will differ between sensors. (c) Combining information derived from optical and SAR data into an integrated monitoring system will improve overall and individual land cover/use class accuracies. The study was conducted with three data sets for the Sleeping Bear Dunes test site in the northwestern part of Michigan's lower peninsula, including an October 1982 LANDSAT-TM scene, a June 1989 SPOT scene and C-, L- and P-Band radar data from the Jet Propulsion Laboratory AIRSAR. Reference data were derived from the Michigan Resource Information System (MIRIS) and available color infrared aerial photos. Classification and rectification of data sets were done using ERDAS Image Processing Programs. Classification algorithms included Maximum Likelihood, Mahalanobis Distance, Minimum Spectral Distance, ISODATA, Parallelepiped, and Sequential Cluster Analysis. Classified images were rectified as necessary so that all were at the same scale and oriented north-up. Results were analyzed with contingency tables and percent correctly classified (PCC) and Cohen's Kappa (CK) as accuracy indices using CSLANT and ImagePro programs developed for this study. Accuracy analyses were based upon a 1.4 by 6.5 km area with its long axis east-west. Reference data for this subscene total 55,770 15 by 15 m pixels with sixteen cover types, including seven level III forest classes, three level III urban classes, two level II range classes, two water classes, one wetland class and one agriculture class. An initial analysis was made without correcting the 1978 MIRIS reference data to the different dates of the TM, SPOT and SAR data sets. In this analysis, highest overall classification accuracy (PCC) was 87% with the TM data set, with both SPOT and C-Band SAR at 85%, a difference statistically significant at the 0.05 level. When the reference data were corrected for land cover change between 1978 and 1991, classification accuracy with the C-Band SAR data increased to 87%. Classification accuracy differed from sensor to sensor for individual land cover classes, Combining sensors into hypothetical multi-sensor systems resulted in higher accuracies than for any single sensor. Combining LANDSAT -TM and C-Band SAR yielded an overall classification accuracy (PCC) of 92%. The results of this study indicate that C-Band SAR data provide an acceptable substitute for LANDSAT-TM or SPOT data when land cover information is desired of areas where cloud cover obscures the terrain. Even better results can be obtained by integrating TM and C-Band SAR data into a multi-sensor system.
Research on Remote Sensing Geological Information Extraction Based on Object Oriented Classification
NASA Astrophysics Data System (ADS)
Gao, Hui
2018-04-01
The northern Tibet belongs to the Sub cold arid climate zone in the plateau. It is rarely visited by people. The geological working conditions are very poor. However, the stratum exposures are good and human interference is very small. Therefore, the research on the automatic classification and extraction of remote sensing geological information has typical significance and good application prospect. Based on the object-oriented classification in Northern Tibet, using the Worldview2 high-resolution remote sensing data, combined with the tectonic information and image enhancement, the lithological spectral features, shape features, spatial locations and topological relations of various geological information are excavated. By setting the threshold, based on the hierarchical classification, eight kinds of geological information were classified and extracted. Compared with the existing geological maps, the accuracy analysis shows that the overall accuracy reached 87.8561 %, indicating that the classification-oriented method is effective and feasible for this study area and provides a new idea for the automatic extraction of remote sensing geological information.
Zhang, Jianhua; Li, Sunan; Wang, Rubin
2017-01-01
In this paper, we deal with the Mental Workload (MWL) classification problem based on the measured physiological data. First we discussed the optimal depth (i.e., the number of hidden layers) and parameter optimization algorithms for the Convolutional Neural Networks (CNN). The base CNNs designed were tested according to five classification performance indices, namely Accuracy, Precision, F-measure, G-mean, and required training time. Then we developed an Ensemble Convolutional Neural Network (ECNN) to enhance the accuracy and robustness of the individual CNN model. For the ECNN design, three model aggregation approaches (weighted averaging, majority voting and stacking) were examined and a resampling strategy was used to enhance the diversity of individual CNN models. The results of MWL classification performance comparison indicated that the proposed ECNN framework can effectively improve MWL classification performance and is featured by entirely automatic feature extraction and MWL classification, when compared with traditional machine learning methods.
Incorporating spatial context into statistical classification of multidimensional image data
NASA Technical Reports Server (NTRS)
Bauer, M. E. (Principal Investigator); Tilton, J. C.; Swain, P. H.
1981-01-01
Compound decision theory is employed to develop a general statistical model for classifying image data using spatial context. The classification algorithm developed from this model exploits the tendency of certain ground-cover classes to occur more frequently in some spatial contexts than in others. A key input to this contextural classifier is a quantitative characterization of this tendency: the context function. Several methods for estimating the context function are explored, and two complementary methods are recommended. The contextural classifier is shown to produce substantial improvements in classification accuracy compared to the accuracy produced by a non-contextural uniform-priors maximum likelihood classifier when these methods of estimating the context function are used. An approximate algorithm, which cuts computational requirements by over one-half, is presented. The search for an optimal implementation is furthered by an exploration of the relative merits of using spectral classes or information classes for classification and/or context function estimation.
Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?
NASA Astrophysics Data System (ADS)
Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof
2016-10-01
It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempts to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability definition of segmentation evaluation metrics as indicators of land cover classification accuracy was based on the linear correlation between the two. All unsupervised metrics that are not based on number of segments have a very strong correlation with all classification measures and are therefore reliable as indicators of land cover classification accuracy. On the other hand, correlation at supervised metrics is dependent on so many factors that it cannot be trusted as a reliable classification quality indicator. Algorithms for land cover classification studied in this paper are widely used; therefore, presented results are applicable to a wider area.
Zhang, Heng; Pan, Zhongming; Zhang, Wenna
2018-06-07
An acoustic⁻seismic mixed feature extraction method based on the wavelet coefficient energy ratio (WCER) of the target signal is proposed in this study for classifying vehicle targets in wireless sensor networks. The signal was decomposed into a set of wavelet coefficients using the à trous algorithm, which is a concise method used to implement the wavelet transform of a discrete signal sequence. After the wavelet coefficients of the target acoustic and seismic signals were obtained, the energy ratio of each layer coefficient was calculated as the feature vector of the target signals. Subsequently, the acoustic and seismic features were merged into an acoustic⁻seismic mixed feature to improve the target classification accuracy after the acoustic and seismic WCER features of the target signal were simplified using the hierarchical clustering method. We selected the support vector machine method for classification and utilized the data acquired from a real-world experiment to validate the proposed method. The calculated results show that the WCER feature extraction method can effectively extract the target features from target signals. Feature simplification can reduce the time consumption of feature extraction and classification, with no effect on the target classification accuracy. The use of acoustic⁻seismic mixed features effectively improved target classification accuracy by approximately 12% compared with either acoustic signal or seismic signal alone.
Coniferous forest classification and inventory using Landsat and digital terrain data
NASA Technical Reports Server (NTRS)
Franklin, J.; Logan, T. L.; Woodcock, C. E.; Strahler, A. H.
1986-01-01
Machine-processing techniques were used in a Forest Classification and Inventory System (FOCIS) procedure to extract and process tonal, textural, and terrain information from registered Landsat multispectral and digital terrain data. Using FOCIS as a basis for stratified sampling, the softwood timber volumes of the Klamath National Forest and Eldorado National Forest were estimated within standard errors of 4.8 and 4.0 percent, respectively. The accuracy of these large-area inventories is comparable to the accuracy yielded by use of conventional timber inventory methods, but, because of automation, the FOCIS inventories are more rapid (9-12 months compared to 2-3 years for conventional manual photointerpretation, map compilation and drafting, field sampling, and data processing) and are less costly.
D Land Cover Classification Based on Multispectral LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
Multispectral Lidar System can emit simultaneous laser pulses at the different wavelengths. The reflected multispectral energy is captured through a receiver of the sensor, and the return signal together with the position and orientation information of sensor is recorded. These recorded data are solved with GNSS/IMU data for further post-processing, forming high density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, Optech Titan system is capable of collecting point clouds data from all three channels at 532nm visible (Green), at 1064 nm near infrared (NIR) and at 1550nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach to only use multispectral Lidar point clouds datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories by using the customized features of classification indexes and a combination the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted via comparing random reference samples points from google imagery tiles with the classification results. The classification results show higher overall accuracy for most of the land cover types. Over 90% of overall accuracy is achieved via using multispectral Lidar point clouds for 3D land cover classification.
NASA Technical Reports Server (NTRS)
Chang, C. Y.
1974-01-01
The author has identified the following significant results. The Skylab S192 data was evaluated by: (1) comparing the classification results using S192 and ERTS-1 data over the Holt County, Nebraska agricultural study area, and (2) investigating the impact of signal-to-noise ratio on classification accuracies using registered S192 and ERTS-1 data. Results indicate: (1) The classification accuracy obtained on S192 data using its best subset of four bands can be expected to be as high as that on ERTS-1 data. (2) When a subset of four S192 bands that are spectrally similar to the ERTS-1 bands was used for classification, an obvious deterioration in the classification accuracy was observed with respect to the ERTS-1 results. (3) The thermal bands 13 and 14 as well as the near IR bands were found to be relatively important in the classification of agricultural data. Although bands 11 and 12 were highly correlated, both were invariably included in the best subsets of the band sizes, four and beyond, according to the divergence criterion. (4) The differentiation of corn from popcorn was difficult on both S192 and ERTS-1 data acquired at an early summer date. (5) The results on both sets of data indicate that it was relatively easy to differentiate grass from any other class.
Ozcift, Akin
2012-08-01
Parkinson disease (PD) is an age-related deterioration of certain nerve systems, which affects movement, balance, and muscle control of clients. PD is one of the common diseases which affect 1% of people older than 60 years. A new classification scheme based on support vector machine (SVM) selected features to train rotation forest (RF) ensemble classifiers is presented for improving diagnosis of PD. The dataset contains records of voice measurements from 31 people, 23 with PD and each record in the dataset is defined with 22 features. The diagnosis model first makes use of a linear SVM to select ten most relevant features from 22. As a second step of the classification model, six different classifiers are trained with the subset of features. Subsequently, at the third step, the accuracies of classifiers are improved by the utilization of RF ensemble classification strategy. The results of the experiments are evaluated using three metrics; classification accuracy (ACC), Kappa Error (KE) and Area under the Receiver Operating Characteristic (ROC) Curve (AUC). Performance measures of two base classifiers, i.e. KStar and IBk, demonstrated an apparent increase in PD diagnosis accuracy compared to similar studies in literature. After all, application of RF ensemble classification scheme improved PD diagnosis in 5 of 6 classifiers significantly. We, numerically, obtained about 97% accuracy in RF ensemble of IBk (a K-Nearest Neighbor variant) algorithm, which is a quite high performance for Parkinson disease diagnosis.
Mapping Mangrove Density from Rapideye Data in Central America
NASA Astrophysics Data System (ADS)
Son, Nguyen-Thanh; Chen, Chi-Farn; Chen, Cheng-Ru
2017-06-01
Mangrove forests provide a wide range of socioeconomic and ecological services for coastal communities. Extensive aquaculture development of mangrove waters in many developing countries has constantly ignored services of mangrove ecosystems, leading to unintended environmental consequences. Monitoring the current status and distribution of mangrove forests is deemed important for evaluating forest management strategies. This study aims to delineate the density distribution of mangrove forests in the Gulf of Fonseca, Central America with Rapideye data using the support vector machines (SVM). The data collected in 2012 for density classification of mangrove forests were processed based on four different band combination schemes: scheme-1 (bands 1-3, 5 excluding the red-edge band 4), scheme-2 (bands 1-5), scheme-3 (bands 1-3, 5 incorporating with the normalized difference vegetation index, NDVI), and scheme-4 (bands 1-3, 5 incorporating with the normalized difference red-edge index, NDRI). We also hypothesized if the obvious contribution of Rapideye red-edge band could improve the classification results. Three main steps of data processing were employed: (1), data pre-processing, (2) image classification, and (3) accuracy assessment to evaluate the contribution of red-edge band in terms of the accuracy of classification results across these four schemes. The classification maps compared with the ground reference data indicated the slightly higher accuracy level observed for schemes 2 and 4. The overall accuracies and Kappa coefficients were 97% and 0.95 for scheme-2 and 96.9% and 0.95 for scheme-4, respectively.
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E.; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J.
2016-01-01
Purpose Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. Methods We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. Results In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%–74%) and 87% (95% CI, 86%–88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. Conclusions This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. Translational Relevance The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis. PMID:27668130
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images.
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J
2016-09-01
Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%-74%) and 87% (95% CI, 86%-88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91-0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis.
NASA Astrophysics Data System (ADS)
Khan, Asif; Ryoo, Chang-Kyung; Kim, Heung Soo
2017-04-01
This paper presents a comparative study of different classification algorithms for the classification of various types of inter-ply delaminations in smart composite laminates. Improved layerwise theory is used to model delamination at different interfaces along the thickness and longitudinal directions of the smart composite laminate. The input-output data obtained through surface bonded piezoelectric sensor and actuator is analyzed by the system identification algorithm to get the system parameters. The identified parameters for the healthy and delaminated structure are supplied as input data to the classification algorithms. The classification algorithms considered in this study are ZeroR, Classification via regression, Naïve Bayes, Multilayer Perceptron, Sequential Minimal Optimization, Multiclass-Classifier, and Decision tree (J48). The open source software of Waikato Environment for Knowledge Analysis (WEKA) is used to evaluate the classification performance of the classifiers mentioned above via 75-25 holdout and leave-one-sample-out cross-validation regarding classification accuracy, precision, recall, kappa statistic and ROC Area.
Bashir, Saba; Qamar, Usman; Khan, Farhan Hassan
2016-02-01
Accuracy plays a vital role in the medical field as it concerns with the life of an individual. Extensive research has been conducted on disease classification and prediction using machine learning techniques. However, there is no agreement on which classifier produces the best results. A specific classifier may be better than others for a specific dataset, but another classifier could perform better for some other dataset. Ensemble of classifiers has been proved to be an effective way to improve classification accuracy. In this research we present an ensemble framework with multi-layer classification using enhanced bagging and optimized weighting. The proposed model called "HM-BagMoov" overcomes the limitations of conventional performance bottlenecks by utilizing an ensemble of seven heterogeneous classifiers. The framework is evaluated on five different heart disease datasets, four breast cancer datasets, two diabetes datasets, two liver disease datasets and one hepatitis dataset obtained from public repositories. The analysis of the results show that ensemble framework achieved the highest accuracy, sensitivity and F-Measure when compared with individual classifiers for all the diseases. In addition to this, the ensemble framework also achieved the highest accuracy when compared with the state of the art techniques. An application named "IntelliHealth" is also developed based on proposed model that may be used by hospitals/doctors for diagnostic advice. Copyright © 2015 Elsevier Inc. All rights reserved.
An unsupervised classification technique for multispectral remote sensing data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Cummings, R. E.
1973-01-01
Description of a two-part clustering technique consisting of (a) a sequential statistical clustering, which is essentially a sequential variance analysis, and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum-likelihood classification techniques.
Unsupervised classification of earth resources data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Jayroe, R. R., Jr.; Cummings, R. E.
1972-01-01
A new clustering technique is presented. It consists of two parts: (a) a sequential statistical clustering which is essentially a sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by existing supervised maximum liklihood classification technique.
Darmawan, M F; Yusuf, Suhaila M; Kadir, M R Abdul; Haron, H
2015-02-01
Sex estimation is used in forensic anthropology to assist the identification of individual remains. However, the estimation techniques tend to be unique and applicable only to a certain population. This paper analyzed sex estimation on living individual child below 19 years old using the length of 19 bones of left hand applied for three classification techniques, which were Discriminant Function Analysis (DFA), Support Vector Machine (SVM) and Artificial Neural Network (ANN) multilayer perceptron. These techniques were carried out on X-ray images of the left hand taken from an Asian population data set. All the 19 bones of the left hand were measured using Free Image software, and all the techniques were performed using MATLAB. The group of age "16-19" years old and "7-9" years old were the groups that could be used for sex estimation with as their average of accuracy percentage was above 80%. ANN model was the best classification technique with the highest average of accuracy percentage in the two groups of age compared to other classification techniques. The results show that each classification technique has the best accuracy percentage on each different group of age. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio
2016-01-01
Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
Cascade Classification with Adaptive Feature Extraction for Arrhythmia Detection.
Park, Juyoung; Kang, Mingon; Gao, Jean; Kim, Younghoon; Kang, Kyungtae
2017-01-01
Detecting arrhythmia from ECG data is now feasible on mobile devices, but in this environment it is necessary to trade computational efficiency against accuracy. We propose an adaptive strategy for feature extraction that only considers normalized beat morphology features when running in a resource-constrained environment; but in a high-performance environment it takes account of a wider range of ECG features. This process is augmented by a cascaded random forest classifier. Experiments on data from the MIT-BIH Arrhythmia Database showed classification accuracies from 96.59% to 98.51%, which are comparable to state-of-the art methods.
Estimating local scaling properties for the classification of interstitial lung disease patterns
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel
2011-03-01
Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
Lee, Jong Hoon; Jang, Hong Seok; Kim, Jun-Gi; Lee, Myung Ah; Kim, Dae Yong; Kim, Tae Hyun; Oh, Jae Hwan; Park, Sung Chan; Kim, Sun Young; Baek, Ji Yeon; Park, Hee Chul; Kim, Hee Cheol; Nam, Taek-Keun; Chie, Eui Kyu; Jung, Ji-Han; Oh, Seong Taek
2014-10-01
The reported overall accuracy of MRI in predicting the pathologic stage of nonirradiated rectal cancer is high. However, the role of MRI in restaging rectal tumors after neoadjuvant CRT is contentious. Thus, we evaluate the accuracy of restaging magnetic resonance imaging (MRI) for rectal cancer patients who receive preoperative chemoradiotherapy (CRT). We analyzed 150 patients with locally advanced rectal cancer (T3-4N0-2) who had received preoperative CRT. Pre-CRT MRI was performed for local tumor and nodal staging. All patients underwent restaging MRI followed by total mesorectal excision after the end of radiotherapy. The primary endpoint of the present study was to estimate the accuracy of post-CRT MRI as compared with pathologic staging. Pathologic T classification matched the post-CRT MRI findings in 97 (64.7%) of 150 patients. 36 (24.0%) of 150 patients were overstaged in T classification, and the concordance degree was moderate (k=0.33, p<0.01). Pathologic N classification matched the post-CRI MRI findings in 85 (56.6%) of 150 patients. 54 (36.0%) of 150 patients were overstaged in N classification. 26 patients achieved downstaging (ycT0-2N0) on restaging MRI after CRT. 23 (88.5%) of 26 patients who had been downstaged on MRI after CRT were confirmed on the pathological staging, and the concordance degree was good (k=0.72, p<0.01). Restaging MRI has low accuracy for the prediction of the pathologic T and N classifications in rectal cancer patients who received preoperative CRT. The diagnostic accuracy of restaging MRI is relatively high in rectal cancer patients who achieved clinical downstaging after CRT. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
With the rapid developments of the sensor technology, high spatial resolution imagery and airborne Lidar point clouds can be captured nowadays, which make classification, extraction, evaluation and analysis of a broad range of object features available. High resolution imagery, Lidar dataset and parcel map can be widely used for classification as information carriers. Therefore, refinement of objects classification is made possible for the urban land cover. The paper presents an approach to object based image analysis (OBIA) combing high spatial resolution imagery and airborne Lidar point clouds. The advanced workflow for urban land cover is designed with four components. Firstly, colour-infrared TrueOrtho photo and laser point clouds were pre-processed to derive the parcel map of water bodies and nDSM respectively. Secondly, image objects are created via multi-resolution image segmentation integrating scale parameter, the colour and shape properties with compactness criterion. Image can be subdivided into separate object regions. Thirdly, image objects classification is performed on the basis of segmentation and a rule set of knowledge decision tree. These objects imagery are classified into six classes such as water bodies, low vegetation/grass, tree, low building, high building and road. Finally, in order to assess the validity of the classification results for six classes, accuracy assessment is performed through comparing randomly distributed reference points of TrueOrtho imagery with the classification results, forming the confusion matrix and calculating overall accuracy and Kappa coefficient. The study area focuses on test site Vaihingen/Enz and a patch of test datasets comes from the benchmark of ISPRS WG III/4 test project. The classification results show higher overall accuracy for most types of urban land cover. Overall accuracy is 89.5% and Kappa coefficient equals to 0.865. The OBIA approach provides an effective and convenient way to combine high resolution imagery and Lidar ancillary data for classification of urban land cover.
The use of the modified Cholesky decomposition in divergence and classification calculations
NASA Technical Reports Server (NTRS)
Vanroony, D. L.; Lynn, M. S.; Snyder, C. H.
1973-01-01
The use of the Cholesky decomposition technique is analyzed as applied to the feature selection and classification algorithms used in the analysis of remote sensing data (e.g. as in LARSYS). This technique is approximately 30% faster in classification and a factor of 2-3 faster in divergence, as compared with LARSYS. Also numerical stability and accuracy are slightly improved. Other methods necessary to deal with numerical stablity problems are briefly discussed.
The use of the modified Cholesky decomposition in divergence and classification calculations
NASA Technical Reports Server (NTRS)
Van Rooy, D. L.; Lynn, M. S.; Snyder, C. H.
1973-01-01
This report analyzes the use of the modified Cholesky decomposition technique as applied to the feature selection and classification algorithms used in the analysis of remote sensing data (e.g., as in LARSYS). This technique is approximately 30% faster in classification and a factor of 2-3 faster in divergence, as compared with LARSYS. Also numerical stability and accuracy are slightly improved. Other methods necessary to deal with numerical stability problems are briefly discussed.
Classification of permafrost active layer depth from remotely sensed and topographic evidence
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peddle, D.R.; Franklin, S.E.
1993-04-01
The remote detection of permafrost (perennially frozen ground) has important implications to environmental resource development, engineering studies, natural hazard prediction, and climate change research. In this study, the authors present results from two experiments into the classification of permafrost active layer depth within the zone of discontinuous permafrost in northern Canada. A new software system based on evidential reasoning was implemented to permit the integrated classification of multisource data consisting of landcover, terrain aspect, and equivalent latitude, each of which possessed different formats, data types, or statistical properties that could not be handled by conventional classification algorithms available to thismore » study. In the first experiment, four active layer depth classes were classified using ground based measurements of the three variables with an accuracy of 83% compared to in situ soil probe determination of permafrost active layer depth at over 500 field sites. This confirmed the environmental significance of the variables selected, and provided a baseline result to which a remote sensing classification could be compared. In the second experiment, evidence for each input variable was obtained from image processing of digital SPOT imagery and a photogrammetric digital elevation model, and used to classify active layer depth with an accuracy of 79%. These results suggest the classification of evidence from remotely sensed measures of spectral response and topography may provide suitable indicators of permafrost active layer depth.« less
Gadermayr, M.; Liedlgruber, M.; Uhl, A.; Vécsei, A.
2013-01-01
Due to the optics used in endoscopes, a typical degradation observed in endoscopic images are barrel-type distortions. In this work we investigate the impact of methods used to correct such distortions in images on the classification accuracy in the context of automated celiac disease classification. For this purpose we compare various different distortion correction methods and apply them to endoscopic images, which are subsequently classified. Since the interpolation used in such methods is also assumed to have an influence on the resulting classification accuracies, we also investigate different interpolation methods and their impact on the classification performance. In order to be able to make solid statements about the benefit of distortion correction we use various different feature extraction methods used to obtain features for the classification. Our experiments show that it is not possible to make a clear statement about the usefulness of distortion correction methods in the context of an automated diagnosis of celiac disease. This is mainly due to the fact that an eventual benefit of distortion correction highly depends on the feature extraction method used for the classification. PMID:23981585
Adaptive sleep-wake discrimination for wearable devices.
Karlen, Walter; Floreano, Dario
2011-04-01
Sleep/wake classification systems that rely on physiological signals suffer from intersubject differences that make accurate classification with a single, subject-independent model difficult. To overcome the limitations of intersubject variability, we suggest a novel online adaptation technique that updates the sleep/wake classifier in real time. The objective of the present study was to evaluate the performance of a newly developed adaptive classification algorithm that was embedded on a wearable sleep/wake classification system called SleePic. The algorithm processed ECG and respiratory effort signals for the classification task and applied behavioral measurements (obtained from accelerometer and press-button data) for the automatic adaptation task. When trained as a subject-independent classifier algorithm, the SleePic device was only able to correctly classify 74.94 ± 6.76% of the human-rated sleep/wake data. By using the suggested automatic adaptation method, the mean classification accuracy could be significantly improved to 92.98 ± 3.19%. A subject-independent classifier based on activity data only showed a comparable accuracy of 90.44 ± 3.57%. We demonstrated that subject-independent models used for online sleep-wake classification can successfully be adapted to previously unseen subjects without the intervention of human experts or off-line calibration.
A hybrid sensing approach for pure and adulterated honey classification.
Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar
2012-10-17
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach able to distinct pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery.
Li, Guiying; Lu, Dengsheng; Moran, Emilio; Hetrick, Scott
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms - maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery
LI, GUIYING; LU, DENGSHENG; MORAN, EMILIO; HETRICK, SCOTT
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms – maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes. PMID:22368311
NASA Astrophysics Data System (ADS)
Mohammadimanesh, F.; Salehi, B.; Mahdianpari, M.; Homayouni, S.
2016-06-01
Polarimetric Synthetic Aperture Radar (PolSAR) imagery is a complex multi-dimensional dataset, which is an important source of information for various natural resources and environmental classification and monitoring applications. PolSAR imagery produces valuable information by observing scattering mechanisms from different natural and man-made objects. Land cover mapping using PolSAR data classification is one of the most important applications of SAR remote sensing earth observations, which have gained increasing attention in the recent years. However, one of the most challenging aspects of classification is selecting features with maximum discrimination capability. To address this challenge, a statistical approach based on the Fisher Linear Discriminant Analysis (FLDA) and the incorporation of physical interpretation of PolSAR data into classification is proposed in this paper. After pre-processing of PolSAR data, including the speckle reduction, the H/α classification is used in order to classify the basic scattering mechanisms. Then, a new method for feature weighting, based on the fusion of FLDA and physical interpretation, is implemented. This method proves to increase the classification accuracy as well as increasing between-class discrimination in the final Wishart classification. The proposed method was applied to a full polarimetric C-band RADARSAT-2 data set from Avalon area, Newfoundland and Labrador, Canada. This imagery has been acquired in June 2015, and covers various types of wetlands including bogs, fens, marshes and shallow water. The results were compared with the standard Wishart classification, and an improvement of about 20% was achieved in the overall accuracy. This method provides an opportunity for operational wetland classification in northern latitude with high accuracy using only SAR polarimetric data.
CSE database: extended annotations and new recommendations for ECG software testing.
Smíšek, Radovan; Maršánová, Lucie; Němcová, Andrea; Vítek, Martin; Kozumplík, Jiří; Nováková, Marie
2017-08-01
Nowadays, cardiovascular diseases represent the most common cause of death in western countries. Among various examination techniques, electrocardiography (ECG) is still a highly valuable tool used for the diagnosis of many cardiovascular disorders. In order to diagnose a person based on ECG, cardiologists can use automatic diagnostic algorithms. Research in this area is still necessary. In order to compare various algorithms correctly, it is necessary to test them on standard annotated databases, such as the Common Standards for Quantitative Electrocardiography (CSE) database. According to Scopus, the CSE database is the second most cited standard database. There were two main objectives in this work. First, new diagnoses were added to the CSE database, which extended its original annotations. Second, new recommendations for diagnostic software quality estimation were established. The ECG recordings were diagnosed by five new cardiologists independently, and in total, 59 different diagnoses were found. Such a large number of diagnoses is unique, even in terms of standard databases. Based on the cardiologists' diagnoses, a four-round consensus (4R consensus) was established. Such a 4R consensus means a correct final diagnosis, which should ideally be the output of any tested classification software. The accuracy of the cardiologists' diagnoses compared with the 4R consensus was the basis for the establishment of accuracy recommendations. The accuracy was determined in terms of sensitivity = 79.20-86.81%, positive predictive value = 79.10-87.11%, and the Jaccard coefficient = 72.21-81.14%, respectively. Within these ranges, the accuracy of the software is comparable with the accuracy of cardiologists. The accuracy quantification of the correct classification is unique. Diagnostic software developers can objectively evaluate the success of their algorithm and promote its further development. The annotations and recommendations proposed in this work will allow for faster development and testing of classification software. As a result, this might facilitate cardiologists' work and lead to faster diagnoses and earlier treatment.
Large Scale Crop Mapping in Ukraine Using Google Earth Engine
NASA Astrophysics Data System (ADS)
Shelestov, A.; Lavreniuk, M. S.; Kussul, N.
2016-12-01
There are no globally available high resolution satellite-derived crop specific maps at present. Only coarse-resolution imagery (> 250 m spatial resolution) has been utilized to derive global cropland extent. In 2016 we are going to carry out a country level demonstration of Sentinel-2 use for crop classification in Ukraine within the ESA Sen2-Agri project. But optical imagery can be contaminated by cloud cover that makes it difficult to acquire imagery in an optimal time range to discriminate certain crops. Due to the Copernicus program since 2015, a lot of Sentinel-1 SAR data at high spatial resolution is available for free for Ukraine. It allows us to use the time series of SAR data for crop classification. Our experiment for one administrative region in 2015 showed much higher crop classification accuracy with SAR data than with optical only time series [1, 2]. Therefore, in 2016 within the Google Earth Engine Research Award we use SAR data together with optical ones for large area crop mapping (entire territory of Ukraine) using cloud computing capabilities available at Google Earth Engine (GEE). This study compares different classification methods for crop mapping for the whole territory of Ukraine using data and algorithms from GEE. Classification performance assessed using overall classification accuracy, Kappa coefficients, and user's and producer's accuracies. Also, crop areas from derived classification maps compared to the official statistics [3]. S. Skakun et al., "Efficiency assessment of multitemporal C-band Radarsat-2 intensity and Landsat-8 surface reflectance satellite imagery for crop classification in Ukraine," IEEE Journal of Selected Topics in Applied Earth Observ. and Rem. Sens., 2015, DOI: 10.1109/JSTARS.2015.2454297. N. Kussul, S. Skakun, A. Shelestov, O. Kussul, "The use of satellite SAR imagery to crop classification in Ukraine within JECAM project," IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp.1497-1500, 13-18 July 2014, Quebec City, Canada. F.J. Gallego, N. Kussul, S. Skakun, O. Kravchenko, A. Shelestov, O. Kussul, "Efficiency assessment of using satellite data for crop area estimation in Ukraine," International Journal of Applied Earth Observation and Geoinformation vol. 29, pp. 22-30, 2014.
2012-01-01
Background While progress has been made to develop automatic segmentation techniques for mitochondria, there remains a need for more accurate and robust techniques to delineate mitochondria in serial blockface scanning electron microscopic data. Previously developed texture based methods are limited for solving this problem because texture alone is often not sufficient to identify mitochondria. This paper presents a new three-step method, the Cytoseg process, for automated segmentation of mitochondria contained in 3D electron microscopic volumes generated through serial block face scanning electron microscopic imaging. The method consists of three steps. The first is a random forest patch classification step operating directly on 2D image patches. The second step consists of contour-pair classification. At the final step, we introduce a method to automatically seed a level set operation with output from previous steps. Results We report accuracy of the Cytoseg process on three types of tissue and compare it to a previous method based on Radon-Like Features. At step 1, we show that the patch classifier identifies mitochondria texture but creates many false positive pixels. At step 2, our contour processing step produces contours and then filters them with a second classification step, helping to improve overall accuracy. We show that our final level set operation, which is automatically seeded with output from previous steps, helps to smooth the results. Overall, our results show that use of contour pair classification and level set operations improve segmentation accuracy beyond patch classification alone. We show that the Cytoseg process performs well compared to another modern technique based on Radon-Like Features. Conclusions We demonstrated that texture based methods for mitochondria segmentation can be enhanced with multiple steps that form an image processing pipeline. While we used a random-forest based patch classifier to recognize texture, it would be possible to replace this with other texture identifiers, and we plan to explore this in future work. PMID:22321695
Classification of right-hand grasp movement based on EMOTIV Epoc+
NASA Astrophysics Data System (ADS)
Tobing, T. A. M. L.; Prawito, Wijaya, S. K.
2017-07-01
Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for best classification accuracy of right-hand grasp movement based on EEG headset, EMOTIV Epoc+. There are three movement classifications: grasping hand, relax, and opening hand. These classifications take advantage of Event-Related Desynchronization (ERD) phenomenon that makes it possible to differ relaxation, imagery, and movement state from each other. The combinations of elements are the usage of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequency as features, and also classifier Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average values of classification accuracy are ± 83% for training and ± 57% for testing. To have a better understanding of the signal quality recorded by EMOTIV Epoc+, the result of classification accuracy of left or right-hand grasping movement EEG signal (provided by Physionet) also be given, i.e.± 85% for training and ± 70% for testing. The comparison of accuracy value from each combination, experiment condition, and external EEG data are provided for the purpose of value analysis of classification accuracy.
Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie
2014-11-01
User-generated medical messages on Internet contain extensive information related to adverse drug reactions (ADRs) and are known as valuable resources for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature set and different classification technique. Firstly, the messages from three communities, allergy community, schizophrenia community and pain management community, were collected, the 3000 messages were annotated. Secondly, the N-gram-based features set and medical domain-specific features set were generated. Thirdly, three classification techniques, SVM, C4.5 and Naïve Bayes, were used to perform classification tasks separately. Finally, we evaluated the performance of different method using different feature set and different classification technique by comparing the metrics including accuracy and F-measure. In terms of accuracy, the accuracy of SVM classifier was higher than 0.8, the accuracy of C4.5 classifier or Naïve Bayes classifier was lower than 0.8; meanwhile, the combination feature sets including n-gram-based feature set and domain-specific feature set consistently outperformed single feature set. In terms of F-measure, the highest F-measure is 0.895 which was achieved by using combination feature sets and a SVM classifier. In all, we can get the best classification performance by using combination feature sets and SVM classifier. By using combination feature sets and SVM classifier, we can get an effective method to identify messages related to ADRs automatically from online user reviews.
NASA Astrophysics Data System (ADS)
Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.
2017-09-01
Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Rollinvariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, target orientation diversity effect makes PolSAR images understanding and interpretation difficult. Only using the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the followed classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combing both the selected new hidden polarimetric features in rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy with the proposed classification scheme is 94.91 %, while that with the conventional classification scheme is 93.70 %. Moreover, for multi-temporal UAVSAR data, the averaged overall classification accuracy with the proposed classification scheme is up to 97.08 %, which is much higher than the 87.79 % from the conventional classification scheme. Furthermore, for multitemporal PolSAR data, the proposed classification scheme can achieve better robustness. The comparison studies also clearly demonstrate that mining and utilization of hidden polarimetric features and information in the rotation domain can gain the added benefits for PolSAR land cover classification and provide a new vision for PolSAR image interpretation and application.
Rajagopal, Rekha; Ranganathan, Vidhyapriya
2018-06-05
Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
Accurate label-free 3-part leukocyte recognition with single cell lens-free imaging flow cytometry.
Li, Yuqian; Cornelis, Bruno; Dusa, Alexandra; Vanmeerbeeck, Geert; Vercruysse, Dries; Sohn, Erik; Blaszkiewicz, Kamil; Prodanov, Dimiter; Schelkens, Peter; Lagae, Liesbet
2018-05-01
Three-part white blood cell differentials which are key to routine blood workups are typically performed in centralized laboratories on conventional hematology analyzers operated by highly trained staff. With the trend of developing miniaturized blood analysis tool for point-of-need in order to accelerate turnaround times and move routine blood testing away from centralized facilities on the rise, our group has developed a highly miniaturized holographic imaging system for generating lens-free images of white blood cells in suspension. Analysis and classification of its output data, constitutes the final crucial step ensuring appropriate accuracy of the system. In this work, we implement reference holographic images of single white blood cells in suspension, in order to establish an accurate ground truth to increase classification accuracy. We also automate the entire workflow for analyzing the output and demonstrate clear improvement in the accuracy of the 3-part classification. High-dimensional optical and morphological features are extracted from reconstructed digital holograms of single cells using the ground-truth images and advanced machine learning algorithms are investigated and implemented to obtain 99% classification accuracy. Representative features of the three white blood cell subtypes are selected and give comparable results, with a focus on rapid cell recognition and decreased computational cost. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Migraine classification using magnetic resonance imaging resting-state functional connectivity data.
Chong, Catherine D; Gaw, Nathan; Fu, Yinlin; Li, Jing; Wu, Teresa; Schwedt, Todd J
2017-08-01
Background This study used machine-learning techniques to develop discriminative brain-connectivity biomarkers from resting-state functional magnetic resonance neuroimaging ( rs-fMRI) data that distinguish between individual migraine patients and healthy controls. Methods This study included 58 migraine patients (mean age = 36.3 years; SD = 11.5) and 50 healthy controls (mean age = 35.9 years; SD = 11.0). The functional connections of 33 seeded pain-related regions were used as input for a brain classification algorithm that tested the accuracy of determining whether an individual brain MRI belongs to someone with migraine or to a healthy control. Results The best classification accuracy using a 10-fold cross-validation method was 86.1%. Resting functional connectivity of the right middle temporal, posterior insula, middle cingulate, left ventromedial prefrontal and bilateral amygdala regions best discriminated the migraine brain from that of a healthy control. Migraineurs with longer disease durations were classified more accurately (>14 years; 96.7% accuracy) compared to migraineurs with shorter disease durations (≤14 years; 82.1% accuracy). Conclusions Classification of migraine using rs-fMRI provides insights into pain circuits that are altered in migraine and could potentially contribute to the development of a new, noninvasive migraine biomarker. Migraineurs with longer disease burden were classified more accurately than migraineurs with shorter disease burden, potentially indicating that disease duration leads to reorganization of brain circuitry.
Effective classification of the prevalence of Schistosoma mansoni.
Mitchell, Shira A; Pagano, Marcello
2012-12-01
To present an effective classification method based on the prevalence of Schistosoma mansoni in the community. We created decision rules (defined by cut-offs for number of positive slides), which account for imperfect sensitivity, both with a simple adjustment of fixed sensitivity and with a more complex adjustment of changing sensitivity with prevalence. To reduce screening costs while maintaining accuracy, we propose a pooled classification method. To estimate sensitivity, we use the De Vlas model for worm and egg distributions. We compare the proposed method with the standard method to investigate differences in efficiency, measured by number of slides read, and accuracy, measured by probability of correct classification. Modelling varying sensitivity lowers the lower cut-off more significantly than the upper cut-off, correctly classifying regions as moderate rather than lower, thus receiving life-saving treatment. The classification method goes directly to classification on the basis of positive pools, avoiding having to know sensitivity to estimate prevalence. For model parameter values describing worm and egg distributions among children, the pooled method with 25 slides achieves an expected 89.9% probability of correct classification, whereas the standard method with 50 slides achieves 88.7%. Among children, it is more efficient and more accurate to use the pooled method for classification of S. mansoni prevalence than the current standard method. © 2012 Blackwell Publishing Ltd.
NASA Astrophysics Data System (ADS)
Aricò, P.; Aloise, F.; Schettini, F.; Salinari, S.; Mattia, D.; Cincotti, F.
2014-06-01
Objective. Several ERP-based brain-computer interfaces (BCIs) that can be controlled even without eye movements (covert attention) have been recently proposed. However, when compared to similar systems based on overt attention, they displayed significantly lower accuracy. In the current interpretation, this is ascribed to the absence of the contribution of short-latency visual evoked potentials (VEPs) in the tasks performed in the covert attention modality. This study aims to investigate if this decrement (i) is fully explained by the lack of VEP contribution to the classification accuracy; (ii) correlates with lower temporal stability of the single-trial P300 potentials elicited in the covert attention modality. Approach. We evaluated the latency jitter of P300 evoked potentials in three BCI interfaces exploiting either overt or covert attention modalities in 20 healthy subjects. The effect of attention modality on the P300 jitter, and the relative contribution of VEPs and P300 jitter to the classification accuracy have been analyzed. Main results. The P300 jitter is higher when the BCI is controlled in covert attention. Classification accuracy negatively correlates with jitter. Even disregarding short-latency VEPs, overt-attention BCI yields better accuracy than covert. When the latency jitter is compensated offline, the difference between accuracies is not significant. Significance. The lower temporal stability of the P300 evoked potential generated during the tasks performed in covert attention modality should be regarded as the main contributing explanation of lower accuracy of covert-attention ERP-based BCIs.
Approximated mutual information training for speech recognition using myoelectric signals.
Guo, Hua J; Chan, A D C
2006-01-01
A new training algorithm called the approximated maximum mutual information (AMMI) is proposed to improve the accuracy of myoelectric speech recognition using hidden Markov models (HMMs). Previous studies have demonstrated that automatic speech recognition can be performed using myoelectric signals from articulatory muscles of the face. Classification of facial myoelectric signals can be performed using HMMs that are trained using the maximum likelihood (ML) algorithm; however, this algorithm maximizes the likelihood of the observations in the training sequence, which is not directly associated with optimal classification accuracy. The AMMI training algorithm attempts to maximize the mutual information, thereby training the HMMs to optimize their parameters for discrimination. Our results show that AMMI training consistently reduces the error rates compared to these by the ML training, increasing the accuracy by approximately 3% on average.
Evaluation of an Algorithm to Predict Menstrual-Cycle Phase at the Time of Injury.
Tourville, Timothy W; Shultz, Sandra J; Vacek, Pamela M; Knudsen, Emily J; Bernstein, Ira M; Tourville, Kelly J; Hardy, Daniel M; Johnson, Robert J; Slauterbeck, James R; Beynnon, Bruce D
2016-01-01
Women are 2 to 8 times more likely to sustain an anterior cruciate ligament (ACL) injury than men, and previous studies indicated an increased risk for injury during the preovulatory phase of the menstrual cycle (MC). However, investigations of risk rely on retrospective classification of MC phase, and no tools for this have been validated. To evaluate the accuracy of an algorithm for retrospectively classifying MC phase at the time of a mock injury based on MC history and salivary progesterone (P4) concentration. Descriptive laboratory study. Research laboratory. Thirty-one healthy female collegiate athletes (age range, 18-24 years) provided serum or saliva (or both) samples at 8 visits over 1 complete MC. Self-reported MC information was obtained on a randomized date (1-45 days) after mock injury, which is the typical timeframe in which researchers have access to ACL-injured study participants. The MC phase was classified using the algorithm as applied in a stand-alone computational fashion and also by 4 clinical experts using the algorithm and additional subjective hormonal history information to help inform their decision. To assess algorithm accuracy, phase classifications were compared with the actual MC phase at the time of mock injury (ascertained using urinary luteinizing hormone tests and serial serum P4 samples). Clinical expert and computed classifications were compared using κ statistics. Fourteen participants (45%) experienced anovulatory cycles. The algorithm correctly classified MC phase for 23 participants (74%): 22 (76%) of 29 who were preovulatory/anovulatory and 1 (50%) of 2 who were postovulatory. Agreement between expert and algorithm classifications ranged from 80.6% (κ = 0.50) to 93% (κ = 0.83). Classifications based on same-day saliva sample and optimal P4 threshold were the same as those based on MC history alone (87.1% correct). Algorithm accuracy varied during the MC but at no time were both sensitivity and specificity levels acceptable. These findings raise concerns about the accuracy of previous retrospective MC-phase classification systems, particularly in a population with a high occurrence of anovulatory cycles.
NASA Astrophysics Data System (ADS)
Samsudin, Sarah Hanim; Shafri, Helmi Z. M.; Hamedianfar, Alireza
2016-04-01
Status observations of roofing material degradation are constantly evolving due to urban feature heterogeneities. Although advanced classification techniques have been introduced to improve within-class impervious surface classifications, these techniques involve complex processing and high computation times. This study integrates field spectroscopy and satellite multispectral remote sensing data to generate degradation status maps of concrete and metal roofing materials. Field spectroscopy data were used as bases for selecting suitable bands for spectral index development because of the limited number of multispectral bands. Mapping methods for roof degradation status were established for metal and concrete roofing materials by developing the normalized difference concrete condition index (NDCCI) and the normalized difference metal condition index (NDMCI). Results indicate that the accuracies achieved using the spectral indices are higher than those obtained using supervised pixel-based classification. The NDCCI generated an accuracy of 84.44%, whereas the support vector machine (SVM) approach yielded an accuracy of 73.06%. The NDMCI obtained an accuracy of 94.17% compared with 62.5% for the SVM approach. These findings support the suitability of the developed spectral index methods for determining roof degradation statuses from satellite observations in heterogeneous urban environments.
Thanh Noi, Phan; Kappas, Martin
2017-01-01
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909
Thanh Noi, Phan; Kappas, Martin
2017-12-22
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment
ERIC Educational Resources Information Center
Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua
2012-01-01
This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
The effect of finite field size on classification and atmospheric correction
NASA Technical Reports Server (NTRS)
Kaufman, Y. J.; Fraser, R. S.
1981-01-01
The atmospheric effect on the upward radiance of sunlight scattered from the Earth-atmosphere system is strongly influenced by the contrasts between fields and their sizes. For a given atmospheric turbidity, the atmospheric effect on classification of surface features is much stronger for nonuniform surfaces than for uniform surfaces. Therefore, the classification accuracy of agricultural fields and urban areas is dependent not only on the optical characteristics of the atmosphere, but also on the size of the surface do not account for the nonuniformity of the surface have only a slight effect on the classification accuracy; in other cases the classification accuracy descreases. The radiances above finite fields were computed to simulate radiances measured by a satellite. A simulation case including 11 agricultural fields and four natural fields (water, soil, savanah, and forest) was used to test the effect of the size of the background reflectance and the optical thickness of the atmosphere on classification accuracy. It is concluded that new atmospheric correction methods, which take into account the finite size of the fields, have to be developed to improve significantly the classification accuracy.
NASA Astrophysics Data System (ADS)
Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk
2016-07-01
Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to obtain a lower accuracy due to the broken objects at fine scales. In contrast to some previous studies, the RF classifiers yielded the best results and the k-nearest neighbor classifier were the worst results, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of training sample analyses indicated that the RF and adaboost. M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.
NASA Astrophysics Data System (ADS)
Fusco, Terence; Bi, Yaxin; Nugent, Chris; Wu, Shengli
2016-08-01
We can see that the data imputation approach using the Regression CTA has performed more favourably when compared with the alternative methods on this dataset. We now have the evidence to show that this method is viable moving forward with further research in this area. The weighted distribution experiments have provided us with a more balanced and appropriate ratio for snail density classification purposes when using either the 3 or 5 category combination. The most desirable results are found when using 3 categories of SD with the weighted distribution of classes being 20-60-20. This information reflects the optimum classification accuracy across the data range and can be applied to any novel environment feature dataset pertaining to Schistosomiasis vector classification. ITSVM has provided us with a method of labelling SD data which we can use for classification with epidemic disease prediction research. The confidence level selection enables consistent labelling accuracy for bespoke requirements when classifying the data from each year. The SMOTE Equilibrium proposed method has yielded a slight increase with each multiple of synthetic instances that are compounded to the training dataset. The reduction of overfitting and increase of data instances has shown a gradual classification accuracy increase across the data for each year. We will now test to see what the optimum synthetic instance incremental increase is across our data and apply this to our experiments with this research.
Design of neural networks for classification of remotely sensed imagery
NASA Technical Reports Server (NTRS)
Chettri, Samir R.; Cromp, Robert F.; Birmingham, Mark
1992-01-01
Classification accuracies of a backpropagation neural network are discussed and compared with a maximum likelihood classifier (MLC) with multivariate normal class models. We have found that, because of its nonparametric nature, the neural network outperforms the MLC in this area. In addition, we discuss techniques for constructing optimal neural nets on parallel hardware like the MasPar MP-1 currently at GSFC. Other important discussions are centered around training and classification times of the two methods, and sensitivity to the training data. Finally, we discuss future work in the area of classification and neural nets.
Shin, Jaeyoung; Müller, Klaus-R; Hwang, Han-Jeong
2016-01-01
We propose a near-infrared spectroscopy (NIRS)-based brain-computer interface (BCI) that can be operated in eyes-closed (EC) state. To evaluate the feasibility of NIRS-based EC BCIs, we compared the performance of an eye-open (EO) BCI paradigm and an EC BCI paradigm with respect to hemodynamic response and classification accuracy. To this end, subjects performed either mental arithmetic or imagined vocalization of the English alphabet as a baseline task with very low cognitive loading. The performances of two linear classifiers were compared; resulting in an advantage of shrinkage linear discriminant analysis (LDA). The classification accuracy of EC paradigm (75.6 ± 7.3%) was observed to be lower than that of EO paradigm (77.0 ± 9.2%), which was statistically insignificant (p = 0.5698). Subjects reported they felt it more comfortable (p = 0.057) and easier (p < 0.05) to perform the EC BCI tasks. The different task difficulty may become a cause of the slightly lower classification accuracy of EC data. From the analysis results, we could confirm the feasibility of NIRS-based EC BCIs, which can be a BCI option that may ultimately be of use for patients who cannot keep their eyes open consistently. PMID:27824089
Shin, Jaeyoung; Müller, Klaus-R; Hwang, Han-Jeong
2016-11-08
We propose a near-infrared spectroscopy (NIRS)-based brain-computer interface (BCI) that can be operated in eyes-closed (EC) state. To evaluate the feasibility of NIRS-based EC BCIs, we compared the performance of an eye-open (EO) BCI paradigm and an EC BCI paradigm with respect to hemodynamic response and classification accuracy. To this end, subjects performed either mental arithmetic or imagined vocalization of the English alphabet as a baseline task with very low cognitive loading. The performances of two linear classifiers were compared; resulting in an advantage of shrinkage linear discriminant analysis (LDA). The classification accuracy of EC paradigm (75.6 ± 7.3%) was observed to be lower than that of EO paradigm (77.0 ± 9.2%), which was statistically insignificant (p = 0.5698). Subjects reported they felt it more comfortable (p = 0.057) and easier (p < 0.05) to perform the EC BCI tasks. The different task difficulty may become a cause of the slightly lower classification accuracy of EC data. From the analysis results, we could confirm the feasibility of NIRS-based EC BCIs, which can be a BCI option that may ultimately be of use for patients who cannot keep their eyes open consistently.
Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos
2017-04-13
Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform in diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence of pain perceptions. We select 52 adult subjects for the study divided into three groups: control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance with diffusion tensor to see white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gather data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce features of the first dataset and classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to perform a classification of migraine group. Moreover we implement a committee method to improve the classification accuracy based on feature selection algorithms. When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% when using the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, 93 to 94% in boosting. The features that were determined to be most useful for classification included are related with the pain, analgesics and left uncinate brain (connected with the pain and emotions). The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
Di-codon Usage for Gene Classification
NASA Astrophysics Data System (ADS)
Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
NASA Technical Reports Server (NTRS)
Cibula, William G.; Nyquist, Maurice O.
1987-01-01
An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classifications useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. In an attempt to refine the level of classification, a geographic information system (GIS) approach was employed. Topographic data and watershed boundaries (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, and these met the project objective. This classification could now become inputs into a GIS system to help provide answers to park management coupled with other ancillary data programs such as fire management.
NASA Astrophysics Data System (ADS)
Gao, Yan; Marpu, Prashanth; Morales Manila, Luis M.
2014-11-01
This paper assesses the suitability of 8-band Worldview-2 (WV2) satellite data and object-based random forest algorithm for the classification of avocado growth stages in Mexico. We tested both pixel-based with minimum distance (MD) and maximum likelihood (MLC) and object-based with Random Forest (RF) algorithm for this task. Training samples and verification data were selected by visual interpreting the WV2 images for seven thematic classes: fully grown, middle stage, and early stage of avocado crops, bare land, two types of natural forests, and water body. To examine the contribution of the four new spectral bands of WV2 sensor, all the tested classifications were carried out with and without the four new spectral bands. Classification accuracy assessment results show that object-based classification with RF algorithm obtained higher overall higher accuracy (93.06%) than pixel-based MD (69.37%) and MLC (64.03%) method. For both pixel-based and object-based methods, the classifications with the four new spectral bands (overall accuracy obtained higher accuracy than those without: overall accuracy of object-based RF classification with vs without: 93.06% vs 83.59%, pixel-based MD: 69.37% vs 67.2%, pixel-based MLC: 64.03% vs 36.05%, suggesting that the four new spectral bands in WV2 sensor contributed to the increase of the classification accuracy.
Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang
2016-09-01
With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified the suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular multiple meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. The Bioconductor was used to perform the data preliminary preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for microarray data samples. The classification efficiency was compared based on accuracy, sensitivity and specificity. Across seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes were remained for further analysis. The classification efficiency analysis showed that DE genes identified by sum of ranks method separated ADC from normal controls with the best accuracy, sensitivity and specificity of 0.953, 0.969 and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01) and regulation of cell proliferation (P = 0.01). Evaluation of DE genes identified by different meta-analysis methods in classification efficiency provided a new perspective to the choice of the suitable method in a given application. Varying meta-analysis methods always present varying abilities, so synthetic consideration should be taken when providing meta-analysis methods for particular research. © 2015 John Wiley & Sons Ltd.
Schmitter, Daniel; Roche, Alexis; Maréchal, Bénédicte; Ribes, Delphine; Abdulkadir, Ahmed; Bach-Cuadra, Meritxell; Daducci, Alessandro; Granziera, Cristina; Klöppel, Stefan; Maeder, Philippe; Meuli, Reto; Krueger, Gunnar
2014-01-01
Voxel-based morphometry from conventional T1-weighted images has proved effective to quantify Alzheimer's disease (AD) related brain atrophy and to enable fairly accurate automated classification of AD patients, mild cognitive impaired patients (MCI) and elderly controls. Little is known, however, about the classification power of volume-based morphometry, where features of interest consist of a few brain structure volumes (e.g. hippocampi, lobes, ventricles) as opposed to hundreds of thousands of voxel-wise gray matter concentrations. In this work, we experimentally evaluate two distinct volume-based morphometry algorithms (FreeSurfer and an in-house algorithm called MorphoBox) for automatic disease classification on a standardized data set from the Alzheimer's Disease Neuroimaging Initiative. Results indicate that both algorithms achieve classification accuracy comparable to the conventional whole-brain voxel-based morphometry pipeline using SPM for AD vs elderly controls and MCI vs controls, and higher accuracy for classification of AD vs MCI and early vs late AD converters, thereby demonstrating the potential of volume-based morphometry to assist diagnosis of mild cognitive impairment and Alzheimer's disease. PMID:25429357
NASA Astrophysics Data System (ADS)
Gevaert, C. M.; Persello, C.; Sliuzas, R.; Vosselman, G.
2016-06-01
Unmanned Aerial Vehicles (UAVs) are capable of providing very high resolution and up-to-date information to support informal settlement upgrading projects. In order to provide accurate basemaps, urban scene understanding through the identification and classification of buildings and terrain is imperative. However, common characteristics of informal settlements such as small, irregular buildings with heterogeneous roof material and large presence of clutter challenge state-of-the-art algorithms. Especially the dense buildings and steeply sloped terrain cause difficulties in identifying elevated objects. This work investigates how 2D radiometric and textural features, 2.5D topographic features, and 3D geometric features obtained from UAV imagery can be integrated to obtain a high classification accuracy in challenging classification problems for the analysis of informal settlements. It compares the utility of pixel-based and segment-based features obtained from an orthomosaic and DSM with point-based and segment-based features extracted from the point cloud to classify an unplanned settlement in Kigali, Rwanda. Findings show that the integration of 2D and 3D features leads to higher classification accuracies.
Protein classification using modified n-grams and skip-grams.
Islam, S M Ashiqul; Heil, Benjamin J; Kearney, Christopher Michel; Baker, Erich J
2018-05-01
Classification by supervised machine learning greatly facilitates the annotation of protein characteristics from their primary sequence. However, the feature generation step in this process requires detailed knowledge of attributes used to classify the proteins. Lack of this knowledge risks the selection of irrelevant features, resulting in a faulty model. In this study, we introduce a supervised protein classification method with a novel means of automating the work-intensive feature generation step via a Natural Language Processing (NLP)-dependent model, using a modified combination of n-grams and skip-grams (m-NGSG). A meta-comparison of cross-validation accuracy with twelve training datasets from nine different published studies demonstrates a consistent increase in accuracy of m-NGSG when compared to contemporary classification and feature generation models. We expect this model to accelerate the classification of proteins from primary sequence data and increase the accessibility of protein characteristic prediction to a broader range of scientists. m-NGSG is freely available at Bitbucket: https://bitbucket.org/sm_islam/mngsg/src. A web server is available at watson.ecs.baylor.edu/ngsg. erich_baker@baylor.edu. Supplementary data are available at Bioinformatics online.
The construction of support vector machine classifier using the firefly algorithm.
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy.
The Construction of Support Vector Machine Classifier Using the Firefly Algorithm
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy. PMID:25802511
Hussain, Shaista; Basu, Arindam
2016-01-01
The development of power-efficient neuromorphic devices presents the challenge of designing spike pattern classification algorithms which can be implemented on low-precision hardware and can also achieve state-of-the-art performance. In our pursuit of meeting this challenge, we present a pattern classification model which uses a sparse connection matrix and exploits the mechanism of nonlinear dendritic processing to achieve high classification accuracy. A rate-based structural learning rule for multiclass classification is proposed which modifies a connectivity matrix of binary synaptic connections by choosing the best “k” out of “d” inputs to make connections on every dendritic branch (k < < d). Because learning only modifies connectivity, the model is well suited for implementation in neuromorphic systems using address-event representation (AER). We develop an ensemble method which combines several dendritic classifiers to achieve enhanced generalization over individual classifiers. We have two major findings: (1) Our results demonstrate that an ensemble created with classifiers comprising moderate number of dendrites performs better than both ensembles of perceptrons and of complex dendritic trees. (2) In order to determine the moderate number of dendrites required for a specific classification problem, a two-step solution is proposed. First, an adaptive approach is proposed which scales the relative size of the dendritic trees of neurons for each class. It works by progressively adding dendrites with fixed number of synapses to the network, thereby allocating synaptic resources as per the complexity of the given problem. As a second step, theoretical capacity calculations are used to convert each neuronal dendritic tree to its optimal topology where dendrites of each class are assigned different number of synapses. The performance of the model is evaluated on classification of handwritten digits from the benchmark MNIST dataset and compared with other spike classifiers. We show that our system can achieve classification accuracy within 1 − 2% of other reported spike-based classifiers while using much less synaptic resources (only 7%) compared to that used by other methods. Further, an ensemble classifier created with adaptively learned sizes can attain accuracy of 96.4% which is at par with the best reported performance of spike-based classifiers. Moreover, the proposed method achieves this by using about 20% of the synapses used by other spike algorithms. We also present results of applying our algorithm to classify the MNIST-DVS dataset collected from a real spike-based image sensor and show results comparable to the best reported ones (88.1% accuracy). For VLSI implementations, we show that the reduced synaptic memory can save upto 4X area compared to conventional crossbar topologies. Finally, we also present a biologically realistic spike-based version for calculating the correlations required by the structural learning rule and demonstrate the correspondence between the rate-based and spike-based methods of learning. PMID:27065782
NASA Astrophysics Data System (ADS)
Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.
2018-01-01
In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We were processing results in three different ways - separately from LIBS measurements, from Raman measurements, and we also merged data from both mentioned methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. As for LIBS, the classification accuracy ranged from 45% to 100%, as for Raman Spectroscopy from 50% to 100% and in case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in obtained chemical information while using these two methods.
Classification of interstitial lung disease patterns with topological texture features
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel
2010-03-01
Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honey-combing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
Mycofier: a new machine learning-based classifier for fungal ITS sequences.
Delgado-Serrano, Luisa; Restrepo, Silvia; Bustos, Jose Ricardo; Zambrano, Maria Mercedes; Anzola, Juan Manuel
2016-08-11
The taxonomic and phylogenetic classification based on sequence analysis of the ITS1 genomic region has become a crucial component of fungal ecology and diversity studies. Nowadays, there is no accurate alignment-free classification tool for fungal ITS1 sequences for large environmental surveys. This study describes the development of a machine learning-based classifier for the taxonomical assignment of fungal ITS1 sequences at the genus level. A fungal ITS1 sequence database was built using curated data. Training and test sets were generated from it. A Naïve Bayesian classifier was built using features from the primary sequence with an accuracy of 87 % in the classification at the genus level. The final model was based on a Naïve Bayes algorithm using ITS1 sequences from 510 fungal genera. This classifier, denoted as Mycofier, provides similar classification accuracy compared to BLASTN, but the database used for the classification contains curated data and the tool, independent of alignment, is more efficient and contributes to the field, given the lack of an accurate classification tool for large data from fungal ITS1 sequences. The software and source code for Mycofier are freely available at https://github.com/ldelgado-serrano/mycofier.git .
Corn and soybean Landsat MSS classification performance as a function of scene characteristics
NASA Technical Reports Server (NTRS)
Batista, G. T.; Hixson, M. M.; Bauer, M. E.
1982-01-01
In order to fully utilize remote sensing to inventory crop production, it is important to identify the factors that affect the accuracy of Landsat classifications. The objective of this study was to investigate the effect of scene characteristics involving crop, soil, and weather variables on the accuracy of Landsat classifications of corn and soybeans. Segments sampling the U.S. Corn Belt were classified using a Gaussian maximum likelihood classifier on multitemporally registered data from two key acquisition periods. Field size had a strong effect on classification accuracy with small fields tending to have low accuracies even when the effect of mixed pixels was eliminated. Other scene characteristics accounting for variability in classification accuracy included proportions of corn and soybeans, crop diversity index, proportion of all field crops, soil drainage, slope, soil order, long-term average soybean yield, maximum yield, relative position of the segment in the Corn Belt, weather, and crop development stage.
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
The accuracy of RS classification based on SVM which is developed from statistical learning theory is high under small number of train samples, which results in satisfaction of classification on RS using SVM methods. The traditional RS classification method combines visual interpretation with computer classification. The accuracy of the RS classification, however, is improved a lot based on SVM method, because it saves much labor and time which is used to interpret images and collect training samples. Kernel functions play an important part in the SVM algorithm. It uses improved compound kernel function and therefore has a higher accuracy of classification on RS images. Moreover, compound kernel improves the generalization and learning ability of the kernel.
Nationwide forestry applications program. Analysis of forest classification accuracy
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Mead, R. A.; Oderwald, R. G.; Heinen, J. (Principal Investigator)
1981-01-01
The development of LANDSAT classification accuracy assessment techniques, and of a computerized system for assessing wildlife habitat from land cover maps are considered. A literature review on accuracy assessment techniques and an explanation for the techniques development under both projects are included along with listings of the computer programs. The presentations and discussions at the National Working Conference on LANDSAT Classification Accuracy are summarized. Two symposium papers which were published on the results of this project are appended.
A Study of Light Level Effect on the Accuracy of Image Processing-based Tomato Grading
NASA Astrophysics Data System (ADS)
Prijatna, D.; Muhaemin, M.; Wulandari, R. P.; Herwanto, T.; Saukat, M.; Sugandi, W. K.
2018-05-01
Image processing method has been used in non-destructive tests of agricultural products. Compared to manual method, image processing method may produce more objective and consistent results. Image capturing box installed in currently used tomato grading machine (TEP-4) is equipped with four fluorescence lamps to illuminate the processed tomatoes. Since the performance of any lamp will decrease if its service time has exceeded its lifetime, it is predicted that this will affect tomato classification. The objective of this study was to determine the minimum light levels which affect classification accuracy. This study was conducted by varying light level from minimum and maximum on tomatoes in image capturing boxes and then investigates its effects on image characteristics. Research results showed that light intensity affects two variables which are important for classification, for example, area and color of captured image. Image processing program was able to determine correctly the weight and classification of tomatoes when light level was 30 lx to 140 lx.
NASA Technical Reports Server (NTRS)
Messmore, J. A.
1976-01-01
The feasibility of using digital satellite imagery and automatic data processing techniques as a means of mapping swamp forest vegetation was considered, using multispectral scanner data acquired by the LANDSAT-1 satellite. The site for this investigation was the Dismal Swamp, a 210,000 acre swamp forest located south of Suffolk, Va. on the Virginia-North Carolina border. Two basic classification strategies were employed. The initial classification utilized unsupervised techniques which produced a map of the swamp indicating the distribution of thirteen forest spectral classes. These classes were later combined into three informational categories: Atlantic white cedar (Chamaecyparis thyoides), Loblolly pine (Pinus taeda), and deciduous forest. The subsequent classification employed supervised techniques which mapped Atlantic white cedar, Loblolly pine, deciduous forest, water and agriculture within the study site. A classification accuracy of 82.5% was produced by unsupervised techniques compared with 89% accuracy using supervised techniques.
Prahm, Cosima; Eckstein, Korbinian; Ortiz-Catalan, Max; Dorffner, Georg; Kaniusas, Eugenijus; Aszmann, Oskar C
2016-08-31
Controlling a myoelectric prosthesis for upper limbs is increasingly challenging for the user as more electrodes and joints become available. Motion classification based on pattern recognition with a multi-electrode array allows multiple joints to be controlled simultaneously. Previous pattern recognition studies are difficult to compare, because individual research groups use their own data sets. To resolve this shortcoming and to facilitate comparisons, open access data sets were analysed using components of BioPatRec and Netlab pattern recognition models. Performances of the artificial neural networks, linear models, and training program components were compared. Evaluation took place within the BioPatRec environment, a Matlab-based open source platform that provides feature extraction, processing and motion classification algorithms for prosthetic control. The algorithms were applied to myoelectric signals for individual and simultaneous classification of movements, with the aim of finding the best performing algorithm and network model. Evaluation criteria included classification accuracy and training time. Results in both the linear and the artificial neural network models demonstrated that Netlab's implementation using scaled conjugate training algorithm reached significantly higher accuracies than BioPatRec. It is concluded that the best movement classification performance would be achieved through integrating Netlab training algorithms in the BioPatRec environment so that future prosthesis training can be shortened and control made more reliable. Netlab was therefore included into the newest release of BioPatRec (v4.0).
Zbroch, Tomasz; Knapp, Paweł Grzegorz; Knapp, Piotr Andrzej
2007-09-01
Increasing knowledge concerning carcinogenesis within cervical epithelium has forced us to make continues modifications of cytology classification of the cervical smears. Eventually, new descriptions of the submicroscopic cytomorphological abnormalities have enabled the implementation of Bethesda System which was meant to take place of the former Papanicolaou classification although temporarily both are sometimes used simultaneously. The aim of this study was to compare results of these two classification systems in the aspect of diagnostic accuracy verified by further tests of the diagnostic algorithm for the cervical lesion evaluation. The study was conducted in the group of women selected from general population, the criteria being the place of living and cervical cancer age risk group, in the consecutive periods of mass screening in Podlaski region. The performed diagnostic tests have been based on the commonly used algorithm, as well as identical laboratory and methodological conditions. Performed assessment revealed comparable diagnostic accuracy of both analyzing classifications, verified by histological examination, although with marked higher specificity for dysplastic lesions with decreased number of HSIL results and increased diagnosis of LSILs. Higher number of performed colposcopies and biopsies were an additional consequence of TBS classification. Results based on Bethesda System made it possible to find the sources and reasons of abnormalities with much greater precision, which enabled causing agent treatment. Two evaluated cytology classification systems, although not much different, depicted higher potential of TBS and better, more effective communication between cytology laboratory and gynecologist, making reasonable implementation of The Bethesda System in the daily cytology screening work.
NASA Astrophysics Data System (ADS)
Qu, Haicheng; Liang, Xuejian; Liang, Shichao; Liu, Wanjun
2018-01-01
Many methods of hyperspectral image classification have been proposed recently, and the convolutional neural network (CNN) achieves outstanding performance. However, spectral-spatial classification of CNN requires an excessively large model, tremendous computations, and complex network, and CNN is generally unable to use the noisy bands caused by water-vapor absorption. A dimensionality-varied CNN (DV-CNN) is proposed to address these issues. There are four stages in DV-CNN and the dimensionalities of spectral-spatial feature maps vary with the stages. DV-CNN can reduce the computation and simplify the structure of the network. All feature maps are processed by more kernels in higher stages to extract more precise features. DV-CNN also improves the classification accuracy and enhances the robustness to water-vapor absorption bands. The experiments are performed on data sets of Indian Pines and Pavia University scene. The classification performance of DV-CNN is compared with state-of-the-art methods, which contain the variations of CNN, traditional, and other deep learning methods. The experiment of performance analysis about DV-CNN itself is also carried out. The experimental results demonstrate that DV-CNN outperforms state-of-the-art methods for spectral-spatial classification and it is also robust to water-vapor absorption bands. Moreover, reasonable parameters selection is effective to improve classification accuracy.
A Hybrid Sensing Approach for Pure and Adulterated Honey Classification
Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar
2012-01-01
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach able to distinct pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306
Lee, David; Park, Sang-Hoon; Lee, Sang-Goog
2017-10-07
In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain-computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation-maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
NASA Astrophysics Data System (ADS)
Keyport, Ren N.; Oommen, Thomas; Martha, Tapas R.; Sajinkumar, K. S.; Gierke, John S.
2018-02-01
A comparative analysis of landslides detected by pixel-based and object-oriented analysis (OOA) methods was performed using very high-resolution (VHR) remotely sensed aerial images for the San Juan La Laguna, Guatemala, which witnessed widespread devastation during the 2005 Hurricane Stan. A 3-band orthophoto of 0.5 m spatial resolution together with a 115 field-based landslide inventory were used for the analysis. A binary reference was assigned with a zero value for landslide and unity for non-landslide pixels. The pixel-based analysis was performed using unsupervised classification, which resulted in 11 different trial classes. Detection of landslides using OOA includes 2-step K-means clustering to eliminate regions based on brightness; elimination of false positives using object properties such as rectangular fit, compactness, length/width ratio, mean difference of objects, and slope angle. Both overall accuracy and F-score for OOA methods outperformed pixel-based unsupervised classification methods in both landslide and non-landslide classes. The overall accuracy for OOA and pixel-based unsupervised classification was 96.5% and 94.3%, respectively, whereas the best F-score for landslide identification for OOA and pixel-based unsupervised methods: were 84.3% and 77.9%, respectively.Results indicate that the OOA is able to identify the majority of landslides with a few false positive when compared to pixel-based unsupervised classification.
Sobol-Shikler, Tal; Robinson, Peter
2010-07-01
We present a classification algorithm for inferring affective states (emotions, mental states, attitudes, and the like) from their nonverbal expressions in speech. It is based on the observations that affective states can occur simultaneously and different sets of vocal features, such as intonation and speech rate, distinguish between nonverbal expressions of different affective states. The input to the inference system was a large set of vocal features and metrics that were extracted from each utterance. The classification algorithm conducted independent pairwise comparisons between nine affective-state groups. The classifier used various subsets of metrics of the vocal features and various classification algorithms for different pairs of affective-state groups. Average classification accuracy of the 36 pairwise machines was 75 percent, using 10-fold cross validation. The comparison results were consolidated into a single ranked list of the nine affective-state groups. This list was the output of the system and represented the inferred combination of co-occurring affective states for the analyzed utterance. The inference accuracy of the combined machine was 83 percent. The system automatically characterized over 500 affective state concepts from the Mind Reading database. The inference of co-occurring affective states was validated by comparing the inferred combinations to the lexical definitions of the labels of the analyzed sentences. The distinguishing capabilities of the system were comparable to human performance.
Efficiency of the spectral-spatial classification of hyperspectral imaging data
NASA Astrophysics Data System (ADS)
Borzov, S. M.; Potaturkin, O. I.
2017-01-01
The efficiency of methods of the spectral-spatial classification of similarly looking types of vegetation on the basis of hyperspectral data of remote sensing of the Earth, which take into account local neighborhoods of analyzed image pixels, is experimentally studied. Algorithms that involve spatial pre-processing of the raw data and post-processing of pixel-based spectral classification maps are considered. Results obtained both for a large-size hyperspectral image and for its test fragment with different methods of training set construction are reported. The classification accuracy in all cases is estimated through comparisons of ground-truth data and classification maps formed by using the compared methods. The reasons for the differences in these estimates are discussed.
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics was used as feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers as well as useŕs and produceŕs accuracy.
NASA Astrophysics Data System (ADS)
O'Neil, Gina L.; Goodall, Jonathan L.; Watson, Layne T.
2018-04-01
Wetlands are important ecosystems that provide many ecological benefits, and their quality and presence are protected by federal regulations. These regulations require wetland delineations, which can be costly and time-consuming to perform. Computer models can assist in this process, but lack the accuracy necessary for environmental planning-scale wetland identification. In this study, the potential for improvement of wetland identification models through modification of digital elevation model (DEM) derivatives, derived from high-resolution and increasingly available light detection and ranging (LiDAR) data, at a scale necessary for small-scale wetland delineations is evaluated. A novel approach of flow convergence modelling is presented where Topographic Wetness Index (TWI), curvature, and Cartographic Depth-to-Water index (DTW), are modified to better distinguish wetland from upland areas, combined with ancillary soil data, and used in a Random Forest classification. This approach is applied to four study sites in Virginia, implemented as an ArcGIS model. The model resulted in significant improvement in average wetland accuracy compared to the commonly used National Wetland Inventory (84.9% vs. 32.1%), at the expense of a moderately lower average non-wetland accuracy (85.6% vs. 98.0%) and average overall accuracy (85.6% vs. 92.0%). From this, we concluded that modifying TWI, curvature, and DTW provides more robust wetland and non-wetland signatures to the models by improving accuracy rates compared to classifications using the original indices. The resulting ArcGIS model is a general tool able to modify these local LiDAR DEM derivatives based on site characteristics to identify wetlands at a high resolution.
Posture Detection Based on Smart Cushion for Wheelchair Users
Ma, Congcong; Li, Wenfeng; Gravina, Raffaele; Fortino, Giancarlo
2017-01-01
The postures of wheelchair users can reveal their sitting habit, mood, and even predict health risks such as pressure ulcers or lower back pain. Mining the hidden information of the postures can reveal their wellness and general health conditions. In this paper, a cushion-based posture recognition system is used to process pressure sensor signals for the detection of user’s posture in the wheelchair. The proposed posture detection method is composed of three main steps: data level classification for posture detection, backward selection of sensor configuration, and recognition results compared with previous literature. Five supervised classification techniques—Decision Tree (J48), Support Vector Machines (SVM), Multilayer Perceptron (MLP), Naive Bayes, and k-Nearest Neighbor (k-NN)—are compared in terms of classification accuracy, precision, recall, and F-measure. Results indicate that the J48 classifier provides the highest accuracy compared to other techniques. The backward selection method was used to determine the best sensor deployment configuration of the wheelchair. Several kinds of pressure sensor deployments are compared and our new method of deployment is shown to better detect postures of the wheelchair users. Performance analysis also took into account the Body Mass Index (BMI), useful for evaluating the robustness of the method across individual physical differences. Results show that our proposed sensor deployment is effective, achieving 99.47% posture recognition accuracy. Our proposed method is very competitive for posture recognition and robust in comparison with other former research. Accurate posture detection represents a fundamental basic block to develop several applications, including fatigue estimation and activity level assessment. PMID:28353684
Shermeyer, Jacob S.; Haack, Barry N.
2015-01-01
Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.
Single-accelerometer-based daily physical activity classification.
Long, Xi; Yin, Bin; Aarts, Ronald M
2009-01-01
In this study, a single tri-axial accelerometer placed on the waist was used to record the acceleration data for human physical activity classification. The data collection involved 24 subjects performing daily real-life activities in a naturalistic environment without researchers' intervention. For the purpose of assessing customers' daily energy expenditure, walking, running, cycling, driving, and sports were chosen as target activities for classification. This study compared a Bayesian classification with that of a Decision Tree based approach. A Bayes classifier has the advantage to be more extensible, requiring little effort in classifier retraining and software update upon further expansion or modification of the target activities. Principal components analysis was applied to remove the correlation among features and to reduce the feature vector dimension. Experiments using leave-one-subject-out and 10-fold cross validation protocols revealed a classification accuracy of approximately 80%, which was comparable with that obtained by a Decision Tree classifier.
A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance
2013-01-01
Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928
A neuro-fuzzy approach in the classification of students' academic performance.
Do, Quang Hung; Chen, Jeng-Fung
2013-01-01
Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions.
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction through remote sensing data is important for policy and decision makers as extracted information provide base layers for many application of real world. Classification of remotely sensed data is the one of the most common methods of extracting information however it is still a challenging issue because several factors are affecting the accuracy of the classification. Resolution of the imagery, number and homogeneity of land cover classes, purity of training data and characteristic of adopted classifiers are just some of these challenging factors. Object based image classification has some superiority than pixel based classification for high resolution images since it uses geometry and structure information besides spectral information. Vegetation indices are also commonly used for the classification process since it provides additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increase the classification accuracy from 79,96% to 86,80% as overall accuracy, however NDVI decrease the classification accuracy from 79,96% to 78,90%. Moreover it is proven than object based classification with RapidEye data give promising results for crop type mapping and analysis.
Bolin, Jocelyn Holden; Finch, W Holmes
2014-01-01
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions.
NASA Astrophysics Data System (ADS)
Othman, Arsalan; Gloaguen, Richard
2015-04-01
Topographic effects and complex vegetation cover hinder lithology classification in mountain regions based not only in field, but also in reflectance remote sensing data. The area of interest "Bardi-Zard" is located in the NE of Iraq. It is part of the Zagros orogenic belt, where seven lithological units outcrop and is known for its chromite deposit. The aim of this study is to compare three machine learning algorithms (MLAs): Maximum Likelihood (ML), Support Vector Machines (SVM), and Random Forest (RF) in the context of a supervised lithology classification task using Advanced Space-borne Thermal Emission and Reflection radiometer (ASTER) satellite, its derived, spatial information (spatial coordinates) and geomorphic data. We emphasize the enhancement in remote sensing lithological mapping accuracy that arises from the integration of geomorphic features and spatial information (spatial coordinates) in classifications. This study identifies that RF is better than ML and SVM algorithms in almost the sixteen combination datasets, which were tested. The overall accuracy of the best dataset combination with the RF map for the all seven classes reach ~80% and the producer and user's accuracies are ~73.91% and 76.09% respectively while the kappa coefficient is ~0.76. TPI is more effective with SVM algorithm than an RF algorithm. This paper demonstrates that adding geomorphic indices such as TPI and spatial information in the dataset increases the lithological classification accuracy.
Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information
NASA Astrophysics Data System (ADS)
Jamshidpour, N.; Homayouni, S.; Safari, A.
2017-09-01
Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attentions. For example, the lack of enough labeled samples and the high dimensionality problem are two most important issues which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues by the contribution of unlabeled samples, which are available in an enormous amount. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce.When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.
Allen, Y.C.; Wilson, C.A.; Roberts, H.H.; Supan, J.
2005-01-01
Sidescan sonar holds great promise as a tool to quantitatively depict the distribution and extent of benthic habitats in Louisiana's turbid estuaries. In this study, we describe an effective protocol for acoustic sampling in this environment. We also compared three methods of classification in detail: mean-based thresholding, supervised, and unsupervised techniques to classify sidescan imagery into categories of mud and shell. Classification results were compared to ground truth results using quadrat and dredge sampling. Supervised classification gave the best overall result (kappa = 75%) when compared to quadrat results. Classification accuracy was less robust when compared to all dredge samples (kappa = 21-56%), but increased greatly (90-100%) when only dredge samples taken from acoustically homogeneous areas were considered. Sidescan sonar when combined with ground truth sampling at an appropriate scale can be effectively used to establish an accurate substrate base map for both research applications and shellfish management. The sidescan imagery presented here also provides, for the first time, a detailed presentation of oyster habitat patchiness and scale in a productive oyster growing area.
Evaluation of airborne image data for mapping riparian vegetation within the Grand Canyon
Davis, Philip A.; Staid, Matthew I.; Plescia, Jeffrey B.; Johnson, Jeffrey R.
2002-01-01
This study examined various types of remote-sensing data that have been acquired during a 12-month period over a portion of the Colorado River corridor to determine the type of data and conditions for data acquisition that provide the optimum classification results for mapping riparian vegetation. Issues related to vegetation mapping included time of year, number and positions of wavelength bands, and spatial resolution for data acquisition to produce accurate vegetation maps versus cost of data. Image data considered in the study consisted of scanned color-infrared (CIR) film, digital CIR, and digital multispectral data, whose resolutions from 11 cm (photographic film) to 100 cm (multispectral), that were acquired during the Spring, Summer, and Fall seasons in 2000 for five long-term monitoring sites containing riparian vegetation. Results show that digitally acquired data produce higher and more consistent classification accuracies for mapping vegetation units than do film products. The highest accuracies were obtained from nine-band multispectral data; however, a four-band subset of these data, that did not include short-wave infrared bands, produced comparable mapping results. The four-band subset consisted of the wavelength bands 0.52-0.59 µm, 0.59-0.62 µm, 0.67-0.72 µm, and 0.73-0.85 µm. Use of only three of these bands that simulate digital CIR sensors produced accuracies for several vegetation units that were 10% lower than those obtained using the full multispectral data set. Classification tests using band ratios produced lower accuracies than those using band reflectance for scanned film data; a result attributed to the relatively poor radiometric fidelity maintained by the film scanning process, whereas calibrated multispectral data produced similar classification accuracies using band reflectance and band ratios. This suggests that the intrinsic band reflectance of the vegetation is more important than inter-band reflectance differences in attaining high mapping accuracies. These results also indicate that radiometrically calibrated sensors that record a wide range of radiance produce superior results and that such sensors should be used for monitoring purposes. When texture (spatial variance) at near-infrared wavelength is combined with spectral data in classification, accuracy increased most markedly (20-30%) for the highest resolution (11-cm) CIR film data, but decreased in its effect on accuracy in lower-resolution multi-spectral image data; a result observed in previous studies (Franklin and McDermid 1993, Franklin et al. 2000, 2001). While many classification unit accuracies obtained from the 11-cm film CIR band with texture data were in fact higher than those produced using the 100-cm, nine-band multispectral data with texture, the 11-cm film CIR data produced much lower accuracies than the 100-cm multispectral data for the more sparsely populated vegetation units due to saturation of picture elements during the film scanning process in vegetation units with a high proportion of alluvium. Overall classification accuracies obtained from spectral band and texture data range from 36% to 78% for all databases considered, from 57% to 71% for the 11-cm film CIR data, and from 54% to 78% for the 100-cm multispectral data. Classification results obtained from 20-cm film CIR band and texture data, which were produced by applying a Gaussian filter to the 11-cm film CIR data, showed increases in accuracy due to texture that were similar to those observed using the original 11-cm film CIR data. This suggests that data can be collected at the lower resolution and still retain the added power of vegetation texture. Classification accuracies for the riparian vegetation units examined in this study do not appear to be influenced by season of data acquisition, although data acquired under direct sunlight produced higher overall accuracies than data acquired under overcast conditions. The latter observation, in addition to the importance of band reflectance for classification, implies that data should be acquired near summer solstice when sun elevation and reflectance is highest and when shadows cast by steep canyon walls are minimized.
NASA Technical Reports Server (NTRS)
Ackleson, S. G.; Klemas, V.
1987-01-01
Landsat MSS and TM imagery, obtained simultaneously over Guinea Marsh, VA, as analyzed and compares for its ability to detect submerged aquatic vegetation (SAV). An unsupervised clustering algorithm was applied to each image, where the input classification parameters are defined as functions of apparent sensor noise. Class confidence and accuracy were computed for all water areas by comparing the classified images, pixel-by-pixel, to rasterized SAV distributions derived from color aerial photography. To illustrate the effect of water depth on classification error, areas of depth greater than 1.9 m were masked, and class confidence and accuracy recalculated. A single-scattering radiative-transfer model is used to illustrate how percent canopy cover and water depth affect the volume reflectance from a water column containing SAV. For a submerged canopy that is morphologically and optically similar to Zostera marina inhabiting Lower Chesapeake Bay, dense canopies may be isolated by masking optically deep water. For less dense canopies, the effect of increasing water depth is to increase the apparent percent crown cover, which may result in classification error.
Object-oriented crop mapping and monitoring using multi-temporal polarimetric RADARSAT-2 data
NASA Astrophysics Data System (ADS)
Jiao, Xianfeng; Kovacs, John M.; Shang, Jiali; McNairn, Heather; Walters, Dan; Ma, Baoluo; Geng, Xiaoyuan
2014-10-01
The aim of this paper is to assess the accuracy of an object-oriented classification of polarimetric Synthetic Aperture Radar (PolSAR) data to map and monitor crops using 19 RADARSAT-2 fine beam polarimetric (FQ) images of an agricultural area in North-eastern Ontario, Canada. Polarimetric images and field data were acquired during the 2011 and 2012 growing seasons. The classification and field data collection focused on the main crop types grown in the region, which include: wheat, oat, soybean, canola and forage. The polarimetric parameters were extracted with PolSAR analysis using both the Cloude-Pottier and Freeman-Durden decompositions. The object-oriented classification, with a single date of PolSAR data, was able to classify all five crop types with an accuracy of 95% and Kappa of 0.93; a 6% improvement in comparison with linear-polarization only classification. However, the time of acquisition is crucial. The larger biomass crops of canola and soybean were most accurately mapped, whereas the identification of oat and wheat were more variable. The multi-temporal data using the Cloude-Pottier decomposition parameters provided the best classification accuracy compared to the linear polarizations and the Freeman-Durden decomposition parameters. In general, the object-oriented classifications were able to accurately map crop types by reducing the noise inherent in the SAR data. Furthermore, using the crop classification maps we were able to monitor crop growth stage based on a trend analysis of the radar response. Based on field data from canola crops, there was a strong relationship between the phenological growth stage based on the BBCH scale, and the HV backscatter and entropy.
Karan, Shivesh Kishore; Samadder, Sukha Ranjan
2016-08-01
One objective of the present study was to evaluate the performance of support vector machine (SVM)-based image classification technique with the maximum likelihood classification (MLC) technique for a rapidly changing landscape of an open-cast mine. The other objective was to assess the change in land use pattern due to coal mining from 2006 to 2016. Assessing the change in land use pattern accurately is important for the development and monitoring of coalfields in conjunction with sustainable development. For the present study, Landsat 5 Thematic Mapper (TM) data of 2006 and Landsat 8 Operational Land Imager (OLI)/Thermal Infrared Sensor (TIRS) data of 2016 of a part of Jharia Coalfield, Dhanbad, India, were used. The SVM classification technique provided greater overall classification accuracy when compared to the MLC technique in classifying heterogeneous landscape with limited training dataset. SVM exceeded MLC in handling a difficult challenge of classifying features having near similar reflectance on the mean signature plot, an improvement of over 11 % was observed in classification of built-up area, and an improvement of 24 % was observed in classification of surface water using SVM; similarly, the SVM technique improved the overall land use classification accuracy by almost 6 and 3 % for Landsat 5 and Landsat 8 images, respectively. Results indicated that land degradation increased significantly from 2006 to 2016 in the study area. This study will help in quantifying the changes and can also serve as a basis for further decision support system studies aiding a variety of purposes such as planning and management of mines and environmental impact assessment.
An assessment of the effectiveness of a random forest classifier for land-cover classification
NASA Astrophysics Data System (ADS)
Rodriguez-Galiano, V. F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. P.
2012-01-01
Land cover monitoring using remotely sensed data requires robust classification methods which allow for the accurate mapping of complex land cover and land use categories. Random forest (RF) is a powerful machine learning classifier that is relatively unknown in land remote sensing and has not been evaluated thoroughly by the remote sensing community compared to more conventional pattern recognition techniques. Key advantages of RF include: their non-parametric nature; high classification accuracy; and capability to determine variable importance. However, the split rules for classification are unknown, therefore RF can be considered to be black box type classifier. RF provides an algorithm for estimating missing values; and flexibility to perform several types of data analysis, including regression, classification, survival analysis, and unsupervised learning. In this paper, the performance of the RF classifier for land cover classification of a complex area is explored. Evaluation was based on several criteria: mapping accuracy, sensitivity to data set size and noise. Landsat-5 Thematic Mapper data captured in European spring and summer were used with auxiliary variables derived from a digital terrain model to classify 14 different land categories in the south of Spain. Results show that the RF algorithm yields accurate land cover classifications, with 92% overall accuracy and a Kappa index of 0.92. RF is robust to training data reduction and noise because significant differences in kappa values were only observed for data reduction and noise addition values greater than 50 and 20%, respectively. Additionally, variables that RF identified as most important for classifying land cover coincided with expectations. A McNemar test indicates an overall better performance of the random forest model over a single decision tree at the 0.00001 significance level.
Douglas, P K; Harris, Sam; Yuille, Alan; Cohen, Mark S
2011-05-15
Machine learning (ML) has become a popular tool for mining functional neuroimaging data, and there are now hopes of performing such analyses efficiently in real-time. Towards this goal, we compared accuracy of six different ML algorithms applied to neuroimaging data of persons engaged in a bivariate task, asserting their belief or disbelief of a variety of propositional statements. We performed unsupervised dimension reduction and automated feature extraction using independent component (IC) analysis and extracted IC time courses. Optimization of classification hyperparameters across each classifier occurred prior to assessment. Maximum accuracy was achieved at 92% for Random Forest, followed by 91% for AdaBoost, 89% for Naïve Bayes, 87% for a J48 decision tree, 86% for K*, and 84% for support vector machine. For real-time decoding applications, finding a parsimonious subset of diagnostic ICs might be useful. We used a forward search technique to sequentially add ranked ICs to the feature subspace. For the current data set, we determined that approximately six ICs represented a meaningful basis set for classification. We then projected these six IC spatial maps forward onto a later scanning session within subject. We then applied the optimized ML algorithms to these new data instances, and found that classification accuracy results were reproducible. Additionally, we compared our classification method to our previously published general linear model results on this same data set. The highest ranked IC spatial maps show similarity to brain regions associated with contrasts for belief > disbelief, and disbelief < belief. Copyright © 2010 Elsevier Inc. All rights reserved.
Yoon, Jong H.; Tamir, Diana; Minzenberg, Michael J.; Ragland, J. Daniel; Ursu, Stefan; Carter, Cameron S.
2009-01-01
Background Multivariate pattern analysis is an alternative method of analyzing fMRI data, which is capable of decoding distributed neural representations. We applied this method to test the hypothesis of the impairment in distributed representations in schizophrenia. We also compared the results of this method with traditional GLM-based univariate analysis. Methods 19 schizophrenia and 15 control subjects viewed two runs of stimuli--exemplars of faces, scenes, objects, and scrambled images. To verify engagement with stimuli, subjects completed a 1-back matching task. A multi-voxel pattern classifier was trained to identify category-specific activity patterns on one run of fMRI data. Classification testing was conducted on the remaining run. Correlation of voxel-wise activity across runs evaluated variance over time in activity patterns. Results Patients performed the task less accurately. This group difference was reflected in the pattern analysis results with diminished classification accuracy in patients compared to controls, 59% and 72% respectively. In contrast, there was no group difference in GLM-based univariate measures. In both groups, classification accuracy was significantly correlated with behavioral measures. Both groups showed highly significant correlation between inter-run correlations and classification accuracy. Conclusions Distributed representations of visual objects are impaired in schizophrenia. This impairment is correlated with diminished task performance, suggesting that decreased integrity of cortical activity patterns is reflected in impaired behavior. Comparisons with univariate results suggest greater sensitivity of pattern analysis in detecting group differences in neural activity and reduced likelihood of non-specific factors driving these results. PMID:18822407
Roach, Jennifer K.; Griffith, Brad; Verbyla, David
2012-01-01
Programs to monitor lake area change are becoming increasingly important in high latitude regions, and their development often requires evaluating tradeoffs among different approaches in terms of accuracy of measurement, consistency across multiple users over long time periods, and efficiency. We compared three supervised methods for lake classification from Landsat imagery (density slicing, classification trees, and feature extraction). The accuracy of lake area and number estimates was evaluated relative to high-resolution aerial photography acquired within two days of satellite overpasses. The shortwave infrared band 5 was better at separating surface water from nonwater when used alone than when combined with other spectral bands. The simplest of the three methods, density slicing, performed best overall. The classification tree method resulted in the most omission errors (approx. 2x), feature extraction resulted in the most commission errors (approx. 4x), and density slicing had the least directional bias (approx. half of the lakes with overestimated area and half of the lakes with underestimated area). Feature extraction was the least consistent across training sets (i.e., large standard error among different training sets). Density slicing was the best of the three at classifying small lakes as evidenced by its lower optimal minimum lake size criterion of 5850 m2 compared with the other methods (8550 m2). Contrary to conventional wisdom, the use of additional spectral bands and a more sophisticated method not only required additional processing effort but also had a cost in terms of the accuracy and consistency of lake classifications.
NASA Astrophysics Data System (ADS)
Liu, F.; Chen, T.; He, J.; Wen, Q.; Yu, F.; Gu, X.; Wang, Z.
2018-04-01
In recent years, the quick upgrading and improvement of SAR sensors provide beneficial complements for the traditional optical remote sensing in the aspects of theory, technology and data. In this paper, Sentinel-1A SAR data and GF-1 optical data were selected for image fusion, and more emphases were put on the dryland crop classification under a complex crop planting structure, regarding corn and cotton as the research objects. Considering the differences among various data fusion methods, the principal component analysis (PCA), Gram-Schmidt (GS), Brovey and wavelet transform (WT) methods were compared with each other, and the GS and Brovey methods were proved to be more applicable in the study area. Then, the classification was conducted based on the object-oriented technique process. And for the GS, Brovey fusion images and GF-1 optical image, the nearest neighbour algorithm was adopted to realize the supervised classification with the same training samples. Based on the sample plots in the study area, the accuracy assessment was conducted subsequently. The values of overall accuracy and kappa coefficient of fusion images were all higher than those of GF-1 optical image, and GS method performed better than Brovey method. In particular, the overall accuracy of GS fusion image was 79.8 %, and the Kappa coefficient was 0.644. Thus, the results showed that GS and Brovey fusion images were superior to optical images for dryland crop classification. This study suggests that the fusion of SAR and optical images is reliable for dryland crop classification under a complex crop planting structure.
SENTINEL-1 and SENTINEL-2 Data Fusion for Wetlands Mapping: Balikdami, Turkey
NASA Astrophysics Data System (ADS)
Kaplan, G.; Avdan, U.
2018-04-01
Wetlands provide a number of environmental and socio-economic benefits such as their ability to store floodwaters and improve water quality, providing habitats for wildlife and supporting biodiversity, as well as aesthetic values. Remote sensing technology has proven to be a useful and frequent application in monitoring and mapping wetlands. Combining optical and microwave satellite data can help with mapping and monitoring the biophysical characteristics of wetlands and wetlands` vegetation. Also, fusing radar and optical remote sensing data can increase the wetland classification accuracy. In this paper, data from the fine spatial resolution optical satellite, Sentinel-2 and the Synthetic Aperture Radar Satellite, Sentinel-1, were fused for mapping wetlands. Both Sentinel-1 and Sentinel-2 images were pre-processed. After the pre-processing, vegetation indices were calculated using the Sentinel-2 bands and the results were included in the fusion data set. For the classification of the fused data, three different classification approaches were used and compared. The results showed significant improvement in the wetland classification using both multispectral and microwave data. Also, the presence of the red edge bands and the vegetation indices used in the data set showed significant improvement in the discrimination between wetlands and other vegetated areas. The statistical results of the fusion of the optical and radar data showed high wetland mapping accuracy, showing an overall classification accuracy of approximately 90 % in the object-based classification method. For future research, we recommend multi-temporal image use, terrain data collection, as well as a comparison of the used method with the traditional image fusion techniques.
Real-Time Food Authentication Using a Miniature Mass Spectrometer.
Gerbig, Stefanie; Neese, Stephan; Penner, Alexander; Spengler, Bernhard; Schulz, Sabine
2017-10-17
Food adulteration is a threat to public health and the economy. In order to determine food adulteration efficiently, rapid and easy-to-use on-site analytical methods are needed. In this study, a miniaturized mass spectrometer in combination with three ambient ionization methods was used for food authentication. The chemical fingerprints of three milk types, five fish species, and two coffee types were measured using electrospray ionization, desorption electrospray ionization, and low temperature plasma ionization. Minimum sample preparation was needed for the analysis of liquid and solid food samples. Mass spectrometric data was processed using the laboratory-built software MS food classifier, which allows for the definition of specific food profiles from reference data sets using multivariate statistical methods and the subsequent classification of unknown data. Applicability of the obtained mass spectrometric fingerprints for food authentication was evaluated using different data processing methods, leave-10%-out cross-validation, and real-time classification of new data. Classification accuracy of 100% was achieved for the differentiation of milk types and fish species, and a classification accuracy of 96.4% was achieved for coffee types in cross-validation experiments. Measurement of two milk mixtures yielded correct classification of >94%. For real-time classification, the accuracies were comparable. Functionality of the software program and its performance is described. Processing time for a reference data set and a newly acquired spectrum was found to be 12 s and 2 s, respectively. These proof-of-principle experiments show that the combination of a miniaturized mass spectrometer, ambient ionization, and statistical analysis is suitable for on-site real-time food authentication.
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A
2015-06-01
Naturally inspired evolutionary algorithms prove effectiveness when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the used of a Genetic Algorithm (GA) along with Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are use, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Graumann, Ole; Osther, Susanne Sloth; Karstoft, Jens; Hørlyck, Arne; Osther, Palle Jörn Sloth
2016-11-01
Background The Bosniak classification was originally based on computed tomographic (CT) findings. Magnetic resonance (MR) and contrast-enhanced ultrasonography (CEUS) imaging may demonstrate findings that are not depicted at CT, and there may not always be a clear correlation between the findings at MR and CEUS imaging and those at CT. Purpose To compare diagnostic accuracy of MR, CEUS, and CT when categorizing complex renal cystic masses according to the Bosniak classification. Material and Methods From February 2011 to June 2012, 46 complex renal cysts were prospectively evaluated by three readers. Each mass was categorized according to the Bosniak classification and CT was chosen as gold standard. Kappa was calculated for diagnostic accuracy and data was compared with pathological results. Results CT images found 27 BII, six BIIF, seven BIII, and six BIV. Forty-three cysts could be characterized by CEUS, 79% were in agreement with CT (κ = 0.86). Five BII lesions were upgraded to BIIF and four lesions were categorized lower with CEUS. Forty-one lesions were examined with MR; 78% were in agreement with CT (κ = 0.91). Three BII lesions were upgraded to BIIF and six lesions were categorized one category lower. Pathologic correlation in six lesions revealed four malignant and two benign lesions. Conclusion CEUS and MR both up- and downgraded renal cysts compared to CT, and until these non-radiation modalities have been refined and adjusted, CT should remain the gold standard of the Bosniak classification.
NASA Technical Reports Server (NTRS)
Dixon, C. M.
1981-01-01
Land cover information derived from LANDSAT is being utilized by Piedmont Planning District Commission located in the State of Virginia. Progress to date is reported on a level one land cover classification map being produced with nine categories. The nine categories of classification are defined. The computer compatible tape selection is presented. Two unsupervised classifications were done, with 50 and 70 classes respectively. Twenty-eight spectral classes were developed using the supervised technique, employing actual ground truth training sites. The accuracy of the unsupervised classifications are estimated through comparison with local county statistics and with an actual pixel count of LANDSAT information compared to ground truth.
Online clustering algorithms for radar emitter classification.
Liu, Jun; Lee, Jim P Y; Senior; Li, Lingjie; Luo, Zhi-Quan; Wong, K Max
2005-08-01
Radar emitter classification is a special application of data clustering for classifying unknown radar emitters from received radar pulse samples. The main challenges of this task are the high dimensionality of radar pulse samples, small sample group size, and closely located radar pulse clusters. In this paper, two new online clustering algorithms are developed for radar emitter classification: One is model-based using the Minimum Description Length (MDL) criterion and the other is based on competitive learning. Computational complexity is analyzed for each algorithm and then compared. Simulation results show the superior performance of the model-based algorithm over competitive learning in terms of better classification accuracy, flexibility, and stability.
Comparison of accelerometer cut points for predicting activity intensity in youth.
Trost, Stewart G; Loprinzi, Paul D; Moore, Rebecca; Pfeiffer, Karin A
2011-07-01
The absence of comparative validity studies has prevented researchers from reaching consensus regarding the application of intensity-related accelerometer cut points for children and adolescents. This study aimed to evaluate the classification accuracy of five sets of independently developed ActiGraph cut points using energy expenditure, measured by indirect calorimetry, as a criterion reference standard. A total of 206 participants between the ages of 5 and 15 yr completed 12 standardized activity trials. Trials consisted of sedentary activities (lying down, writing, computer game), lifestyle activities (sweeping, laundry, throw and catch, aerobics, basketball), and ambulatory activities (comfortable walk, brisk walk, brisk treadmill walk, running). During each trial, participants wore an ActiGraph GT1M, and V˙O2 was measured breath-by-breath using the Oxycon Mobile portable metabolic system. Physical activity intensity was estimated using five independently developed cut points: Freedson/Trost (FT), Puyau (PU), Treuth (TR), Mattocks (MT), and Evenson (EV). Classification accuracy was evaluated via weighted κ statistics and area under the receiver operating characteristic curve (ROC-AUC). Across all four intensity levels, the EV (κ=0.68) and FT (κ=0.66) cut points exhibited significantly better agreement than TR (κ=0.62), MT (κ=0.54), and PU (κ=0.36). The EV and FT cut points exhibited significantly better classification accuracy for moderate- to vigorous-intensity physical activity (ROC-AUC=0.90) than TR, PU, or MT cut points (ROC-AUC=0.77-0.85). Only the EV cut points provided acceptable classification accuracy for all four levels of physical activity intensity and performed well among children of all ages. The widely applied sedentary cut point of 100 counts per minute exhibited excellent classification accuracy (ROC-AUC=0.90). On the basis of these findings, we recommend that researchers use the EV ActiGraph cut points to estimate time spent in sedentary, light-, moderate-, and vigorous-intensity activity in children and adolescents.
NASA Astrophysics Data System (ADS)
Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.
2016-03-01
The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decisionmaking regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically-relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracy of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care, this technology may improve the standard of burn care for patients without access to specialized facilities.
Analysis on the Utility of Satellite Imagery for Detection of Agricultural Facility
NASA Astrophysics Data System (ADS)
Kang, J.-M.; Baek, S.-H.; Jung, K.-Y.
2012-07-01
Now that the agricultural facilities are being increase owing to development of technology and diversification of agriculture and the ratio of garden crops that are imported a lot and the crops cultivated in facilities are raised in Korea, the number of vinyl greenhouses is tending upward. So, it is important to grasp the distribution of vinyl greenhouses as much as that of rice fields, dry fields and orchards, but it is difficult to collect the information of wide areas economically and correctly. Remote sensing using satellite imagery is able to obtain data of wide area at the same time, quickly and cost-effectively collect, monitor and analyze information from every object on earth. In this study, in order to analyze the utilization of satellite imagery at detection of agricultural facility, image classification was performed about the agricultural facility, vinyl greenhouse using Formosat-2 satellite imagery. The training set of sea, vegetation, building, bare ground and vinyl greenhouse was set to monitor the agricultural facilities of the object area and the training set for the vinyl greenhouses that are main monitoring object was classified and set again into 3 types according the spectral characteristics. The image classification using 4 kinds of supervise classification methods applied by the same training set were carried out to grasp the image classification method which is effective for monitoring agricultural facilities. And, in order to minimize the misclassification appeared in the classification using the spectral information, the accuracy of classification was intended to be raised by adding texture information. The results of classification were analyzed regarding the accuracy comparing with that of naked-eyed detection. As the results of classification, the method of Mahalanobis distance was shown as more efficient than other methods and the accuracy of classification was higher when adding texture information. Hence the more effective monitoring of agricultural facilities is expected to be available if the characteristics such as texture information including satellite images or spatial pattern are studied in detail.
Classification Accuracy Increase Using Multisensor Data Fusion
NASA Astrophysics Data System (ADS)
Makarau, A.; Palubinskas, G.; Reinartz, P.
2011-09-01
The practical use of very high resolution visible and near-infrared (VNIR) data is still growing (IKONOS, Quickbird, GeoEye-1, etc.) but for classification purposes the number of bands is limited in comparison to full spectral imaging. These limitations may lead to the confusion of materials such as different roofs, pavements, roads, etc. and therefore may provide wrong interpretation and use of classification products. Employment of hyperspectral data is another solution, but their low spatial resolution (comparing to multispectral data) restrict their usage for many applications. Another improvement can be achieved by fusion approaches of multisensory data since this may increase the quality of scene classification. Integration of Synthetic Aperture Radar (SAR) and optical data is widely performed for automatic classification, interpretation, and change detection. In this paper we present an approach for very high resolution SAR and multispectral data fusion for automatic classification in urban areas. Single polarization TerraSAR-X (SpotLight mode) and multispectral data are integrated using the INFOFUSE framework, consisting of feature extraction (information fission), unsupervised clustering (data representation on a finite domain and dimensionality reduction), and data aggregation (Bayesian or neural network). This framework allows a relevant way of multisource data combination following consensus theory. The classification is not influenced by the limitations of dimensionality, and the calculation complexity primarily depends on the step of dimensionality reduction. Fusion of single polarization TerraSAR-X, WorldView-2 (VNIR or full set), and Digital Surface Model (DSM) data allow for different types of urban objects to be classified into predefined classes of interest with increased accuracy. The comparison to classification results of WorldView-2 multispectral data (8 spectral bands) is provided and the numerical evaluation of the method in comparison to other established methods illustrates the advantage in the classification accuracy for many classes such as buildings, low vegetation, sport objects, forest, roads, rail roads, etc.
NASA Astrophysics Data System (ADS)
Hammann, Mark Gregory
The fusion of electro-optical (EO) multi-spectral satellite imagery with Synthetic Aperture Radar (SAR) data was explored with the working hypothesis that the addition of multi-band SAR will increase the land-cover (LC) classification accuracy compared to EO alone. Three satellite sources for SAR imagery were used: X-band from TerraSAR-X, C-band from RADARSAT-2, and L-band from PALSAR. Images from the RapidEye satellites were the source of the EO imagery. Imagery from the GeoEye-1 and WorldView-2 satellites aided the selection of ground truth. Three study areas were chosen: Wad Medani, Sudan; Campinas, Brazil; and Fresno- Kings Counties, USA. EO imagery were radiometrically calibrated, atmospherically compensated, orthorectifed, co-registered, and clipped to a common area of interest (AOI). SAR imagery were radiometrically calibrated, and geometrically corrected for terrain and incidence angle by converting to ground range and Sigma Naught (?0). The original SAR HH data were included in the fused image stack after despeckling with a 3x3 Enhanced Lee filter. The variance and Gray-Level-Co-occurrence Matrix (GLCM) texture measures of contrast, entropy, and correlation were derived from the non-despeckled SAR HH bands. Data fusion was done with layer stacking and all data were resampled to a common spatial resolution. The Support Vector Machine (SVM) decision rule was used for the supervised classifications. Similar LC classes were identified and tested for each study area. For Wad Medani, nine classes were tested: low and medium intensity urban, sparse forest, water, barren ground, and four agriculture classes (fallow, bare agricultural ground, green crops, and orchards). For Campinas, Brazil, five generic classes were tested: urban, agriculture, forest, water, and barren ground. For the Fresno-Kings Counties location 11 classes were studied: three generic classes (urban, water, barren land), and eight specific crops. In all cases the addition of SAR to EO resulted in higher overall classification accuracies. In many cases using more than a single SAR band also improved the classification accuracy. There was no single best SAR band for all cases; for specific study areas or LC classes, different SAR bands were better. For Wad Medani, the overall accuracy increased nearly 25% over EO by using all three SAR bands and GLCM texture. For Campinas, the improvement over EO was 4.3%; the large areas of vegetation were classified by EO with good accuracy. At Fresno-Kings Counties, EO+SAR fusion improved the overall classification accuracy by 7%. For times or regions where EO is not available due to extended cloud cover, classification with SAR is often the only option; note that SAR alone typically results in lower classification accuracies than when using EO or EO-SAR fusion. Fusion of EO and SAR was especially important to improve the separability of orchards from other crops, and separating urban areas with buildings from bare soil; those classes are difficult to accurately separate with EO. The outcome of this dissertation contributes to the understanding of the benefits of combining data from EO imagery with different SAR bands and SAR derived texture data to identify different LC classes. In times of increased public and private budget constraints and industry consolidation, this dissertation provides insight as to which band packages could be most useful for increased accuracy in LC classification.
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis.
Myburgh, Hermanus C; van Zijl, Willemien H; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-03-01
Otitis media is one of the most common childhood diseases worldwide, but because of lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious, and life-threatening complications. There is, thus a need for an automated computer based image-analyzing system that could assist in making accurate otitis media diagnoses anywhere. A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low cost custom-made video-otoscope. The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~64% to 80%) using traditional otoscopes, and therefore holds promise for the future in making automated diagnosis of otitis media in medically underserved populations.
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis
Myburgh, Hermanus C.; van Zijl, Willemien H.; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-01-01
Background Otitis media is one of the most common childhood diseases worldwide, but because of lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious, and life-threatening complications. There is, thus a need for an automated computer based image-analyzing system that could assist in making accurate otitis media diagnoses anywhere. Methods A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. Findings An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low cost custom-made video-otoscope. Interpretation The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~ 64% to 80%) using traditional otoscopes, and therefore holds promise for the future in making automated diagnosis of otitis media in medically underserved populations. PMID:27077122
Study design requirements for RNA sequencing-based breast cancer diagnostics.
Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias
2016-02-01
Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.
Will it Blend? Visualization and Accuracy Evaluation of High-Resolution Fuzzy Vegetation Maps
NASA Astrophysics Data System (ADS)
Zlinszky, A.; Kania, A.
2016-06-01
Instead of assigning every map pixel to a single class, fuzzy classification includes information on the class assigned to each pixel but also the certainty of this class and the alternative possible classes based on fuzzy set theory. The advantages of fuzzy classification for vegetation mapping are well recognized, but the accuracy and uncertainty of fuzzy maps cannot be directly quantified with indices developed for hard-boundary categorizations. The rich information in such a map is impossible to convey with a single map product or accuracy figure. Here we introduce a suite of evaluation indices and visualization products for fuzzy maps generated with ensemble classifiers. We also propose a way of evaluating classwise prediction certainty with "dominance profiles" visualizing the number of pixels in bins according to the probability of the dominant class, also showing the probability of all the other classes. Together, these data products allow a quantitative understanding of the rich information in a fuzzy raster map both for individual classes and in terms of variability in space, and also establish the connection between spatially explicit class certainty and traditional accuracy metrics. These map products are directly comparable to widely used hard boundary evaluation procedures, support active learning-based iterative classification and can be applied for operational use.
Comparative utility of LANDSAT-1 and Skylab data for coastal wetland mapping and ecological studies
NASA Technical Reports Server (NTRS)
Anderson, R.; Alsid, L.; Carter, V.
1975-01-01
Skylab 190-A photography and LANDSAT-1 analog data have been analyzed to determine coastal wetland mapping potential as a near term substitute for aircraft data and as a long term monitoring tool. The level of detail and accuracy of each was compared. Skylab data provides more accurate classification of wetland types, better delineation of freshwater marshes and more detailed analysis of drainage patterns. LANDSAT-1 analog data is useful for general classification, boundary definition and monitoring of human impact in wetlands.
Rajasekaran, S; Bhushan, Manindra; Aiyer, Siddharth; Kanna, Rishi; Shetty, Ajoy Prasad
2018-01-09
To develop a classification based on the technical complexity encountered during pedicle screw insertion and to evaluate the performance of AIRO ® CT navigation system based on this classification, in the clinical scenario of complex spinal deformity. 31 complex spinal deformity correction surgeries were prospectively analyzed for performance of AIRO ® mobile CT-based navigation system. Pedicles were classified according to complexity of insertion into five types. Analysis was performed to estimate the accuracy of screw placement and time for screw insertion. Breach greater than 2 mm was considered for analysis. 452 pedicle screws were inserted (T1-T6: 116; T7-T12: 171; L1-S1: 165). The average Cobb angle was 68.3° (range 60°-104°). We had 242 grade 2 pedicles, 133 grade 3, and 77 grade 4, and 44 pedicles were unfit for pedicle screw insertion. We noted 27 pedicle screw breach (medial: 10; lateral: 16; anterior: 1). Among lateral breach (n = 16), ten screws were planned for in-out-in pedicle screw insertion. Among lateral breach (n = 16), ten screws were planned for in-out-in pedicle screw insertion. Average screw insertion time was 1.76 ± 0.89 min. After accounting for planned breach, the effective breach rate was 3.8% resulting in 96.2% accuracy for pedicle screw placement. This classification helps compare the accuracy of screw insertion in range of conditions by considering the complexity of screw insertion. Considering the clinical scenario of complex pedicle anatomy in spinal deformity AIRO ® navigation showed an excellent accuracy rate of 96.2%.
NASA Astrophysics Data System (ADS)
Snavely, Rachel A.
Focusing on the semi-arid and highly disturbed landscape of San Clemente Island, California, this research tests the effectiveness of incorporating a hierarchal object-based image analysis (OBIA) approach with high-spatial resolution imagery and light detection and range (LiDAR) derived canopy height surfaces for mapping vegetation communities. The study is part of a large-scale research effort conducted by researchers at San Diego State University's (SDSU) Center for Earth Systems Analysis Research (CESAR) and Soil Ecology and Restoration Group (SERG), to develop an updated vegetation community map which will support both conservation and management decisions on Naval Auxiliary Landing Field (NALF) San Clemente Island. Trimble's eCognition Developer software was used to develop and generate vegetation community maps for two study sites, with and without vegetation height data as input. Overall and class-specific accuracies were calculated and compared across the two classifications. The highest overall accuracy (approximately 80%) was observed with the classification integrating airborne visible and near infrared imagery having very high spatial resolution with a LiDAR derived canopy height model. Accuracies for individual vegetation classes differed between both classification methods, but were highest when incorporating the LiDAR digital surface data. The addition of a canopy height model, however, yielded little difference in classification accuracies for areas of very dense shrub cover. Overall, the results show the utility of the OBIA approach for mapping vegetation with high spatial resolution imagery, and emphasizes the advantage of both multi-scale analysis and digital surface data for accuracy characterizing highly disturbed landscapes. The integrated imagery and digital canopy height model approach presented both advantages and limitations, which have to be considered prior to its operational use in mapping vegetation communities.
Improving zero-training brain-computer interfaces by mixing model estimators
NASA Astrophysics Data System (ADS)
Verhoeven, T.; Hübner, D.; Tangermann, M.; Müller, K. R.; Dambre, J.; Kindermans, P. J.
2017-06-01
Objective. Brain-computer interfaces (BCI) based on event-related potentials (ERP) incorporate a decoder to classify recorded brain signals and subsequently select a control signal that drives a computer application. Standard supervised BCI decoders require a tedious calibration procedure prior to every session. Several unsupervised classification methods have been proposed that tune the decoder during actual use and as such omit this calibration. Each of these methods has its own strengths and weaknesses. Our aim is to improve overall accuracy of ERP-based BCIs without calibration. Approach. We consider two approaches for unsupervised classification of ERP signals. Learning from label proportions (LLP) was recently shown to be guaranteed to converge to a supervised decoder when enough data is available. In contrast, the formerly proposed expectation maximization (EM) based decoding for ERP-BCI does not have this guarantee. However, while this decoder has high variance due to random initialization of its parameters, it obtains a higher accuracy faster than LLP when the initialization is good. We introduce a method to optimally combine these two unsupervised decoding methods, letting one method’s strengths compensate for the weaknesses of the other and vice versa. The new method is compared to the aforementioned methods in a resimulation of an experiment with a visual speller. Main results. Analysis of the experimental results shows that the new method exceeds the performance of the previous unsupervised classification approaches in terms of ERP classification accuracy and symbol selection accuracy during the spelling experiment. Furthermore, the method shows less dependency on random initialization of model parameters and is consequently more reliable. Significance. Improving the accuracy and subsequent reliability of calibrationless BCIs makes these systems more appealing for frequent use.
NASA Astrophysics Data System (ADS)
Yang, Huijuan; Guan, Cuntai; Sui Geok Chua, Karen; San Chok, See; Wang, Chuan Chu; Kok Soon, Phua; Tang, Christina Ka Yin; Keng Ang, Kai
2014-06-01
Objective. Detection of motor imagery of hand/arm has been extensively studied for stroke rehabilitation. This paper firstly investigates the detection of motor imagery of swallow (MI-SW) and motor imagery of tongue protrusion (MI-Ton) in an attempt to find a novel solution for post-stroke dysphagia rehabilitation. Detection of MI-SW from a simple yet relevant modality such as MI-Ton is then investigated, motivated by the similarity in activation patterns between tongue movements and swallowing and there being fewer movement artifacts in performing tongue movements compared to swallowing. Approach. Novel features were extracted based on the coefficients of the dual-tree complex wavelet transform to build multiple training models for detecting MI-SW. The session-to-session classification accuracy was boosted by adaptively selecting the training model to maximize the ratio of between-classes distances versus within-class distances, using features of training and evaluation data. Main results. Our proposed method yielded averaged cross-validation (CV) classification accuracies of 70.89% and 73.79% for MI-SW and MI-Ton for ten healthy subjects, which are significantly better than the results from existing methods. In addition, averaged CV accuracies of 66.40% and 70.24% for MI-SW and MI-Ton were obtained for one stroke patient, demonstrating the detectability of MI-SW and MI-Ton from the idle state. Furthermore, averaged session-to-session classification accuracies of 72.08% and 70% were achieved for ten healthy subjects and one stroke patient using the MI-Ton model. Significance. These results and the subjectwise strong correlations in classification accuracies between MI-SW and MI-Ton demonstrated the feasibility of detecting MI-SW from MI-Ton models.
Development of remote sensing based site specific weed management for Midwest mint production
NASA Astrophysics Data System (ADS)
Gumz, Mary Saumur Paulson
Peppermint and spearmint are high value essential oil crops in Indiana, Michigan, and Wisconsin. Although the mints are profitable alternatives to corn and soybeans, mint production efficiency must improve in order to allow industry survival against foreign produced oils and synthetic flavorings. Weed control is the major input cost in mint production and tools to increase efficiency are necessary. Remote sensing-based site-specific weed management offers potential for decreasing weed control costs through simplified weed detection and control from accurate site specific weed and herbicide application maps. This research showed the practicability of remote sensing for weed detection in the mints. Research was designed to compare spectral response curves of field grown mint and weeds, and to use these data to develop spectral vegetation indices for automated weed detection. Viability of remote sensing in mint production was established using unsupervised classification, supervised classification, handheld spectroradiometer readings and spectral vegetation indices (SVIs). Unsupervised classification of multispectral images of peppermint production fields generated crop health maps with 92 and 67% accuracy in meadow and row peppermint, respectively. Supervised classification of multispectral images identified weed infestations with 97% and 85% accuracy for meadow and row peppermint, respectively. Supervised classification showed that peppermint was spectrally distinct from weeds, but the accuracy of these measures was dependent on extensive ground referencing which is impractical and too costly for on-farm use. Handheld spectroradiometer measurements of peppermint, spearmint, and several weeds and crop and weed mixtures were taken over three years from greenhouse grown plants, replicated field plots, and production peppermint and spearmint fields. Results showed that mints have greater near infrared (NIR) and lower green reflectance and a steeper red edge slope than all weed species. These distinguishing characteristics were combined to develop narrow band and broadband spectral vegetation indices (SVIs, ratios of NIR/green reflectance), that were effective in differentiating mint from key weed species. Hyperspectral images of production peppermint and spearmint fields were then classified using SVI-based classification. Narrowband and broadband SVIs classified early season peppermint and spearmint with 64 to 100% accuracy compared to 79 to 100% accuracy for supervised classification of multispectral images of the same fields. Broadband SVIs have potential for use as an automated spectral indicator for weeds in the mints since they require minimal ground referencing and can be calculated from multispectral imagery which is cheaper and more readily available than hyperspectral imagery. This research will allow growers to implement remote sensing based site specific weed management in mint resulting in reduced grower input costs and reduced herbicide entry into the environment and will have applications in other specialty and meadow crops.
ERIC Educational Resources Information Center
Duvekot, Jorieke; van der Ende, Jan; Verhulst, Frank C.; Greaves-Lord, Kirstin
2015-01-01
The screening accuracy of the parent and teacher-reported Social Responsiveness Scale (SRS) was compared with an autism spectrum disorder (ASD) classification according to (1) the Developmental, Dimensional, and Diagnostic Interview (3Di), (2) the Autism Diagnostic Observation Schedule (ADOS), (3) both the 3Di and ADOS, in 186 children referred to…
ERIC Educational Resources Information Center
Montoye, Alexander H. K.; Pivarnik, James M.; Mudd, Lanay M.; Biswas, Subir; Pfeiffer, Karin A.
2016-01-01
The purpose of this article is to compare accuracy of activity type prediction models for accelerometers worn on the hip, wrists, and thigh. Forty-four adults performed sedentary, ambulatory, lifestyle, and exercise activities (14 total, 10 categories) for 3-10 minutes each in a 90-minute semi-structured laboratory protocol. Artificial neural…
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
EEG Classification with a Sequential Decision-Making Method in Motor Imagery BCI.
Liu, Rong; Wang, Yongxuan; Newman, Geoffrey I; Thakor, Nitish V; Ying, Sarah
2017-12-01
To develop subject-specific classifier to recognize mental states fast and reliably is an important issue in brain-computer interfaces (BCI), particularly in practical real-time applications such as wheelchair or neuroprosthetic control. In this paper, a sequential decision-making strategy is explored in conjunction with an optimal wavelet analysis for EEG classification. The subject-specific wavelet parameters based on a grid-search method were first developed to determine evidence accumulative curve for the sequential classifier. Then we proposed a new method to set the two constrained thresholds in the sequential probability ratio test (SPRT) based on the cumulative curve and a desired expected stopping time. As a result, it balanced the decision time of each class, and we term it balanced threshold SPRT (BTSPRT). The properties of the method were illustrated on 14 subjects' recordings from offline and online tests. Results showed the average maximum accuracy of the proposed method to be 83.4% and the average decision time of 2.77[Formula: see text]s, when compared with 79.2% accuracy and a decision time of 3.01[Formula: see text]s for the sequential Bayesian (SB) method. The BTSPRT method not only improves the classification accuracy and decision speed comparing with the other nonsequential or SB methods, but also provides an explicit relationship between stopping time, thresholds and error, which is important for balancing the speed-accuracy tradeoff. These results suggest that BTSPRT would be useful in explicitly adjusting the tradeoff between rapid decision-making and error-free device control.
Gender classification under extended operating conditions
NASA Astrophysics Data System (ADS)
Rude, Howard N.; Rizki, Mateen
2014-06-01
Gender classification is a critical component of a robust image security system. Many techniques exist to perform gender classification using facial features. In contrast, this paper explores gender classification using body features extracted from clothed subjects. Several of the most effective types of features for gender classification identified in literature were implemented and applied to the newly developed Seasonal Weather And Gender (SWAG) dataset. SWAG contains video clips of approximately 2000 samples of human subjects captured over a period of several months. The subjects are wearing casual business attire and outer garments appropriate for the specific weather conditions observed in the Midwest. The results from a series of experiments are presented that compare the classification accuracy of systems that incorporate various types and combinations of features applied to multiple looks at subjects at different image resolutions to determine a baseline performance for gender classification.
Walton, Emily; Casey, Christy; Mitsch, Jurgen; Vázquez-Diosdado, Jorge A; Yan, Juan; Dottorini, Tania; Ellis, Keith A; Winterlich, Anthony; Kaler, Jasmeet
2018-02-01
Automated behavioural classification and identification through sensors has the potential to improve health and welfare of the animals. Position of a sensor, sampling frequency and window size of segmented signal data has a major impact on classification accuracy in activity recognition and energy needs for the sensor, yet, there are no studies in precision livestock farming that have evaluated the effect of all these factors simultaneously. The aim of this study was to evaluate the effects of position (ear and collar), sampling frequency (8, 16 and 32 Hz) of a triaxial accelerometer and gyroscope sensor and window size (3, 5 and 7 s) on the classification of important behaviours in sheep such as lying, standing and walking. Behaviours were classified using a random forest approach with 44 feature characteristics. The best performance for walking, standing and lying classification in sheep (accuracy 95%, F -score 91%-97%) was obtained using combination of 32 Hz, 7 s and 32 Hz, 5 s for both ear and collar sensors, although, results obtained with 16 Hz and 7 s window were comparable with accuracy of 91%-93% and F -score 88%-95%. Energy efficiency was best at a 7 s window. This suggests that sampling at 16 Hz with 7 s window will offer benefits in a real-time behavioural monitoring system for sheep due to reduced energy needs.
Walton, Emily; Casey, Christy; Mitsch, Jurgen; Vázquez-Diosdado, Jorge A.; Yan, Juan; Dottorini, Tania; Ellis, Keith A.; Winterlich, Anthony
2018-01-01
Automated behavioural classification and identification through sensors has the potential to improve health and welfare of the animals. Position of a sensor, sampling frequency and window size of segmented signal data has a major impact on classification accuracy in activity recognition and energy needs for the sensor, yet, there are no studies in precision livestock farming that have evaluated the effect of all these factors simultaneously. The aim of this study was to evaluate the effects of position (ear and collar), sampling frequency (8, 16 and 32 Hz) of a triaxial accelerometer and gyroscope sensor and window size (3, 5 and 7 s) on the classification of important behaviours in sheep such as lying, standing and walking. Behaviours were classified using a random forest approach with 44 feature characteristics. The best performance for walking, standing and lying classification in sheep (accuracy 95%, F-score 91%–97%) was obtained using combination of 32 Hz, 7 s and 32 Hz, 5 s for both ear and collar sensors, although, results obtained with 16 Hz and 7 s window were comparable with accuracy of 91%–93% and F-score 88%–95%. Energy efficiency was best at a 7 s window. This suggests that sampling at 16 Hz with 7 s window will offer benefits in a real-time behavioural monitoring system for sheep due to reduced energy needs. PMID:29515862
Youn, Su Hyun; Sim, Taeyong; Choi, Ahnryul; Song, Jinsung; Shin, Ki Young; Lee, Il Kwon; Heo, Hyun Mu; Lee, Daeweon; Mun, Joung Hwan
2015-06-01
Ultrasonic surgical units (USUs) have the advantage of minimizing tissue damage during surgeries that require tissue dissection by reducing problems such as coagulation and unwanted carbonization, but the disadvantage of requiring manual adjustment of power output according to the target tissue. In order to overcome this limitation, it is necessary to determine the properties of in vivo tissues automatically. We propose a multi-classifier that can accurately classify tissues based on the unique impedance of each tissue. For this purpose, a multi-classifier was built based on single classifiers with high classification rates, and the classification accuracy of the proposed model was compared with that of single classifiers for various electrode types (Type-I: 6 mm invasive; Type-II: 3 mm invasive; Type-III: surface). The sensitivity and positive predictive value (PPV) of the multi-classifier by cross checks were determined. According to the 10-fold cross validation results, the classification accuracy of the proposed model was significantly higher (p<0.05 or <0.01) than that of existing single classifiers for all electrode types. In particular, the classification accuracy of the proposed model was highest when the 3mm invasive electrode (Type-II) was used (sensitivity=97.33-100.00%; PPV=96.71-100.00%). The results of this study are an important contribution to achieving automatic optimal output power adjustment of USUs according to the properties of individual tissues. Copyright © 2015 Elsevier Ltd. All rights reserved.
Texture classification of vegetation cover in high altitude wetlands zone
NASA Astrophysics Data System (ADS)
Wentao, Zou; Bingfang, Wu; Hongbo, Ju; Hua, Liu
2014-03-01
The aim of this study was to investigate the utility of datasets composed of texture measures and other features for the classification of vegetation cover, specifically wetlands. QUEST decision tree classifier was applied to a SPOT-5 image sub-scene covering the typical wetlands area in Three River Sources region in Qinghai province, China. The dataset used for the classification comprised of: (1) spectral data and the components of principal component analysis; (2) texture measures derived from pixel basis; (3) DEM and other ancillary data covering the research area. Image textures is an important characteristic of remote sensing images; it can represent spatial variations with spectral brightness in digital numbers. When the spectral information is not enough to separate the different land covers, the texture information can be used to increase the classification accuracy. The texture measures used in this study were calculated from GLCM (Gray level Co-occurrence Matrix); eight frequently used measures were chosen to conduct the classification procedure. The results showed that variance, mean and entropy calculated by GLCM with a 9*9 size window were effective in distinguishing different vegetation types in wetlands zone. The overall accuracy of this method was 84.19% and the Kappa coefficient was 0.8261. The result indicated that the introduction of texture measures can improve the overall accuracy by 12.05% and the overall kappa coefficient by 0.1407 compared with the result using spectral and ancillary data.
Modeling time-to-event (survival) data using classification tree analysis.
Linden, Ariel; Yarnold, Paul R
2017-12-01
Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Bangs, Corey F.; Kruse, Fred A.; Olsen, Chris R.
2013-05-01
Hyperspectral data were assessed to determine the effect of integrating spectral data and extracted texture feature data on classification accuracy. Four separate spectral ranges (hundreds of spectral bands total) were used from the Visible and Near Infrared (VNIR) and Shortwave Infrared (SWIR) portions of the electromagnetic spectrum. Haralick texture features (contrast, entropy, and correlation) were extracted from the average gray-level image for each of the four spectral ranges studied. A maximum likelihood classifier was trained using a set of ground truth regions of interest (ROIs) and applied separately to the spectral data, texture data, and a fused dataset containing both. Classification accuracy was measured by comparison of results to a separate verification set of test ROIs. Analysis indicates that the spectral range (source of the gray-level image) used to extract the texture feature data has a significant effect on the classification accuracy. This result applies to texture-only classifications as well as the classification of integrated spectral data and texture feature data sets. Overall classification improvement for the integrated data sets was near 1%. Individual improvement for integrated spectral and texture classification of the "Urban" class showed approximately 9% accuracy increase over spectral-only classification. Texture-only classification accuracy was highest for the "Dirt Path" class at approximately 92% for the spectral range from 947 to 1343nm. This research demonstrates the effectiveness of texture feature data for more accurate analysis of hyperspectral data and the importance of selecting the correct spectral range to be used for the gray-level image source to extract these features.
Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L
2017-01-01
Primary progressive aphasia (PPA) encompasses the three subtypes nonfluent/agrammatic variant PPA, semantic variant PPA, and the logopenic variant PPA, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole brain support vector machine classification enabled a very high accuracy between 91 and 97% for identifying specific PPA subtypes vs. healthy controls, and 78/95% for the discrimination between semantic variant vs. nonfluent/agrammatic or logopenic PPA variants. Only for the discrimination between nonfluent/agrammatic and logopenic PPA variants accuracy was low with 55%. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole brain approach took also into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease-specificity of the selected networks. Conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with a very high accuracy paving the road for its application in clinical settings.
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction is often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from the terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification method. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how different scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10 % for both trees. The average speed-up ratio of single scale classifiers over multi-scale classifier for each tree is higher than 30.
PCA based feature reduction to improve the accuracy of decision tree c4.5 classification
NASA Astrophysics Data System (ADS)
Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.
2018-03-01
Splitting attribute is a major process in Decision Tree C4.5 classification. However, this process does not give a significant impact on the establishment of the decision tree in terms of removing irrelevant features. It is a major problem in decision tree classification process called over-fitting resulting from noisy data and irrelevant features. In turns, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and overfitting on classifications Decision Tree C4.5. Feature reduction is one of important issues in classification model which is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework is robust to enhance classification accuracy with 90.70% accuracy rates.
NASA Astrophysics Data System (ADS)
Gutierrez-Velez, V. H.; DeFries, R. S.
2011-12-01
Oil palm expansion has led to clearing of extensive forest areas in the tropics. However quantitative assessments of the magnitude of oil palm expansion to deforestation have been challenging due in large part to the limitations presented by conventional optical data sets for discriminating plantations from forests and other tree cover vegetations. Recently available information from active remote sensors has opened the possibility of using these data sources to overcome these limitations. The purpose of this analysis is to evaluate the accuracy of oil palm classification when using ALOS/PALSAR active satellite data in conjunction with Landsat information, compared to the use of Landsat data only. The analysis takes place in a focused region around the city of Pucallpa in the Ucayali province of the Peruvian Amazon for the year 2010. Oil palm plantations were separated in five categories consisting of four age classes (0-3, 3-5, 5-10 and > 10 yrs) and an additional class accounting for degraded plantations older than 15 yr. Other land covers were water bodies, unvegetated land, short and tall grass, fallow, secondary vegetation, and forest. Classifications were performed using random forests. Training points for calibration and validation consisted of 411 polygons measured in areas representative of the land covers of interest and totaled 6,367 ha. Overall classification accuracy increased from 89.9% using only Landsat data sets to 94.3% using both Landast and ALOS/PALSAR. Both user's and producer's accuracy increased in all classes when using both data sets except for producer's accuracy in short grass which decreased by 1%. The largest increase in user's accuracy was obtained in oil palm plantations older than 10 years from 62 to 80% while producer's accuracy improved the most in plantations in age class 3-5 from 63 to 80%. Results demonstrate the suitability of data from ALOS/PALSAR and other active remote sensors to improve classification of oil palm plantations in age classes and discriminate them from other land covers. Results suggest a potential for improving discrimination of other tree cover types using a combination of active and conventional optical remote sensors.
Multiple Spectral-Spatial Classification Approach for Hyperspectral Data
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.
2010-01-01
A .new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification -driven marker and forms a region in the spectral -spatial classification: map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies, when compared to previously proposed classification techniques.
Kudyakov, Rustam; Bowen, James; Ewen, Edward; West, Suzanne L; Daoud, Yahya; Fleming, Neil; Masica, Andrew
2012-02-01
Use of electronic health record (EHR) content for comparative effectiveness research (CER) and population health management requires significant data configuration. A retrospective cohort study was conducted using patients with diabetes followed longitudinally (N=36,353) in the EHR deployed at outpatient practice networks of 2 health care systems. A data extraction and classification algorithm targeting identification of patients with a new diagnosis of type 2 diabetes mellitus (T2DM) was applied, with the main criterion being a minimum 30-day window between the first visit documented in the EHR and the entry of T2DM on the EHR problem list. Chart reviews (N=144) validated the performance of refining this EHR classification algorithm with external administrative data. Extraction using EHR data alone designated 3205 patients as newly diagnosed with T2DM with classification accuracy of 70.1%. Use of external administrative data on that preselected population improved classification accuracy of cases identified as new T2DM diagnosis (positive predictive value was 91.9% with that step). Laboratory and medication data did not help case classification. The final cohort using this 2-stage classification process comprised 1972 patients with a new diagnosis of T2DM. Data use from current EHR systems for CER and disease management mandates substantial tailoring. Quality between EHR clinical data generated in daily care and that required for population health research varies. As evidenced by this process for classification of newly diagnosed T2DM cases, validation of EHR data with external sources can be a valuable step.
De Nunzio, Cosimo; Pastore, Antonio Luigi; Lombardo, Riccardo; Simone, Giuseppe; Leonardo, Costantino; Mastroianni, Riccardo; Collura, Devis; Muto, Giovanni; Gallucci, Michele; Carbone, Antonio; Fuschi, Andrea; Dutto, Lorenzo; Witt, Joern Heinrich; De Dominicis, Carlo; Tubaro, Andrea
2018-06-01
To evaluate the differences between the old and the new Gleason score classification systems in upgrading and downgrading rates. Between 2012 and 2015, we identified 9703 patients treated with retropubic radical prostatectomy (RP) in four tertiary centers. Biopsy specimens as well as radical prostatectomy specimens were graded according to both 2005 Gleason and 2014 ISUP five-tier Gleason grading system (five-tier GG system). Upgrading and downgrading rates on radical prostatectomy were first recorded for both classifications and then compared. The accuracy of the biopsy for each histological classification was determined by using the kappa coefficient of agreement and by assessing sensitivity, specificity, positive and negative predictive value. The five-tier GG system presented a lower clinically significant upgrading rate (1895/9703: 19,5% vs 2332/9703:24.0%; p = .001) and a similar clinically significant downgrading rate (756/9703: 7,7% vs 779/9703: 8%; p = .267) when compared to the 2005 ISUP classification. When evaluating their accuracy, the new five-tier GG system presented a better specificity (91% vs 83%) and a better negative predictive value (78% vs 60%). The kappa-statistics measures of agreement between needle biopsy and radical prostatectomy specimens were poor and good respectively for the five-tier GG system and for the 2005 Gleason score (k = 0.360 ± 0.007 vs k = 0.426 ± 0.007). The new Epstein classification significantly reduces upgrading events. The implementation of this new classification could better define prostate cancer aggressiveness with important clinical implications, particularly in prostate cancer management. Copyright © 2018 Elsevier Ltd, BASO ~ The Association for Cancer Surgery, and the European Society of Surgical Oncology. All rights reserved.
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex. PMID:27500640
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex.
Multi-level discriminative dictionary learning with application to large scale image classification.
Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua
2015-10-01
The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.
Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan
2015-10-21
The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization, and thus, should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data that are collected from multi-sensor signals. In this method, the optimal feature subset is constructed by a feature subset score criterion. Then the high-dimensional data are visualized in 2-dimension space. According to the UCI dataset test, FSS-t-SNE can effectively improve the classification accuracy. An experiment was performed with a large power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine.
Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan
2015-01-01
The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization, and thus, should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data that are collected from multi-sensor signals. In this method, the optimal feature subset is constructed by a feature subset score criterion. Then the high-dimensional data are visualized in 2-dimension space. According to the UCI dataset test, FSS-t-SNE can effectively improve the classification accuracy. An experiment was performed with a large power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine. PMID:26506347
NASA Astrophysics Data System (ADS)
Raziff, Abdul Rafiez Abdul; Sulaiman, Md Nasir; Mustapha, Norwati; Perumal, Thinagaran
2017-10-01
Gait recognition is widely used in many applications. In the application of the gait identification especially in people, the number of classes (people) is many which may comprise to more than 20. Due to the large amount of classes, the usage of single classification mapping (direct classification) may not be suitable as most of the existing algorithms are mostly designed for the binary classification. Furthermore, having many classes in a dataset may result in the possibility of having a high degree of overlapped class boundary. This paper discusses the application of multiclass classifier mappings such as one-vs-all (OvA), one-vs-one (OvO) and random correction code (RCC) on handheld based smartphone gait signal for person identification. The results is then compared with a single J48 decision tree for benchmark. From the result, it can be said that using multiclass classification mapping method thus partially improved the overall accuracy especially on OvO and RCC with width factor more than 4. For OvA, the accuracy result is worse than a single J48 due to a high number of classes.
Improved classification accuracy by feature extraction using genetic algorithms
NASA Astrophysics Data System (ADS)
Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.
2003-05-01
A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.
Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo
2017-01-01
Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. However, discrimination between visually healthy and infected bunches proved to be challenging for a few samples, perhaps due to colonized berries or sparse mycelia hidden within the bunch or airborne conidia on the berries that were detected by qPCR. An advanced approach to hyperspectral image classification based on combined spatial and spectral image features, potentially applicable to many available hyperspectral sensor technologies, has been developed and validated to improve the detection of powdery mildew infection levels of Chardonnay grape bunches. The spatial-spectral approach improved especially the detection of light infection levels compared with pixel-wise spectral data analysis. This approach is expected to improve the speed and accuracy of disease detection once the thresholds for fungal biomass detected by hyperspectral imaging are established; it can also facilitate monitoring in plant phenotyping of grapevine and additional crops.
NASA Astrophysics Data System (ADS)
Guo, Yiqing; Jia, Xiuping; Paull, David
2018-06-01
The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, a SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is firstly predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with those obtained without the assistance from previous images. These results demonstrate that the leverage of a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.
IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY
Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...
Sharma, Harshita; Zerbe, Norman; Klempert, Iris; Hellwich, Olaf; Hufnagl, Peter
2017-11-01
Deep learning using convolutional neural networks is an actively emerging field in histological image analysis. This study explores deep learning methods for computer-aided classification in H&E stained histopathological whole slide images of gastric carcinoma. An introductory convolutional neural network architecture is proposed for two computerized applications, namely, cancer classification based on immunohistochemical response and necrosis detection based on the existence of tumor necrosis in the tissue. Classification performance of the developed deep learning approach is quantitatively compared with traditional image analysis methods in digital histopathology requiring prior computation of handcrafted features, such as statistical measures using gray level co-occurrence matrix, Gabor filter-bank responses, LBP histograms, gray histograms, HSV histograms and RGB histograms, followed by random forest machine learning. Additionally, the widely known AlexNet deep convolutional framework is comparatively analyzed for the corresponding classification problems. The proposed convolutional neural network architecture reports favorable results, with an overall classification accuracy of 0.6990 for cancer classification and 0.8144 for necrosis detection. Copyright © 2017 Elsevier Ltd. All rights reserved.
Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No
2015-11-01
One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Tahmasian, Masoud; Jamalabadi, Hamidreza; Abedini, Mina; Ghadami, Mohammad R; Sepehry, Amir A; Knight, David C; Khazaie, Habibolah
2017-05-22
Sleep disturbance is common in chronic post-traumatic stress disorder (PTSD). However, prior work has demonstrated that there are inconsistencies between subjective and objective assessments of sleep disturbance in PTSD. Therefore, we investigated whether subjective or objective sleep assessment has greater clinical utility to differentiate PTSD patients from healthy subjects. Further, we evaluated whether the combination of subjective and objective methods improves the accuracy of classification into patient versus healthy groups, which has important diagnostic implications. We recruited 32 chronic war-induced PTSD patients and 32 age- and gender-matched healthy subjects to participate in this study. Subjective (i.e. from three self-reported sleep questionnaires) and objective sleep-related data (i.e. from actigraphy scores) were collected from each participant. Subjective, objective, and combined (subjective and objective) sleep data were then analyzed using support vector machine classification. The classification accuracy, sensitivity, and specificity for subjective variables were 89.2%, 89.3%, and 89%, respectively. The classification accuracy, sensitivity, and specificity for objective variables were 65%, 62.3%, and 67.8%, respectively. The classification accuracy, sensitivity, and specificity for the aggregate variables (combination of subjective and objective variables) were 91.6%, 93.0%, and 90.3%, respectively. Our findings indicate that classification accuracy using subjective measurements is superior to objective measurements and the combination of both assessments appears to improve the classification accuracy for differentiating PTSD patients from healthy individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Breast density characterization using texton distributions.
Petroudi, Styliani; Brady, Michael
2011-01-01
Breast density has been shown to be one of the most significant risks for developing breast cancer, with women with dense breasts at four to six times higher risk. The Breast Imaging Reporting and Data System (BI-RADS) has a four class classification scheme that describes the different breast densities. However, there is great inter and intra observer variability among clinicians in reporting a mammogram's density class. This work presents a novel texture classification method and its application for the development of a completely automated breast density classification system. The new method represents the mammogram using textons, which can be thought of as the building blocks of texture under the operational definition of Leung and Malik as clustered filter responses. The new proposed method characterizes the mammographic appearance of the different density patterns by evaluating the texton spatial dependence matrix (TDSM) in the breast region's corresponding texton map. The TSDM is a texture model that captures both statistical and structural texture characteristics. The normalized TSDM matrices are evaluated for mammograms from the different density classes and corresponding texture models are established. Classification is achieved using a chi-square distance measure. The fully automated TSDM breast density classification method is quantitatively evaluated on mammograms from all density classes from the Oxford Mammogram Database. The incorporation of texton spatial dependencies allows for classification accuracy reaching over 82%. The breast density classification accuracy is better using texton TSDM compared to simple texton histograms.
NASA Astrophysics Data System (ADS)
Erener, A.
2013-04-01
Automatic extraction of urban features from high resolution satellite images is one of the main applications in remote sensing. It is useful for wide scale applications, namely: urban planning, urban mapping, disaster management, GIS (geographic information systems) updating, and military target detection. One common approach to detecting urban features from high resolution images is to use automatic classification methods. This paper has four main objectives with respect to detecting buildings. The first objective is to compare the performance of the most notable supervised classification algorithms, including the maximum likelihood classifier (MLC) and the support vector machine (SVM). In this experiment the primary consideration is the impact of kernel configuration on the performance of the SVM. The second objective of the study is to explore the suitability of integrating additional bands, namely first principal component (1st PC) and the intensity image, for original data for multi classification approaches. The performance evaluation of classification results is done using two different accuracy assessment methods: pixel based and object based approaches, which reflect the third aim of the study. The objective here is to demonstrate the differences in the evaluation of accuracies of classification methods. Considering consistency, the same set of ground truth data which is produced by labeling the building boundaries in the GIS environment is used for accuracy assessment. Lastly, the fourth aim is to experimentally evaluate variation in the accuracy of classifiers for six different real situations in order to identify the impact of spatial and spectral diversity on results. The method is applied to Quickbird images for various urban complexity levels, extending from simple to complex urban patterns. The simple surface type includes a regular urban area with low density and systematic buildings with brick rooftops. The complex surface type involves almost all kinds of challenges, such as high dense build up areas, regions with bare soil, and small and large buildings with different rooftops, such as concrete, brick, and metal. Using the pixel based accuracy assessment it was shown that the percent building detection (PBD) and quality percent (QP) of the MLC and SVM depend on the complexity and texture variation of the region. Generally, PBD values range between 70% and 90% for the MLC and SVM, respectively. No substantial improvements were observed when the SVM and MLC classifications were developed by the addition of more variables, instead of the use of only four bands. In the evaluation of object based accuracy assessment, it was demonstrated that while MLC and SVM provide higher rates of correct detection, they also provide higher rates of false alarms.
NASA Astrophysics Data System (ADS)
Liu, Yansong; Monteiro, Sildomar T.; Saber, Eli
2015-10-01
Changes in vegetation cover, building construction, road network and traffic conditions caused by urban expansion affect the human habitat as well as the natural environment in rapidly developing cities. It is crucial to assess these changes and respond accordingly by identifying man-made and natural structures with accurate classification algorithms. With the increase in use of multi-sensor remote sensing systems, researchers are able to obtain a more complete description of the scene of interest. By utilizing multi-sensor data, the accuracy of classification algorithms can be improved. In this paper, we propose a method for combining 3D LiDAR point clouds and high-resolution color images to classify urban areas using Gaussian processes (GP). GP classification is a powerful non-parametric classification method that yields probabilistic classification results. It makes predictions in a way that addresses the uncertainty of real world. In this paper, we attempt to identify man-made and natural objects in urban areas including buildings, roads, trees, grass, water and vehicles. LiDAR features are derived from the 3D point clouds and the spatial and color features are extracted from RGB images. For classification, we use the Laplacian approximation for GP binary classification on the new combined feature space. The multiclass classification has been implemented by using one-vs-all binary classification strategy. The result of applying support vector machines (SVMs) and logistic regression (LR) classifier is also provided for comparison. Our experiments show a clear improvement of classification results by using the two sensors combined instead of each sensor separately. Also we found the advantage of applying GP approach to handle the uncertainty in classification result without compromising accuracy compared to SVM, which is considered as the state-of-the-art classification method.
Shi, Rong; Schraedley-Desmond, Pamela; Napel, Sandy; Olcott, Eric W; Jeffrey, R Brooke; Yee, Judy; Zalis, Michael E; Margolis, Daniel; Paik, David S; Sherbondy, Anthony J; Sundaram, Padmavathi; Beaulieu, Christopher F
2006-06-01
To retrospectively determine if three-dimensional (3D) viewing improves radiologists' accuracy in classifying true-positive (TP) and false-positive (FP) polyp candidates identified with computer-aided detection (CAD) and to determine candidate polyp features that are associated with classification accuracy, with known polyps serving as the reference standard. Institutional review board approval and informed consent were obtained; this study was HIPAA compliant. Forty-seven computed tomographic (CT) colonography data sets were obtained in 26 men and 10 women (age range, 42-76 years). Four radiologists classified 705 polyp candidates (53 TP candidates, 652 FP candidates) identified with CAD; initially, only two-dimensional images were used, but these were later supplemented with 3D rendering. Another radiologist unblinded to colonoscopy findings characterized the features of each candidate, assessed colon distention and preparation, and defined the true nature of FP candidates. Receiver operating characteristic curves were used to compare readers' performance, and repeated-measures analysis of variance was used to test features that affect interpretation. Use of 3D viewing improved classification accuracy for three readers and increased the area under the receiver operating characteristic curve to 0.96-0.97 (P<.001). For TP candidates, maximum polyp width (P=.038), polyp height (P=.019), and preparation (P=.004) significantly affected accuracy. For FP candidates, colonic segment (P=.007), attenuation (P<.001), surface smoothness (P<.001), distention (P=.034), preparation (P<.001), and true nature of candidate lesions (P<.001) significantly affected accuracy. Use of 3D viewing increases reader accuracy in the classification of polyp candidates identified with CAD. Polyp size and examination quality are significantly associated with accuracy. Copyright (c) RSNA, 2006.
Cisler, Josh M.; Bush, Keith; James, G. Andrew; Smitherman, Sonet; Kilts, Clinton D.
2015-01-01
Posttraumatic Stress Disorder (PTSD) is characterized by intrusive recall of the traumatic memory. While numerous studies have investigated the neural processing mechanisms engaged during trauma memory recall in PTSD, these analyses have only focused on group-level contrasts that reveal little about the predictive validity of the identified brain regions. By contrast, a multivariate pattern analysis (MVPA) approach towards identifying the neural mechanisms engaged during trauma memory recall would entail testing whether a multivariate set of brain regions is reliably predictive of (i.e., discriminates) whether an individual is engaging in trauma or non-trauma memory recall. Here, we use a MVPA approach to test 1) whether trauma memory vs neutral memory recall can be predicted reliably using a multivariate set of brain regions among women with PTSD related to assaultive violence exposure (N=16), 2) the methodological parameters (e.g., spatial smoothing, number of memory recall repetitions, etc.) that optimize classification accuracy and reproducibility of the feature weight spatial maps, and 3) the correspondence between brain regions that discriminate trauma memory recall and the brain regions predicted by neurocircuitry models of PTSD. Cross-validation classification accuracy was significantly above chance for all methodological permutations tested; mean accuracy across participants was 76% for the methodological parameters selected as optimal for both efficiency and accuracy. Classification accuracy was significantly better for a voxel-wise approach relative to voxels within restricted regions-of-interest (ROIs); classification accuracy did not differ when using PTSD-related ROIs compared to randomly generated ROIs. ROI-based analyses suggested the reliable involvement of the left hippocampus in discriminating memory recall across participants and that the contribution of the left amygdala to the decision function was dependent upon PTSD symptom severity. These results have methodological implications for real-time fMRI neurofeedback of the trauma memory in PTSD and conceptual implications for neurocircuitry models of PTSD that attempt to explain core neural processing mechanisms mediating PTSD. PMID:26241958
Cisler, Josh M; Bush, Keith; James, G Andrew; Smitherman, Sonet; Kilts, Clinton D
2015-01-01
Posttraumatic Stress Disorder (PTSD) is characterized by intrusive recall of the traumatic memory. While numerous studies have investigated the neural processing mechanisms engaged during trauma memory recall in PTSD, these analyses have only focused on group-level contrasts that reveal little about the predictive validity of the identified brain regions. By contrast, a multivariate pattern analysis (MVPA) approach towards identifying the neural mechanisms engaged during trauma memory recall would entail testing whether a multivariate set of brain regions is reliably predictive of (i.e., discriminates) whether an individual is engaging in trauma or non-trauma memory recall. Here, we use a MVPA approach to test 1) whether trauma memory vs neutral memory recall can be predicted reliably using a multivariate set of brain regions among women with PTSD related to assaultive violence exposure (N=16), 2) the methodological parameters (e.g., spatial smoothing, number of memory recall repetitions, etc.) that optimize classification accuracy and reproducibility of the feature weight spatial maps, and 3) the correspondence between brain regions that discriminate trauma memory recall and the brain regions predicted by neurocircuitry models of PTSD. Cross-validation classification accuracy was significantly above chance for all methodological permutations tested; mean accuracy across participants was 76% for the methodological parameters selected as optimal for both efficiency and accuracy. Classification accuracy was significantly better for a voxel-wise approach relative to voxels within restricted regions-of-interest (ROIs); classification accuracy did not differ when using PTSD-related ROIs compared to randomly generated ROIs. ROI-based analyses suggested the reliable involvement of the left hippocampus in discriminating memory recall across participants and that the contribution of the left amygdala to the decision function was dependent upon PTSD symptom severity. These results have methodological implications for real-time fMRI neurofeedback of the trauma memory in PTSD and conceptual implications for neurocircuitry models of PTSD that attempt to explain core neural processing mechanisms mediating PTSD.
A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.
Sarrouti, Mourad; Ouatik El Alaoui, Said
2017-05-18
Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which consists of assigning a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of the four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary. In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features for machine-learning algorithms. Finally, the class label is predicted using the trained classifiers. Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significant improved performance when compared to four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as features' provider of support vector machine (SVM) lead to the highest accuracy of 89.40 %. The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.
A Survey on the Feasibility of Sound Classification on Wireless Sensor Nodes
Salomons, Etto L.; Havinga, Paul J. M.
2015-01-01
Wireless sensor networks are suitable to gain context awareness for indoor environments. As sound waves form a rich source of context information, equipping the nodes with microphones can be of great benefit. The algorithms to extract features from sound waves are often highly computationally intensive. This can be problematic as wireless nodes are usually restricted in resources. In order to be able to make a proper decision about which features to use, we survey how sound is used in the literature for global sound classification, age and gender classification, emotion recognition, person verification and identification and indoor and outdoor environmental sound classification. The results of the surveyed algorithms are compared with respect to accuracy and computational load. The accuracies are taken from the surveyed papers; the computational loads are determined by benchmarking the algorithms on an actual sensor node. We conclude that for indoor context awareness, the low-cost algorithms for feature extraction perform equally well as the more computationally-intensive variants. As the feature extraction still requires a large amount of processing time, we present four possible strategies to deal with this problem. PMID:25822142
McRoy, Susan; Jones, Sean; Kurmally, Adam
2016-09-01
This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions and typical refinement approaches of feature selection and merging categories achieved only small improvements to classifier accuracy. Best performance was achieved using weighted classification and resampling methods, the latter yielding an accuracy of F1 = 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or assisting clinical professionals to answer questions. © The Author(s) 2015.
Identification and classification of similar looking food grains
NASA Astrophysics Data System (ADS)
Anami, B. S.; Biradar, Sunanda D.; Savakar, D. G.; Kulkarni, P. V.
2013-01-01
This paper describes the comparative study of Artificial Neural Network (ANN) and Support Vector Machine (SVM) classifiers by taking a case study of identification and classification of four pairs of similar looking food grains namely, Finger Millet, Mustard, Soyabean, Pigeon Pea, Aniseed, Cumin-seeds, Split Greengram and Split Blackgram. Algorithms are developed to acquire and process color images of these grains samples. The developed algorithms are used to extract 18 colors-Hue Saturation Value (HSV), and 42 wavelet based texture features. Back Propagation Neural Network (BPNN)-based classifier is designed using three feature sets namely color - HSV, wavelet-texture and their combined model. SVM model for color- HSV model is designed for the same set of samples. The classification accuracies ranging from 93% to 96% for color-HSV, ranging from 78% to 94% for wavelet texture model and from 92% to 97% for combined model are obtained for ANN based models. The classification accuracy ranging from 80% to 90% is obtained for color-HSV based SVM model. Training time required for the SVM based model is substantially lesser than ANN for the same set of images.
NASA Astrophysics Data System (ADS)
Adjorlolo, Clement; Mutanga, Onisimo; Cho, Moses A.; Ismail, Riyad
2013-04-01
In this paper, a user-defined inter-band correlation filter function was used to resample hyperspectral data and thereby mitigate the problem of multicollinearity in classification analysis. The proposed resampling technique convolves the spectral dependence information between a chosen band-centre and its shorter and longer wavelength neighbours. Weighting threshold of inter-band correlation (WTC, Pearson's r) was calculated, whereby r = 1 at the band-centre. Various WTC (r = 0.99, r = 0.95 and r = 0.90) were assessed, and bands with coefficients beyond a chosen threshold were assigned r = 0. The resultant data were used in the random forest analysis to classify in situ C3 and C4 grass canopy reflectance. The respective WTC datasets yielded improved classification accuracies (kappa = 0.82, 0.79 and 0.76) with less correlated wavebands when compared to resampled Hyperion bands (kappa = 0.76). Overall, the results obtained from this study suggested that resampling of hyperspectral data should account for the spectral dependence information to improve overall classification accuracy as well as reducing the problem of multicollinearity.
Clemans, Katherine H; Musci, Rashelle J; Leoutsakos, Jeannie-Marie S; Ialongo, Nicholas S
2014-04-01
This study compared the ability of teacher, parent, and peer reports of aggressive behavior in early childhood to accurately classify cases of maladaptive outcomes in late adolescence and early adulthood. Weighted kappa analyses determined optimal cut points and relative classification accuracy among teacher, parent, and peer reports of aggression assessed for 691 students (54% male; 84% African American and 13% White) in the fall of first grade. Outcomes included antisocial personality, substance use, incarceration history, risky sexual behavior, and failure to graduate from high school on time. Peer reports were the most accurate classifier of all outcomes in the full sample. For most outcomes, the addition of teacher or parent reports did not improve overall classification accuracy once peer reports were accounted for. Additional gender-specific and adjusted kappa analyses supported the superior classification utility of the peer report measure. The results suggest that peer reports provided the most useful classification information of the 3 aggression measures. Implications for targeted intervention efforts in which screening measures are used to identify at-risk children are discussed.
Global Optimization Ensemble Model for Classification Methods
Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab
2014-01-01
Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382
Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech☆
Cao, Houwei; Verma, Ragini; Nenkova, Ani
2014-01-01
We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional approaches. We train ranking SVMs for individual emotions, treating the data from each speaker as a separate query, and combine the predictions from all rankers to perform multi-class prediction. The ranking method provides two natural benefits. It captures speaker specific information even in speaker-independent training/testing conditions. It also incorporates the intuition that each utterance can express a mix of possible emotion and that considering the degree to which each emotion is expressed can be productively exploited to identify the dominant emotion. We compare the performance of the rankers and their combination to standard SVM classification approaches on two publicly available datasets of acted emotional speech, Berlin and LDC, as well as on spontaneous emotional data from the FAU Aibo dataset. On acted data, ranking approaches exhibit significantly better performance compared to SVM classification both in distinguishing a specific emotion from all others and in multi-class prediction. On the spontaneous data, which contains mostly neutral utterances with a relatively small portion of less intense emotional utterances, ranking-based classifiers again achieve much higher precision in identifying emotional utterances than conventional SVM classifiers. In addition, we discuss the complementarity of conventional SVM and ranking-based classifiers. On all three datasets we find dramatically higher accuracy for the test items on whose prediction the two methods agree compared to the accuracy of individual methods. Furthermore on the spontaneous data the ranking and standard classification are complementary and we obtain marked improvement when we combine the two classifiers by late-stage fusion. PMID:25422534
Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech☆
Cao, Houwei; Verma, Ragini; Nenkova, Ani
2015-01-01
We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional approaches. We train ranking SVMs for individual emotions, treating the data from each speaker as a separate query, and combine the predictions from all rankers to perform multi-class prediction. The ranking method provides two natural benefits. It captures speaker specific information even in speaker-independent training/testing conditions. It also incorporates the intuition that each utterance can express a mix of possible emotion and that considering the degree to which each emotion is expressed can be productively exploited to identify the dominant emotion. We compare the performance of the rankers and their combination to standard SVM classification approaches on two publicly available datasets of acted emotional speech, Berlin and LDC, as well as on spontaneous emotional data from the FAU Aibo dataset. On acted data, ranking approaches exhibit significantly better performance compared to SVM classification both in distinguishing a specific emotion from all others and in multi-class prediction. On the spontaneous data, which contains mostly neutral utterances with a relatively small portion of less intense emotional utterances, ranking-based classifiers again achieve much higher precision in identifying emotional utterances than conventional SVM classifiers. In addition, we discuss the complementarity of conventional SVM and ranking-based classifiers. On all three datasets we find dramatically higher accuracy for the test items on whose prediction the two methods agree compared to the accuracy of individual methods. Furthermore on the spontaneous data the ranking and standard classification are complementary and we obtain marked improvement when we combine the two classifiers by late-stage fusion.
Devaney, John; Barrett, Brian; Barrett, Frank; Redmond, John; O Halloran, John
2015-01-01
Quantification of spatial and temporal changes in forest cover is an essential component of forest monitoring programs. Due to its cloud free capability, Synthetic Aperture Radar (SAR) is an ideal source of information on forest dynamics in countries with near-constant cloud-cover. However, few studies have investigated the use of SAR for forest cover estimation in landscapes with highly sparse and fragmented forest cover. In this study, the potential use of L-band SAR for forest cover estimation in two regions (Longford and Sligo) in Ireland is investigated and compared to forest cover estimates derived from three national (Forestry2010, Prime2, National Forest Inventory), one pan-European (Forest Map 2006) and one global forest cover (Global Forest Change) product. Two machine-learning approaches (Random Forests and Extremely Randomised Trees) are evaluated. Both Random Forests and Extremely Randomised Trees classification accuracies were high (98.1-98.5%), with differences between the two classifiers being minimal (<0.5%). Increasing levels of post classification filtering led to a decrease in estimated forest area and an increase in overall accuracy of SAR-derived forest cover maps. All forest cover products were evaluated using an independent validation dataset. For the Longford region, the highest overall accuracy was recorded with the Forestry2010 dataset (97.42%) whereas in Sligo, highest overall accuracy was obtained for the Prime2 dataset (97.43%), although accuracies of SAR-derived forest maps were comparable. Our findings indicate that spaceborne radar could aid inventories in regions with low levels of forest cover in fragmented landscapes. The reduced accuracies observed for the global and pan-continental forest cover maps in comparison to national and SAR-derived forest maps indicate that caution should be exercised when applying these datasets for national reporting.
Devaney, John; Barrett, Brian; Barrett, Frank; Redmond, John; O`Halloran, John
2015-01-01
Quantification of spatial and temporal changes in forest cover is an essential component of forest monitoring programs. Due to its cloud free capability, Synthetic Aperture Radar (SAR) is an ideal source of information on forest dynamics in countries with near-constant cloud-cover. However, few studies have investigated the use of SAR for forest cover estimation in landscapes with highly sparse and fragmented forest cover. In this study, the potential use of L-band SAR for forest cover estimation in two regions (Longford and Sligo) in Ireland is investigated and compared to forest cover estimates derived from three national (Forestry2010, Prime2, National Forest Inventory), one pan-European (Forest Map 2006) and one global forest cover (Global Forest Change) product. Two machine-learning approaches (Random Forests and Extremely Randomised Trees) are evaluated. Both Random Forests and Extremely Randomised Trees classification accuracies were high (98.1–98.5%), with differences between the two classifiers being minimal (<0.5%). Increasing levels of post classification filtering led to a decrease in estimated forest area and an increase in overall accuracy of SAR-derived forest cover maps. All forest cover products were evaluated using an independent validation dataset. For the Longford region, the highest overall accuracy was recorded with the Forestry2010 dataset (97.42%) whereas in Sligo, highest overall accuracy was obtained for the Prime2 dataset (97.43%), although accuracies of SAR-derived forest maps were comparable. Our findings indicate that spaceborne radar could aid inventories in regions with low levels of forest cover in fragmented landscapes. The reduced accuracies observed for the global and pan-continental forest cover maps in comparison to national and SAR-derived forest maps indicate that caution should be exercised when applying these datasets for national reporting. PMID:26262681
NASA Astrophysics Data System (ADS)
Liu, Tao; Abd-Elrahman, Amr
2018-05-01
Deep convolutional neural network (DCNN) requires massive training datasets to trigger its image classification power, while collecting training samples for remote sensing application is usually an expensive process. When DCNN is simply implemented with traditional object-based image analysis (OBIA) for classification of Unmanned Aerial systems (UAS) orthoimage, its power may be undermined if the number training samples is relatively small. This research aims to develop a novel OBIA classification approach that can take advantage of DCNN by enriching the training dataset automatically using multi-view data. Specifically, this study introduces a Multi-View Object-based classification using Deep convolutional neural network (MODe) method to process UAS images for land cover classification. MODe conducts the classification on multi-view UAS images instead of directly on the orthoimage, and gets the final results via a voting procedure. 10-fold cross validation results show the mean overall classification accuracy increasing substantially from 65.32%, when DCNN was applied on the orthoimage to 82.08% achieved when MODe was implemented. This study also compared the performances of the support vector machine (SVM) and random forest (RF) classifiers with DCNN under traditional OBIA and the proposed multi-view OBIA frameworks. The results indicate that the advantage of DCNN over traditional classifiers in terms of accuracy is more obvious when these classifiers were applied with the proposed multi-view OBIA framework than when these classifiers were applied within the traditional OBIA framework.
Feature Selection Methods for Zero-Shot Learning of Neural Activity.
Caceres, Carlos A; Roos, Matthew J; Rupp, Kyle M; Milsap, Griffin; Crone, Nathan E; Wolmetz, Michael E; Ratto, Christopher R
2017-01-01
Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.
NASA Astrophysics Data System (ADS)
Klomp, Sander; van der Sommen, Fons; Swager, Anne-Fré; Zinger, Svitlana; Schoon, Erik J.; Curvers, Wouter L.; Bergman, Jacques J.; de With, Peter H. N.
2017-03-01
Volumetric Laser Endomicroscopy (VLE) is a promising technique for the detection of early neoplasia in Barrett's Esophagus (BE). VLE generates hundreds of high resolution, grayscale, cross-sectional images of the esophagus. However, at present, classifying these images is a time consuming and cumbersome effort performed by an expert using a clinical prediction model. This paper explores the feasibility of using computer vision techniques to accurately predict the presence of dysplastic tissue in VLE BE images. Our contribution is threefold. First, a benchmarking is performed for widely applied machine learning techniques and feature extraction methods. Second, three new features based on the clinical detection model are proposed, having superior classification accuracy and speed, compared to earlier work. Third, we evaluate automated parameter tuning by applying simple grid search and feature selection methods. The results are evaluated on a clinically validated dataset of 30 dysplastic and 30 non-dysplastic VLE images. Optimal classification accuracy is obtained by applying a support vector machine and using our modified Haralick features and optimal image cropping, obtaining an area under the receiver operating characteristic of 0.95 compared to the clinical prediction model at 0.81. Optimal execution time is achieved using a proposed mean and median feature, which is extracted at least factor 2.5 faster than alternative features with comparable performance.
NASA Astrophysics Data System (ADS)
Tamimi, E.; Ebadi, H.; Kiani, A.
2017-09-01
Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using other features will lead to improve accuracy. By adding these features, the presence probability of dependent features will be increased, which leads to accuracy reduction. In addition, some parameters should be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. Optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges such as producing salt-paper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are relatively high automation level, independency of image scene and type, post processing reduction for building edge reconstruction and accuracy improvement. The proposed method was evaluated by pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, in the proposed method, Kappa coefficient was improved by 6% rather than RF classification. Time processing of the proposed method was relatively low because of unit of image analysis (image object). These showed the superiority of the proposed method in terms of time and accuracy.
Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Jamaluddin; Siringoringo, Rimbun
2017-12-01
Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification issues. The main drawbackof FKNN is that it is difficult to determine the parameters. These parameters are the number of neighbors (k) and fuzzy strength (m). Both parameters are very sensitive. This makes it difficult to determine the values of ‘m’ and ‘k’, thus making FKNN difficult to control because no theories or guides can deduce how proper ‘m’ and ‘k’ should be. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best value of ‘k’ and ‘m’. MPSO is focused on the Constriction Factor Method. Constriction Factor Method is an improvement of PSO in order to avoid local circumstances optima. The model proposed in this study was tested on the German Credit Dataset. The test of the data/The data test has been standardized by UCI Machine Learning Repository which is widely applied to classification problems. The application of MPSO to the determination of FKNN parameters is expected to increase the value of classification performance. Based on the experiments that have been done indicating that the model offered in this research results in a better classification performance compared to the Fk-NN model only. The model offered in this study has an accuracy rate of 81%, while. With using Fk-NN model, it has the accuracy of 70%. At the end is done comparison of research model superiority with 2 other classification models;such as Naive Bayes and Decision Tree. This research model has a better performance level, where Naive Bayes has accuracy 75%, and the decision tree model has 70%
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification.
Mathieu, Renaud; Aryal, Jagannath; Chong, Albert K
2007-11-20
Effective assessment of biodiversity in cities requires detailed vegetation maps.To date, most remote sensing of urban vegetation has focused on thematically coarse landcover products. Detailed habitat maps are created by manual interpretation of aerialphotographs, but this is time consuming and costly at large scale. To address this issue, wetested the effectiveness of object-based classifications that use automated imagesegmentation to extract meaningful ground features from imagery. We applied thesetechniques to very high resolution multispectral Ikonos images to produce vegetationcommunity maps in Dunedin City, New Zealand. An Ikonos image was orthorectified and amulti-scale segmentation algorithm used to produce a hierarchical network of image objects.The upper level included four coarse strata: industrial/commercial (commercial buildings),residential (houses and backyard private gardens), vegetation (vegetation patches larger than0.8/1ha), and water. We focused on the vegetation stratum that was segmented at moredetailed level to extract and classify fifteen classes of vegetation communities. The firstclassification yielded a moderate overall classification accuracy (64%, κ = 0.52), which ledus to consider a simplified classification with ten vegetation classes. The overallclassification accuracy from the simplified classification was 77% with a κ value close tothe excellent range (κ = 0.74). These results compared favourably with similar studies inother environments. We conclude that this approach does not provide maps as detailed as those produced by manually interpreting aerial photographs, but it can still extract ecologically significant classes. It is an efficient way to generate accurate and detailed maps in significantly shorter time. The final map accuracy could be improved by integrating segmentation, automated and manual classification in the mapping process, especially when considering important vegetation classes with limited spectral contrast.
Identification of phenological stages and vegetative types for land use classification
NASA Technical Reports Server (NTRS)
Mckendrick, J. D. (Principal Investigator)
1973-01-01
The author has identified the following significant results. Classification of digital data for mapping Alaskan vegetation has been compared to ground truth data and found to have accuracies as high as 90%. These classifications are broad scale types as are currently being used on the Major Ecosystems of Alaska map prepared by the Joint Federal-State Land Use Planning Commission for Alaska. Cost estimates for several options using the ERTS-1 digital data to map the Alaskan land mass at the 1:250,000 scale ranged between $2.17 to $1.49 per square mile.
Murat, Miraemiliana; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai
2017-01-01
Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson’s coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99.89% for the Swedish Leaf dataset. In addition, the Relief feature selection method achieved the highest classification accuracy of 98.13% after 80 (or 60%) of the original features were reduced, from 133 to 53 descriptors in the myDAUN dataset with the reduction in computational time. Subsequently, the hybridisation of four descriptors gave the best results compared to others. It is proven that the combination MSD and HOG were good enough for tropical shrubs species classification. Hu and ZM descriptors also improved the accuracy in tropical shrubs species classification in terms of invariant to translation, rotation and scale. ANN outperformed the others for tropical shrub species classification in this study. Feature selection methods can be used in the classification of tropical shrub species, as the comparable results could be obtained with the reduced descriptors and reduced in computational time and cost. PMID:28924506
Murat, Miraemiliana; Chang, Siow-Wee; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai
2017-01-01
Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson's coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99.89% for the Swedish Leaf dataset. In addition, the Relief feature selection method achieved the highest classification accuracy of 98.13% after 80 (or 60%) of the original features were reduced, from 133 to 53 descriptors in the myDAUN dataset with the reduction in computational time. Subsequently, the hybridisation of four descriptors gave the best results compared to others. It is proven that the combination MSD and HOG were good enough for tropical shrubs species classification. Hu and ZM descriptors also improved the accuracy in tropical shrubs species classification in terms of invariant to translation, rotation and scale. ANN outperformed the others for tropical shrub species classification in this study. Feature selection methods can be used in the classification of tropical shrub species, as the comparable results could be obtained with the reduced descriptors and reduced in computational time and cost.
Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok
2016-12-05
High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.
Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin
2017-04-01
As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depend on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the Chronic Kidney Disease, two essential types of feature selection methods namely, wrapper and filter approaches were chosen to reduce the dimension of Chronic Kidney Disease dataset. In wrapper approach, classifier subset evaluator with greedy stepwise search engine and wrapper subset evaluator with the Best First search engine were used. In filter approach, correlation feature selection subset evaluator with greedy stepwise search engine and filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier by using filtered subset evaluator with the Best First search engine feature selection method has higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to other selected methods.
User-Independent Motion State Recognition Using Smartphone Sensors
Gu, Fuqiang; Kealy, Allison; Khoshelham, Kourosh; Shang, Jianga
2015-01-01
The recognition of locomotion activities (e.g., walking, running, still) is important for a wide range of applications like indoor positioning, navigation, location-based services, and health monitoring. Recently, there has been a growing interest in activity recognition using accelerometer data. However, when utilizing only acceleration-based features, it is difficult to differentiate varying vertical motion states from horizontal motion states especially when conducting user-independent classification. In this paper, we also make use of the newly emerging barometer built in modern smartphones, and propose a novel feature called pressure derivative from the barometer readings for user motion state recognition, which is proven to be effective for distinguishing vertical motion states and does not depend on specific users’ data. Seven types of motion states are defined and six commonly-used classifiers are compared. In addition, we utilize the motion state history and the characteristics of people’s motion to improve the classification accuracies of those classifiers. Experimental results show that by using the historical information and human’s motion characteristics, we can achieve user-independent motion state classification with an accuracy of up to 90.7%. In addition, we analyze the influence of the window size and smartphone pose on the accuracy. PMID:26690163
User-Independent Motion State Recognition Using Smartphone Sensors.
Gu, Fuqiang; Kealy, Allison; Khoshelham, Kourosh; Shang, Jianga
2015-12-04
The recognition of locomotion activities (e.g., walking, running, still) is important for a wide range of applications like indoor positioning, navigation, location-based services, and health monitoring. Recently, there has been a growing interest in activity recognition using accelerometer data. However, when utilizing only acceleration-based features, it is difficult to differentiate varying vertical motion states from horizontal motion states especially when conducting user-independent classification. In this paper, we also make use of the newly emerging barometer built in modern smartphones, and propose a novel feature called pressure derivative from the barometer readings for user motion state recognition, which is proven to be effective for distinguishing vertical motion states and does not depend on specific users' data. Seven types of motion states are defined and six commonly-used classifiers are compared. In addition, we utilize the motion state history and the characteristics of people's motion to improve the classification accuracies of those classifiers. Experimental results show that by using the historical information and human's motion characteristics, we can achieve user-independent motion state classification with an accuracy of up to 90.7%. In addition, we analyze the influence of the window size and smartphone pose on the accuracy.
Analysis of miRNA expression profile based on SVM algorithm
NASA Astrophysics Data System (ADS)
Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian
2018-05-01
Based on mirna expression spectrum data set, a new data mining algorithm - tSVM - KNN (t statistic with support vector machine - k nearest neighbor) is proposed. the idea of the algorithm is: firstly, the feature selection of the data set is carried out by the unified measurement method; Secondly, SVM - KNN algorithm, which combines support vector machine (SVM) and k - nearest neighbor (k - nearest neighbor) is used as classifier. Simulation results show that SVM - KNN algorithm has better classification ability than SVM and KNN alone. Tsvm - KNN algorithm only needs 5 mirnas to obtain 96.08 % classification accuracy in terms of the number of mirna " tags" and recognition accuracy. compared with similar algorithms, tsvm - KNN algorithm has obvious advantages.
LBP and SIFT based facial expression recognition
NASA Astrophysics Data System (ADS)
Sumer, Omer; Gunes, Ece O.
2015-02-01
This study compares the performance of local binary patterns (LBP) and scale invariant feature transform (SIFT) with support vector machines (SVM) in automatic classification of discrete facial expressions. Facial expression recognition is a multiclass classification problem and seven classes; happiness, anger, sadness, disgust, surprise, fear and comtempt are classified. Using SIFT feature vectors and linear SVM, 93.1% mean accuracy is acquired on CK+ database. On the other hand, the performance of LBP-based classifier with linear SVM is reported on SFEW using strictly person independent (SPI) protocol. Seven-class mean accuracy on SFEW is 59.76%. Experiments on both databases showed that LBP features can be used in a fairly descriptive way if a good localization of facial points and partitioning strategy are followed.
Al Ajmi, Eiman; Forghani, Behzad; Reinhold, Caroline; Bayat, Maryam; Forghani, Reza
2018-06-01
There is a rich amount of quantitative information in spectral datasets generated from dual-energy CT (DECT). In this study, we compare the performance of texture analysis performed on multi-energy datasets to that of virtual monochromatic images (VMIs) at 65 keV only, using classification of the two most common benign parotid neoplasms as a testing paradigm. Forty-two patients with pathologically proven Warthin tumour (n = 25) or pleomorphic adenoma (n = 17) were evaluated. Texture analysis was performed on VMIs ranging from 40 to 140 keV in 5-keV increments (multi-energy analysis) or 65-keV VMIs only, which is typically considered equivalent to single-energy CT. Random forest (RF) models were constructed for outcome prediction using separate randomly selected training and testing sets or the entire patient set. Using multi-energy texture analysis, tumour classification in the independent testing set had accuracy, sensitivity, specificity, positive predictive value, and negative predictive value of 92%, 86%, 100%, 100%, and 83%, compared to 75%, 57%, 100%, 100%, and 63%, respectively, for single-energy analysis. Multi-energy texture analysis demonstrates superior performance compared to single-energy texture analysis of VMIs at 65 keV for classification of benign parotid tumours. • We present and validate a paradigm for texture analysis of DECT scans. • Multi-energy dataset texture analysis is superior to single-energy dataset texture analysis. • DECT texture analysis has high accura\\cy for diagnosis of benign parotid tumours. • DECT texture analysis with machine learning can enhance non-invasive diagnostic tumour evaluation.
Classification of teeth in cone-beam CT using deep convolutional neural network.
Miki, Yuma; Muramatsu, Chisako; Hayashi, Tatsuro; Zhou, Xiangrong; Hara, Takeshi; Katsumata, Akitoshi; Fujita, Hiroshi
2017-01-01
Dental records play an important role in forensic identification. To this end, postmortem dental findings and teeth conditions are recorded in a dental chart and compared with those of antemortem records. However, most dentists are inexperienced at recording the dental chart for corpses, and it is a physically and mentally laborious task, especially in large scale disasters. Our goal is to automate the dental filing process by using dental x-ray images. In this study, we investigated the application of a deep convolutional neural network (DCNN) for classifying tooth types on dental cone-beam computed tomography (CT) images. Regions of interest (ROIs) including single teeth were extracted from CT slices. Fifty two CT volumes were randomly divided into 42 training and 10 test cases, and the ROIs obtained from the training cases were used for training the DCNN. For examining the sampling effect, random sampling was performed 3 times, and training and testing were repeated. We used the AlexNet network architecture provided in the Caffe framework, which consists of 5 convolution layers, 3 pooling layers, and 2 full connection layers. For reducing the overtraining effect, we augmented the data by image rotation and intensity transformation. The test ROIs were classified into 7 tooth types by the trained network. The average classification accuracy using the augmented training data by image rotation and intensity transformation was 88.8%. Compared with the result without data augmentation, data augmentation resulted in an approximately 5% improvement in classification accuracy. This indicates that the further improvement can be expected by expanding the CT dataset. Unlike the conventional methods, the proposed method is advantageous in obtaining high classification accuracy without the need for precise tooth segmentation. The proposed tooth classification method can be useful in automatic filing of dental charts for forensic identification. Copyright © 2016 Elsevier Ltd. All rights reserved.
Analysis of near infrared spectra for age-grading of wild populations of Anopheles gambiae.
Krajacich, Benjamin J; Meyers, Jacob I; Alout, Haoues; Dabiré, Roch K; Dowell, Floyd E; Foy, Brian D
2017-11-07
Understanding the age-structure of mosquito populations, especially malaria vectors such as Anopheles gambiae, is important for assessing the risk of infectious mosquitoes, and how vector control interventions may impact this risk. The use of near-infrared spectroscopy (NIRS) for age-grading has been demonstrated previously on laboratory and semi-field mosquitoes, but to date has not been utilized on wild-caught mosquitoes whose age is externally validated via parity status or parasite infection stage. In this study, we developed regression and classification models using NIRS on datasets of wild An. gambiae (s.l.) reared from larvae collected from the field in Burkina Faso, and two laboratory strains. We compared the accuracy of these models for predicting the ages of wild-caught mosquitoes that had been scored for their parity status as well as for positivity for Plasmodium sporozoites. Regression models utilizing variable selection increased predictive accuracy over the more common full-spectrum partial least squares (PLS) approach for cross-validation of the datasets, validation, and independent test sets. Models produced from datasets that included the greatest range of mosquito samples (i.e. different sampling locations and times) had the highest predictive accuracy on independent testing sets, though overall accuracy on these samples was low. For classification, we found that intramodel accuracy ranged between 73.5-97.0% for grouping of mosquitoes into "early" and "late" age classes, with the highest prediction accuracy found in laboratory colonized mosquitoes. However, this accuracy was decreased on test sets, with the highest classification of an independent set of wild-caught larvae reared to set ages being 69.6%. Variation in NIRS data, likely from dietary, genetic, and other factors limits the accuracy of this technique with wild-caught mosquitoes. Alternative algorithms may help improve prediction accuracy, but care should be taken to either maximize variety in models or minimize confounders.
A review of supervised object-based land-cover image classification
NASA Astrophysics Data System (ADS)
Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue
2017-08-01
Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.
NASA Technical Reports Server (NTRS)
Tarabalka, Y.; Tilton, J. C.; Benediktsson, J. A.; Chanussot, J.
2012-01-01
The Hierarchical SEGmentation (HSEG) algorithm, which combines region object finding with region object clustering, has given good performances for multi- and hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. Two classification-based approaches for automatic marker selection are adapted and compared for this purpose. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. Three different implementations of the M-HSEG method are proposed and their performances in terms of classification accuracies are compared. The experimental results, presented for three hyperspectral airborne images, demonstrate that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for remote sensing image analysis.
Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora
2009-01-01
This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test.
Li, Zhaohua; Wang, Yuduo; Quan, Wenxiang; Wu, Tongning; Lv, Bin
2015-02-15
Based on near-infrared spectroscopy (NIRS), recent converging evidence has been observed that patients with schizophrenia exhibit abnormal functional activities in the prefrontal cortex during a verbal fluency task (VFT). Therefore, some studies have attempted to employ NIRS measurements to differentiate schizophrenia patients from healthy controls with different classification methods. However, no systematic evaluation was conducted to compare their respective classification performances on the same study population. In this study, we evaluated the classification performance of four classification methods (including linear discriminant analysis, k-nearest neighbors, Gaussian process classifier, and support vector machines) on an NIRS-aided schizophrenia diagnosis. We recruited a large sample of 120 schizophrenia patients and 120 healthy controls and measured the hemoglobin response in the prefrontal cortex during the VFT using a multichannel NIRS system. Features for classification were extracted from three types of NIRS data in each channel. We subsequently performed a principal component analysis (PCA) for feature selection prior to comparison of the different classification methods. We achieved a maximum accuracy of 85.83% and an overall mean accuracy of 83.37% using a PCA-based feature selection on oxygenated hemoglobin signals and support vector machine classifier. This is the first comprehensive evaluation of different classification methods for the diagnosis of schizophrenia based on different types of NIRS signals. Our results suggested that, using the appropriate classification method, NIRS has the potential capacity to be an effective objective biomarker for the diagnosis of schizophrenia. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Jung, Jinha; Pasolli, Edoardo; Prasad, Saurabh; Tilton, James C.; Crawford, Melba M.
2014-01-01
Acquiring current, accurate land-use information is critical for monitoring and understanding the impact of anthropogenic activities on natural environments.Remote sensing technologies are of increasing importance because of their capability to acquire information for large areas in a timely manner, enabling decision makers to be more effective in complex environments. Although optical imagery has demonstrated to be successful for land cover classification, active sensors, such as light detection and ranging (LiDAR), have distinct capabilities that can be exploited to improve classification results. However, utilization of LiDAR data for land cover classification has not been fully exploited. Moreover, spatial-spectral classification has recently gained significant attention since classification accuracy can be improved by extracting additional information from the neighboring pixels. Although spatial information has been widely used for spectral data, less attention has been given to LiDARdata. In this work, a new framework for land cover classification using discrete return LiDAR data is proposed. Pseudo-waveforms are generated from the LiDAR data and processed by hierarchical segmentation. Spatial featuresare extracted in a region-based way using a new unsupervised strategy for multiple pruning of the segmentation hierarchy. The proposed framework is validated experimentally on a real dataset acquired in an urban area. Better classification results are exhibited by the proposed framework compared to the cases in which basic LiDAR products such as digital surface model and intensity image are used. Moreover, the proposed region-based feature extraction strategy results in improved classification accuracies in comparison with a more traditional window-based approach.
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT
ERIC Educational Resources Information Center
Lathrop, Quinn N.
2015-01-01
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remains decisions a researcher faces when…
Variance estimates and confidence intervals for the Kappa measure of classification accuracy
M. A. Kalkhan; R. M. Reich; R. L. Czaplewski
1997-01-01
The Kappa statistic is frequently used to characterize the results of an accuracy assessment used to evaluate land use and land cover classifications obtained by remotely sensed data. This statistic allows comparisons of alternative sampling designs, classification algorithms, photo-interpreters, and so forth. In order to make these comparisons, it is...
Hyperspectral feature mapping classification based on mathematical morphology
NASA Astrophysics Data System (ADS)
Liu, Chang; Li, Junwei; Wang, Guangping; Wu, Jingli
2016-03-01
This paper proposed a hyperspectral feature mapping classification algorithm based on mathematical morphology. Without the priori information such as spectral library etc., the spectral and spatial information can be used to realize the hyperspectral feature mapping classification. The mathematical morphological erosion and dilation operations are performed respectively to extract endmembers. The spectral feature mapping algorithm is used to carry on hyperspectral image classification. The hyperspectral image collected by AVIRIS is applied to evaluate the proposed algorithm. The proposed algorithm is compared with minimum Euclidean distance mapping algorithm, minimum Mahalanobis distance mapping algorithm, SAM algorithm and binary encoding mapping algorithm. From the results of the experiments, it is illuminated that the proposed algorithm's performance is better than that of the other algorithms under the same condition and has higher classification accuracy.
Foliar and woody materials discriminated using terrestrial LiDAR in a mixed natural forest
NASA Astrophysics Data System (ADS)
Zhu, Xi; Skidmore, Andrew K.; Darvishzadeh, Roshanak; Niemann, K. Olaf; Liu, Jing; Shi, Yifang; Wang, Tiejun
2018-02-01
Separation of foliar and woody materials using remotely sensed data is crucial for the accurate estimation of leaf area index (LAI) and woody biomass across forest stands. In this paper, we present a new method to accurately separate foliar and woody materials using terrestrial LiDAR point clouds obtained from ten test sites in a mixed forest in Bavarian Forest National Park, Germany. Firstly, we applied and compared an adaptive radius near-neighbor search algorithm with a fixed radius near-neighbor search method in order to obtain both radiometric and geometric features derived from terrestrial LiDAR point clouds. Secondly, we used a random forest machine learning algorithm to classify foliar and woody materials and examined the impact of understory and slope on the classification accuracy. An average overall accuracy of 84.4% (Kappa = 0.75) was achieved across all experimental plots. The adaptive radius near-neighbor search method outperformed the fixed radius near-neighbor search method. The classification accuracy was significantly higher when the combination of both radiometric and geometric features was utilized. The analysis showed that increasing slope and understory coverage had a significant negative effect on the overall classification accuracy. Our results suggest that the utilization of the adaptive radius near-neighbor search method coupling both radiometric and geometric features has the potential to accurately discriminate foliar and woody materials from terrestrial LiDAR data in a mixed natural forest.
Automatic Fault Characterization via Abnormality-Enhanced Classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bronevetsky, G; Laguna, I; de Supinski, B R
Enterprise and high-performance computing systems are growing extremely large and complex, employing hundreds to hundreds of thousands of processors and software/hardware stacks built by many people across many organizations. As the growing scale of these machines increases the frequency of faults, system complexity makes these faults difficult to detect and to diagnose. Current system management techniques, which focus primarily on efficient data access and query mechanisms, require system administrators to examine the behavior of various system services manually. Growing system complexity is making this manual process unmanageable: administrators require more effective management tools that can detect faults and help tomore » identify their root causes. System administrators need timely notification when a fault is manifested that includes the type of fault, the time period in which it occurred and the processor on which it originated. Statistical modeling approaches can accurately characterize system behavior. However, the complex effects of system faults make these tools difficult to apply effectively. This paper investigates the application of classification and clustering algorithms to fault detection and characterization. We show experimentally that naively applying these methods achieves poor accuracy. Further, we design novel techniques that combine classification algorithms with information on the abnormality of application behavior to improve detection and characterization accuracy. Our experiments demonstrate that these techniques can detect and characterize faults with 65% accuracy, compared to just 5% accuracy for naive approaches.« less
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco
2016-10-01
The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.
Classification of radiolarian images with hand-crafted and deep features
NASA Astrophysics Data System (ADS)
Keçeli, Ali Seydi; Kaya, Aydın; Keçeli, Seda Uzunçimen
2017-12-01
Radiolarians are planktonic protozoa and are important biostratigraphic and paleoenvironmental indicators for paleogeographic reconstructions. Radiolarian paleontology still remains as a low cost and the one of the most convenient way to obtain dating of deep ocean sediments. Traditional methods for identifying radiolarians are time-consuming and cannot scale to the granularity or scope necessary for large-scale studies. Automated image classification will allow making these analyses promptly. In this study, a method for automatic radiolarian image classification is proposed on Scanning Electron Microscope (SEM) images of radiolarians to ease species identification of fossilized radiolarians. The proposed method uses both hand-crafted features like invariant moments, wavelet moments, Gabor features, basic morphological features and deep features obtained from a pre-trained Convolutional Neural Network (CNN). Feature selection is applied over deep features to reduce high dimensionality. Classification outcomes are analyzed to compare hand-crafted features, deep features, and their combinations. Results show that the deep features obtained from a pre-trained CNN are more discriminative comparing to hand-crafted ones. Additionally, feature selection utilizes to the computational cost of classification algorithms and have no negative effect on classification accuracy.
NASA Astrophysics Data System (ADS)
Matikainen, L.; Karila, K.; Hyyppä, J.; Puttonen, E.; Litkey, P.; Ahokas, E.
2017-10-01
This article summarises our first results and experiences on the use of multispectral airborne laser scanner (ALS) data. Optech Titan multispectral ALS data over a large suburban area in Finland were acquired on three different dates in 2015-2016. We investigated the feasibility of the data from the first date for land cover classification and road mapping. Object-based analyses with segmentation and random forests classification were used. The potential of the data for change detection of buildings and roads was also demonstrated. The overall accuracy of land cover classification results with six classes was 96 % compared with validation points. The data also showed high potential for road detection, road surface classification and change detection. The multispectral intensity information appeared to be very important for automated classifications. Compared to passive aerial images, the intensity images have interesting advantages, such as the lack of shadows. Currently, we focus on analyses and applications with the multitemporal multispectral data. Important questions include, for example, the potential and challenges of the multitemporal data for change detection.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.
Disregarding population specificity: its influence on the sex assessment methods from the tibia.
Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav
2017-01-01
Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5 %, sex bias ranged from -1.3 to -5.4 %). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from recent Czech population. To estimate the classification accuracy and error (misclassification) rates ignoring population specificity, we selected published classification functions of tibia for the Portuguese, south European, and the North American populations. These functions were applied on the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7 %), sex bias -29.4-100 %, which is most probably caused by secular trend and the generally high variability of body size. Results indicate that the discriminant functions, developed for skeletal series representing geographically and chronologically diverse populations, are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.
Hyperspectral analysis of columbia spotted frog habitat
Shive, J.P.; Pilliod, D.S.; Peterson, C.R.
2010-01-01
Wildlife managers increasingly are using remotely sensed imagery to improve habitat delineations and sampling strategies. Advances in remote sensing technology, such as hyperspectral imagery, provide more information than previously was available with multispectral sensors. We evaluated accuracy of high-resolution hyperspectral image classifications to identify wetlands and wetland habitat features important for Columbia spotted frogs (Rana luteiventris) and compared the results to multispectral image classification and United States Geological Survey topographic maps. The study area spanned 3 lake basins in the Salmon River Mountains, Idaho, USA. Hyperspectral data were collected with an airborne sensor on 30 June 2002 and on 8 July 2006. A 12-year comprehensive ground survey of the study area for Columbia spotted frog reproduction served as validation for image classifications. Hyperspectral image classification accuracy of wetlands was high, with a producer's accuracy of 96 (44 wetlands) correctly classified with the 2002 data and 89 (41 wetlands) correctly classified with the 2006 data. We applied habitat-based rules to delineate breeding habitat from other wetlands, and successfully predicted 74 (14 wetlands) of known breeding wetlands for the Columbia spotted frog. Emergent sedge microhabitat classification showed promise for directly predicting Columbia spotted frog egg mass locations within a wetland by correctly identifying 72 (23 of 32) of known locations. Our study indicates hyperspectral imagery can be an effective tool for mapping spotted frog breeding habitat in the selected mountain basins. We conclude that this technique has potential for improving site selection for inventory and monitoring programs conducted across similar wetland habitat and can be a useful tool for delineating wildlife habitats. ?? 2010 The Wildlife Society.
Three-Class Mammogram Classification Based on Descriptive CNN Features
Zhang, Qianni; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT), while in the second method discrete curvelet transform (DCT) is used. In both methods, dense scale invariant feature (DSIFT) for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN). Softmax layer and support vector machine (SVM) layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques. PMID:28191461
Three-Class Mammogram Classification Based on Descriptive CNN Features.
Jadoon, M Mohsin; Zhang, Qianni; Haq, Ihsan Ul; Butt, Sharjeel; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT), while in the second method discrete curvelet transform (DCT) is used. In both methods, dense scale invariant feature (DSIFT) for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN). Softmax layer and support vector machine (SVM) layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.
An Optimal Set of Flesh Points on Tongue and Lips for Speech-Movement Classification
Samal, Ashok; Rong, Panying; Green, Jordan R.
2016-01-01
Purpose The authors sought to determine an optimal set of flesh points on the tongue and lips for classifying speech movements. Method The authors used electromagnetic articulographs (Carstens AG500 and NDI Wave) to record tongue and lip movements from 13 healthy talkers who articulated 8 vowels, 11 consonants, a phonetically balanced set of words, and a set of short phrases during the recording. We used a machine-learning classifier (support-vector machine) to classify the speech stimuli on the basis of articulatory movements. We then compared classification accuracies of the flesh-point combinations to determine an optimal set of sensors. Results When data from the 4 sensors (T1: the vicinity between the tongue tip and tongue blade; T4: the tongue-body back; UL: the upper lip; and LL: the lower lip) were combined, phoneme and word classifications were most accurate and were comparable with the full set (including T2: the tongue-body front; and T3: the tongue-body front). Conclusion We identified a 4-sensor set—that is, T1, T4, UL, LL—that yielded a classification accuracy (91%–95%) equivalent to that using all 6 sensors. These findings provide an empirical basis for selecting sensors and their locations for scientific and emerging clinical applications that incorporate articulatory movements. PMID:26564030
Adebileje, Sikiru Afolabi; Ghasemi, Keyvan; Aiyelabegan, Hammed Tanimowo; Saligheh Rad, Hamidreza
2017-04-01
Proton magnetic resonance spectroscopy is a powerful noninvasive technique that complements the structural images of cMRI, which aids biomedical and clinical researches, by identifying and visualizing the compositions of various metabolites within the tissues of interest. However, accurate classification of proton magnetic resonance spectroscopy is still a challenging issue in clinics due to low signal-to-noise ratio, overlapping peaks of metabolites, and the presence of background macromolecules. This paper evaluates the performance of a discriminate dictionary learning classifiers based on projective dictionary pair learning method for brain gliomas proton magnetic resonance spectroscopy spectra classification task, and the result were compared with the sub-dictionary learning methods. The proton magnetic resonance spectroscopy data contain a total of 150 spectra (74 healthy, 23 grade II, 23 grade III, and 30 grade IV) from two databases. The datasets from both databases were first coupled together, followed by column normalization. The Kennard-Stone algorithm was used to split the datasets into its training and test sets. Performance comparison based on the overall accuracy, sensitivity, specificity, and precision was conducted. Based on the overall accuracy of our classification scheme, the dictionary pair learning method was found to outperform the sub-dictionary learning methods 97.78% compared with 68.89%, respectively. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
The Ex Vivo Eye Irritation Test as an alternative test method for serious eye damage/eye irritation.
Spöler, Felix; Kray, Oya; Kray, Stefan; Panfil, Claudia; Schrage, Norbert F
2015-07-01
Ocular irritation testing is a common requirement for the classification, labelling and packaging of chemicals (substances and mixtures). The in vivo Draize rabbit eye test (OECD Test Guideline 405) is considered to be the regulatory reference method for the classification of chemicals according to their potential to induce eye injury. In the Draize test, chemicals are applied to rabbit eyes in vivo, and changes are monitored over time. If no damage is observed, the chemical is not categorised. Otherwise, the classification depends on the severity and reversibility of the damage. Alternative test methods have to be designed to match the classifications from the in vivo reference method. However, observation of damage reversibility is usually not possible in vitro. Within the present study, a new organotypic method based on rabbit corneas obtained from food production is demonstrated to close this gap. The Ex Vivo Eye Irritation Test (EVEIT) retains the full biochemical activity of the corneal epithelium, epithelial stem cells and endothelium. This permits the in-depth analysis of ocular chemical trauma beyond that achievable by using established in vitro methods. In particular, the EVEIT is the first test to permit the direct monitoring of recovery of all corneal layers after damage. To develop a prediction model for the EVEIT that is comparable to the GHS system, 37 reference chemicals were analysed. The experimental data were used to derive a three-level potency ranking of eye irritation and corrosion that best fits the GHS categorisation. In vivo data available in the literature were used for comparison. When compared with GHS classification predictions, the overall accuracy of the three-level potency ranking was 78%. The classification of chemicals as irritating versus non-irritating resulted in 96% sensitivity, 91% specificity and 95% accuracy. 2015 FRAME.
Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A
2013-08-01
In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has been proved to be an effective way in recent years. However, current researches just care about the feature extraction and classifier design, and do not consider the instance selection. Former research by authors showed that the instance selection can lead to improvement on classification accuracy. However, no attention is paid on the relationship between speech sample and feature until now. Therefore, a new diagnosis algorithm of PD is proposed in this paper by simultaneously selecting speech sample and feature based on relevant feature weighting algorithm and multiple kernel method, so as to find their synergy effects, thereby improving classification accuracy. Experimental results showed that this proposed algorithm obtained apparent improvement on classification accuracy. It can obtain mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. Besides, the proposed algorithm detected the synergy effects of speech sample and feature, which is valuable for speech marker extraction.
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2018-06-01
Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results. The experimental results indicated that the CGDR technique achieved 12% to 15% improvement in accuracy compared with fully automated document representation baseline techniques. Moreover, two-level classification obtained better results compared with one-level classification. The promising results of the proposed conceptual graph-based document representation technique suggest that pathologists can adopt the proposed system as their basis for second opinion, thereby supporting them in effectively determining CoD. Copyright © 2018 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Navratil, Peter; Wilps, Hans
2013-01-01
Three different object-based image classification techniques are applied to high-resolution satellite data for the mapping of the habitats of Asian migratory locust (Locusta migratoria migratoria) in the southern Aral Sea basin, Uzbekistan. A set of panchromatic and multispectral Système Pour l'Observation de la Terre-5 satellite images was spectrally enhanced by normalized difference vegetation index and tasseled cap transformation and segmented into image objects, which were then classified by three different classification approaches: a rule-based hierarchical fuzzy threshold (HFT) classification method was compared to a supervised nearest neighbor classifier and classification tree analysis by the quick, unbiased, efficient statistical trees algorithm. Special emphasis was laid on the discrimination of locust feeding and breeding habitats due to the significance of this discrimination for practical locust control. Field data on vegetation and land cover, collected at the time of satellite image acquisition, was used to evaluate classification accuracy. The results show that a robust HFT classifier outperformed the two automated procedures by 13% overall accuracy. The classification method allowed a reliable discrimination of locust feeding and breeding habitats, which is of significant importance for the application of the resulting data for an economically and environmentally sound control of locust pests because exact spatial knowledge on the habitat types allows a more effective surveying and use of pesticides.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Getman, Daniel J
2008-01-01
Many attempts to observe changes in terrestrial systems over time would be significantly enhanced if it were possible to improve the accuracy of classifications of low-resolution historic satellite data. In an effort to examine improving the accuracy of historic satellite image classification by combining satellite and air photo data, two experiments were undertaken in which low-resolution multispectral data and high-resolution panchromatic data were combined and then classified using the ECHO spectral-spatial image classification algorithm and the Maximum Likelihood technique. The multispectral data consisted of 6 multispectral channels (30-meter pixel resolution) from Landsat 7. These data were augmented with panchromatic datamore » (15m pixel resolution) from Landsat 7 in the first experiment, and with a mosaic of digital aerial photography (1m pixel resolution) in the second. The addition of the Landsat 7 panchromatic data provided a significant improvement in the accuracy of classifications made using the ECHO algorithm. Although the inclusion of aerial photography provided an improvement in accuracy, this improvement was only statistically significant at a 40-60% level. These results suggest that once error levels associated with combining aerial photography and multispectral satellite data are reduced, this approach has the potential to significantly enhance the precision and accuracy of classifications made using historic remotely sensed data, as a way to extend the time range of efforts to track temporal changes in terrestrial systems.« less
ERIC Educational Resources Information Center
Wyse, Adam E.; Babcock, Ben
2016-01-01
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
A New Data Mining Scheme Using Artificial Neural Networks
Kamruzzaman, S. M.; Jehad Sarkar, A. M.
2011-01-01
Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are however often regarded as black boxes, i.e., their predictions cannot be explained. To enhance the explanation of ANNs, a novel algorithm to extract symbolic rules from ANNs has been proposed in this paper. ANN methods have not been effectively utilized for data mining tasks because how the classifications were made is not explicitly stated as symbolic rules that are suitable for verification or interpretation by human experts. With the proposed approach, concise symbolic rules with high accuracy, that are easily explainable, can be extracted from the trained ANNs. Extracted rules are comparable with other methods in terms of number of rules, average number of conditions for a rule, and the accuracy. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of benchmark data mining classification problems. PMID:22163866
Zhang, Wenyu; Zhang, Zhenjiang
2015-01-01
Decision fusion in sensor networks enables sensors to improve classification accuracy while reducing the energy consumption and bandwidth demand for data transmission. In this paper, we focus on the decentralized multi-class classification fusion problem in wireless sensor networks (WSNs) and a new simple but effective decision fusion rule based on belief function theory is proposed. Unlike existing belief function based decision fusion schemes, the proposed approach is compatible with any type of classifier because the basic belief assignments (BBAs) of each sensor are constructed on the basis of the classifier’s training output confusion matrix and real-time observations. We also derive explicit global BBA in the fusion center under Dempster’s combinational rule, making the decision making operation in the fusion center greatly simplified. Also, sending the whole BBA structure to the fusion center is avoided. Experimental results demonstrate that the proposed fusion rule has better performance in fusion accuracy compared with the naïve Bayes rule and weighted majority voting rule. PMID:26295399
NASA Astrophysics Data System (ADS)
Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.
2014-10-01
In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imageries. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to higher Hilbert dimension space. These kernel functions include linear, polynomials and Radial Based Function (RBF). The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of H/A/α decomposition method are used in classification. The experimental tests show an SVM classifier with RBF kernel for three dates of data increases the Overall Accuracy (OA) to up to 3% in comparison to using linear kernel function, and up to 1% in comparison to a 3rd degree polynomial kernel function.
Validation of Accelerometer Cut-Points in Children With Cerebral Palsy Aged 4 to 5 Years.
Keawutan, Piyapa; Bell, Kristie L; Oftedal, Stina; Davies, Peter S W; Boyd, Roslyn N
2016-01-01
To derive and validate triaxial accelerometer cut-points in children with cerebral palsy (CP) and compare these with previously established cut-points in children with typical development. Eighty-four children with CP aged 4 to 5 years wore the ActiGraph during a play-based gross motor function measure assessment that was video-taped for direct observation. Receiver operating characteristic and Bland-Altman plots were used for analyses. The ActiGraph had good classification accuracy in Gross Motor Function Classification System (GMFCS) levels III and V and fair classification accuracy in GMFCS levels I, II, and IV. These results support the use of the previously established cut-points for sedentary time of 820 counts per minute in children with CP aged 4 to 5 years across all functional abilities. The cut-point provides an objective measure of sedentary and active time in children with CP. The cut-point is applicable to group data but not for individual children.
Optical signal processing using photonic reservoir computing
NASA Astrophysics Data System (ADS)
Salehi, Mohammad Reza; Dehyadegari, Louiza
2014-10-01
As a new approach to recognition and classification problems, photonic reservoir computing has such advantages as parallel information processing, power efficient and high speed. In this paper, a photonic structure has been proposed for reservoir computing which is investigated using a simple, yet, non-partial noisy time series prediction task. This study includes the application of a suitable topology with self-feedbacks in a network of SOA's - which lends the system a strong memory - and leads to adjusting adequate parameters resulting in perfect recognition accuracy (100%) for noise-free time series, which shows a 3% improvement over previous results. For the classification of noisy time series, the rate of accuracy showed a 4% increase and amounted to 96%. Furthermore, an analytical approach was suggested to solve rate equations which led to a substantial decrease in the simulation time, which is an important parameter in classification of large signals such as speech recognition, and better results came up compared with previous works.
Laufer, Shlomi; D'Angelo, Anne-Lise D; Kwan, Calvin; Ray, Rebbeca D; Yudkowsky, Rachel; Boulet, John R; McGaghie, William C; Pugh, Carla M
2017-12-01
Develop new performance evaluation standards for the clinical breast examination (CBE). There are several, technical aspects of a proper CBE. Our recent work discovered a significant, linear relationship between palpation force and CBE accuracy. This article investigates the relationship between other technical aspects of the CBE and accuracy. This performance assessment study involved data collection from physicians (n = 553) attending 3 different clinical meetings between 2013 and 2014: American Society of Breast Surgeons, American Academy of Family Physicians, and American College of Obstetricians and Gynecologists. Four, previously validated, sensor-enabled breast models were used for clinical skills assessment. Models A and B had solitary, superficial, 2 cm and 1 cm soft masses, respectively. Models C and D had solitary, deep, 2 cm hard and moderately firm masses, respectively. Finger movements (search technique) from 1137 CBE video recordings were independently classified by 2 observers. Final classifications were compared with CBE accuracy. Accuracy rates were model A = 99.6%, model B = 89.7%, model C = 75%, and model D = 60%. Final classification categories for search technique included rubbing movement, vertical movement, piano fingers, and other. Interrater reliability was (k = 0.79). Rubbing movement was 4 times more likely to yield an accurate assessment (odds ratio 3.81, P < 0.001) compared with vertical movement and piano fingers. Piano fingers had the highest failure rate (36.5%). Regression analysis of search pattern, search technique, palpation force, examination time, and 6 demographic variables, revealed that search technique independently and significantly affected CBE accuracy (P < 0.001). Our results support measurement and classification of CBE techniques and provide the foundation for a new paradigm in teaching and assessing hands-on clinical skills. The newly described piano fingers palpation technique was noted to have unusually high failure rates. Medical educators should be aware of the potential differences in effectiveness for various CBE techniques.
A new pre-classification method based on associative matching method
NASA Astrophysics Data System (ADS)
Katsuyama, Yutaka; Minagawa, Akihiro; Hotta, Yoshinobu; Omachi, Shinichiro; Kato, Nei
2010-01-01
Reducing the time complexity of character matching is critical to the development of efficient Japanese Optical Character Recognition (OCR) systems. To shorten processing time, recognition is usually split into separate preclassification and recognition stages. For high overall recognition performance, the pre-classification stage must both have very high classification accuracy and return only a small number of putative character categories for further processing. Furthermore, for any practical system, the speed of the pre-classification stage is also critical. The associative matching (AM) method has often been used for fast pre-classification, because its use of a hash table and reliance solely on logical bit operations to select categories makes it highly efficient. However, redundant certain level of redundancy exists in the hash table because it is constructed using only the minimum and maximum values of the data on each axis and therefore does not take account of the distribution of the data. We propose a modified associative matching method that satisfies the performance criteria described above but in a fraction of the time by modifying the hash table to reflect the underlying distribution of training characters. Furthermore, we show that our approach outperforms pre-classification by clustering, ANN and conventional AM in terms of classification accuracy, discriminative power and speed. Compared to conventional associative matching, the proposed approach results in a 47% reduction in total processing time across an evaluation test set comprising 116,528 Japanese character images.
ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.
Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won
2016-07-01
In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolaemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide possibility of the future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.
Accuracy of automated classification of major depressive disorder as a function of symptom severity.
Ramasubbu, Rajamannar; Brown, Matthew R G; Cortese, Filmeno; Gaxiola, Ismael; Goodyear, Bradley; Greenshaw, Andrew J; Dursun, Serdar M; Greiner, Russell
2016-01-01
Growing evidence documents the potential of machine learning for developing brain based diagnostic methods for major depressive disorder (MDD). As symptom severity may influence brain activity, we investigated whether the severity of MDD affected the accuracies of machine learned MDD-vs-Control diagnostic classifiers. Forty-five medication-free patients with DSM-IV defined MDD and 19 healthy controls participated in the study. Based on depression severity as determined by the Hamilton Rating Scale for Depression (HRSD), MDD patients were sorted into three groups: mild to moderate depression (HRSD 14-19), severe depression (HRSD 20-23), and very severe depression (HRSD ≥ 24). We collected functional magnetic resonance imaging (fMRI) data during both resting-state and an emotional-face matching task. Patients in each of the three severity groups were compared against controls in separate analyses, using either the resting-state or task-based fMRI data. We use each of these six datasets with linear support vector machine (SVM) binary classifiers for identifying individuals as patients or controls. The resting-state fMRI data showed statistically significant classification accuracy only for the very severe depression group (accuracy 66%, p = 0.012 corrected), while mild to moderate (accuracy 58%, p = 1.0 corrected) and severe depression (accuracy 52%, p = 1.0 corrected) were only at chance. With task-based fMRI data, the automated classifier performed at chance in all three severity groups. Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.
Spatial and thematic assessment of object-based forest stand delineation using an OFA-matrix
NASA Astrophysics Data System (ADS)
Hernando, A.; Tiede, D.; Albrecht, F.; Lang, S.
2012-10-01
The delineation and classification of forest stands is a crucial aspect of forest management. Object-based image analysis (OBIA) can be used to produce detailed maps of forest stands from either orthophotos or very high resolution satellite imagery. However, measures are then required for evaluating and quantifying both the spatial and thematic accuracy of the OBIA output. In this paper we present an approach for delineating forest stands and a new Object Fate Analysis (OFA) matrix for accuracy assessment. A two-level object-based orthophoto analysis was first carried out to delineate stands on the Dehesa Boyal public land in central Spain (Avila Province). Two structural features were first created for use in class modelling, enabling good differentiation between stands: a relational tree cover cluster feature, and an arithmetic ratio shadow/tree feature. We then extended the OFA comparison approach with an OFA-matrix to enable concurrent validation of thematic and spatial accuracies. Its diagonal shows the proportion of spatial and thematic coincidence between a reference data and the corresponding classification. New parameters for Spatial Thematic Loyalty (STL), Spatial Thematic Loyalty Overall (STLOVERALL) and Maximal Interfering Object (MIO) are introduced to summarise the OFA-matrix accuracy assessment. A stands map generated by OBIA (classification data) was compared with a map of the same area produced from photo interpretation and field data (reference data). In our example the OFA-matrix results indicate good spatial and thematic accuracies (>65%) for all stand classes except for the shrub stands (31.8%), and a good STLOVERALL (69.8%). The OFA-matrix has therefore been shown to be a valid tool for OBIA accuracy assessment.
Estimating workload using EEG spectral power and ERPs in the n-back task
NASA Astrophysics Data System (ADS)
Brouwer, Anne-Marie; Hogervorst, Maarten A.; van Erp, Jan B. F.; Heffelaar, Tobias; Zimmerman, Patrick H.; Oostenveld, Robert
2012-08-01
Previous studies indicate that both electroencephalogram (EEG) spectral power (in particular the alpha and theta band) and event-related potentials (ERPs) (in particular the P300) can be used as a measure of mental work or memory load. We compare their ability to estimate workload level in a well-controlled task. In addition, we combine both types of measures in a single classification model to examine whether this results in higher classification accuracy than either one alone. Participants watched a sequence of visually presented letters and indicated whether or not the current letter was the same as the one (n instances) before. Workload was varied by varying n. We developed different classification models using ERP features, frequency power features or a combination (fusion). Training and testing of the models simulated an online workload estimation situation. All our ERP, power and fusion models provide classification accuracies between 80% and 90% when distinguishing between the highest and the lowest workload condition after 2 min. For 32 out of 35 participants, classification was significantly higher than chance level after 2.5 s (or one letter) as estimated by the fusion model. Differences between the models are rather small, though the fusion model performs better than the other models when only short data segments are available for estimating workload.
NASA Astrophysics Data System (ADS)
Malatesta, Luca; Attorre, Fabio; Altobelli, Alfredo; Adeeb, Ahmed; De Sanctis, Michele; Taleb, Nadim M.; Scholte, Paul T.; Vitale, Marcello
2013-01-01
Socotra Island (Yemen), a global biodiversity hotspot, is characterized by high geomorphological and biological diversity. In this study, we present a high-resolution vegetation map of the island based on combining vegetation analysis and classification with remote sensing. Two different image classification approaches were tested to assess the most accurate one in mapping the vegetation mosaic of Socotra. Spectral signatures of the vegetation classes were obtained through a Gaussian mixture distribution model, and a sequential maximum a posteriori (SMAP) classification was applied to account for the heterogeneity and the complex spatial pattern of the arid vegetation. This approach was compared to the traditional maximum likelihood (ML) classification. Satellite data were represented by a RapidEye image with 5 m pixel resolution and five spectral bands. Classified vegetation relevés were used to obtain the training and evaluation sets for the main plant communities. Postclassification sorting was performed to adjust the classification through various rule-based operations. Twenty-eight classes were mapped, and SMAP, with an accuracy of 87%, proved to be more effective than ML (accuracy: 66%). The resulting map will represent an important instrument for the elaboration of conservation strategies and the sustainable use of natural resources in the island.
Multi-class SVM model for fMRI-based classification and grading of liver fibrosis
NASA Astrophysics Data System (ADS)
Freiman, M.; Sela, Y.; Edrei, Y.; Pappo, O.; Joskowicz, L.; Abramovitch, R.
2010-03-01
We present a novel non-invasive automatic method for the classification and grading of liver fibrosis from fMRI maps based on hepatic hemodynamic changes. This method automatically creates a model for liver fibrosis grading based on training datasets. Our supervised learning method evaluates hepatic hemodynamics from an anatomical MRI image and three T2*-W fMRI signal intensity time-course scans acquired during the breathing of air, air-carbon dioxide, and carbogen. It constructs a statistical model of liver fibrosis from these fMRI scans using a binary-based one-against-all multi class Support Vector Machine (SVM) classifier. We evaluated the resulting classification model with the leave-one out technique and compared it to both full multi-class SVM and K-Nearest Neighbor (KNN) classifications. Our experimental study analyzed 57 slice sets from 13 mice, and yielded a 98.2% separation accuracy between healthy and low grade fibrotic subjects, and an overall accuracy of 84.2% for fibrosis grading. These results are better than the existing image-based methods which can only discriminate between healthy and high grade fibrosis subjects. With appropriate extensions, our method may be used for non-invasive classification and progression monitoring of liver fibrosis in human patients instead of more invasive approaches, such as biopsy or contrast-enhanced imaging.
Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin
2015-01-01
Machine learning algorithms play an important role in computer science research. Recent advancement in sensor data collection in clinical sciences lead to a complex, heterogeneous data processing, and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, development of Knowledge-based systems to support clinicians in decision-making is important. However, it is necessary to perform experimental work to compare performances of different machine learning methods to help to select appropriate method for a specific characteristic of data sets. This paper compares classification performance of three popular machine learning methods i.e., case-based reasoning, neutral networks and support vector machine to diagnose stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms other two methods in terms of classification accuracy. Case-based reasoning has achieved 80% and 86% accuracy to classify stress using finger temperature and heart rate variability. On contrary, both neural network and support vector machine have achieved less than 80% accuracy by using both physiological signals.
Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B
2015-03-01
Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.
Compensatory neurofuzzy model for discrete data classification in biomedical
NASA Astrophysics Data System (ADS)
Ceylan, Rahime
2015-03-01
Biomedical data is separated to two main sections: signals and discrete data. So, studies in this area are about biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models which are relevant to classification of ECG, EMG or EEG signals. In same way, in literature, many models exist for classification of discrete data taken as value of samples which can be results of blood analysis or biopsy in medical process. Each algorithm could not achieve high accuracy rate on classification of signal and discrete data. In this study, compensatory neurofuzzy network model is presented for classification of discrete data in biomedical pattern recognition area. The compensatory neurofuzzy network has a hybrid and binary classifier. In this system, the parameters of fuzzy systems are updated by backpropagation algorithm. The realized classifier model is conducted to two benchmark datasets (Wisconsin Breast Cancer dataset and Pima Indian Diabetes dataset). Experimental studies show that compensatory neurofuzzy network model achieved 96.11% accuracy rate in classification of breast cancer dataset and 69.08% accuracy rate was obtained in experiments made on diabetes dataset with only 10 iterations.
Younghak Shin; Balasingham, Ilangko
2017-07-01
Colonoscopy is a standard method for screening polyps by highly trained physicians. Miss-detected polyps in colonoscopy are potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches named hand-craft feature method and convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand craft feature extraction and support vector machine (SVM) method is adopted for classification. For CNN approach, three convolution and pooling based deep learning framework is used for classification purpose. The proposed framework is evaluated using three public polyp databases. From the experimental results, we have shown that the CNN based deep learning framework shows better classification performance than the hand-craft feature based methods. It achieves over 90% of classification accuracy, sensitivity, specificity and precision.
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Sadowski, F. E.; Sarno, J. E.
1976-01-01
The author has identified the following significant results. A supervised classification within two separate ground areas of the Sam Houston National Forest was carried out for two sq meters spatial resolution MSS data. Data were progressively coarsened to simulate five additional cases of spatial resolution ranging up to 64 sq meters. Similar processing and analysis of all spatial resolutions enabled evaluations of the effect of spatial resolution on classification accuracy for various levels of detail and the effects on area proportion estimation for very general forest features. For very coarse resolutions, a subset of spectral channels which simulated the proposed thematic mapper channels was used to study classification accuracy.
NASA Technical Reports Server (NTRS)
Card, Don H.; Strong, Laurence L.
1989-01-01
An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
Cirrhosis Diagnosis and Liver Fibrosis Staging: Transient Elastometry Versus Cirrhosis Blood Test.
Calès, Paul; Boursier, Jérôme; Oberti, Frédéric; Bardou, Derek; Zarski, Jean-Pierre; de Lédinghen, Victor
2015-07-01
Elastometry is more accurate than blood tests for cirrhosis diagnosis. However, blood tests were developed for significant fibrosis, with the exception of CirrhoMeter developed for cirrhosis. We compared the performance of Fibroscan and CirrhoMeter, and classic binary cirrhosis diagnosis versus new fibrosis staging for cirrhosis diagnosis. The diagnostic population included 679 patients with hepatitis C and liver biopsy (Metavir staging and morphometry), Fibroscan, and CirrhoMeter. The prognostic population included 1110 patients with chronic liver disease and both tests. Binary diagnosis: AUROCs for cirrhosis were: Fibroscan: 0.905; CirrhoMeter: 0.857; and P=0.041. Accuracy (Youden cutoff) was: Fibroscan: 85.4%; CirrhoMeter: 79.2%; and P<0.001. Fibrosis classification provided 6 classes (F0/1, F1/2, F2±1, F3±1, F3/4, and F4). Accuracy was: Fibroscan: 88.2%; CirrhoMeter: 88.8%; and P=0.77. A simplified fibrosis classification comprised 3 categories: discrete (F1±1), moderate (F2±1), and severe (F3/4) fibrosis. Using this simplified classification, CirrhoMeter predicted survival better than Fibroscan (respectively, χ=37.9 and 19.7 by log-rank test), but both predicted it well (P<0.001 by log-rank test). Comparison: binary diagnosis versus fibrosis classification, respectively, overall accuracy: CirrhoMeter: 79.2% versus 88.8% (P<0.001); Fibroscan: 85.4% versus 88.2% (P=0.127); positive predictive value for cirrhosis by Fibroscan: Youden cutoff (11.1 kPa): 49.1% versus cutoffs of F3/4 (17.6 kPa): 67.6% and F4 classes (25.7 kPa): 82.4%. Fibroscan's usual binary cutoffs for cirrhosis diagnosis are not sufficiently accurate. Fibrosis classification should be preferred over binary diagnosis. A cirrhosis-specific blood test markedly attenuates the accuracy deficit for cirrhosis diagnosis of usual blood tests versus transient elastometry, and may offer better prognostication.
Detection of epileptic seizure in EEG signals using linear least squares preprocessing.
Roshan Zamir, Z
2016-09-01
An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be obstructed by detection of electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been developed based on artificial neural network, genetic programming, and wavelet transforms. Although the highest achieved accuracy for classification is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigations in performances consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with the classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. Numerical results suggest that these models are robust and efficient for detecting epileptic seizure. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Mejia Tobar, Alejandra; Hyoudou, Rikiya; Kita, Kahori; Nakamura, Tatsuhiro; Kambara, Hiroyuki; Ogata, Yousuke; Hanakawa, Takashi; Koike, Yasuharu; Yoshimura, Natsue
2017-01-01
The classification of ankle movements from non-invasive brain recordings can be applied to a brain-computer interface (BCI) to control exoskeletons, prosthesis, and functional electrical stimulators for the benefit of patients with walking impairments. In this research, ankle flexion and extension tasks at two force levels in both legs, were classified from cortical current sources estimated by a hierarchical variational Bayesian method, using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings. The hierarchical prior for the current source estimation from EEG was obtained from activated brain areas and their intensities from an fMRI group (second-level) analysis. The fMRI group analysis was performed on regions of interest defined over the primary motor cortex, the supplementary motor area, and the somatosensory area, which are well-known to contribute to movement control. A sparse logistic regression method was applied for a nine-class classification (eight active tasks and a resting control task) obtaining a mean accuracy of 65.64% for time series of current sources, estimated from the EEG and the fMRI signals using a variational Bayesian method, and a mean accuracy of 22.19% for the classification of the pre-processed of EEG sensor signals, with a chance level of 11.11%. The higher classification accuracy of current sources, when compared to EEG classification accuracy, was attributed to the high number of sources and the different signal patterns obtained in the same vertex for different motor tasks. Since the inverse filter estimation for current sources can be done offline with the present method, the present method is applicable to real-time BCIs. Finally, due to the highly enhanced spatial distribution of current sources over the brain cortex, this method has the potential to identify activation patterns to design BCIs for the control of an affected limb in patients with stroke, or BCIs from motor imagery in patients with spinal cord injury.
Yang, Hao; Zhang, Junran; Jiang, Xiaomei; Liu, Fei
2018-04-01
In recent years, with the rapid development of machine learning techniques,the deep learning algorithm has been widely used in one-dimensional physiological signal processing. In this paper we used electroencephalography (EEG) signals based on deep belief network (DBN) model in open source frameworks of deep learning to identify emotional state (positive, negative and neutrals), then the results of DBN were compared with support vector machine (SVM). The EEG signals were collected from the subjects who were under different emotional stimuli, and DBN and SVM were adopted to identify the EEG signals with changes of different characteristics and different frequency bands. We found that the average accuracy of differential entropy (DE) feature by DBN is 89.12%±6.54%, which has a better performance than previous research based on the same data set. At the same time, the classification effects of DBN are better than the results from traditional SVM (the average classification accuracy of 84.2%±9.24%) and its accuracy and stability have a better trend. In three experiments with different time points, single subject can achieve the consistent results of classification by using DBN (the mean standard deviation is1.44%), and the experimental results show that the system has steady performance and good repeatability. According to our research, the characteristic of DE has a better classification result than other characteristics. Furthermore, the Beta band and the Gamma band in the emotional recognition model have higher classification accuracy. To sum up, the performances of classifiers have a promotion by using the deep learning algorithm, which has a reference for establishing a more accurate system of emotional recognition. Meanwhile, we can trace through the results of recognition to find out the brain regions and frequency band that are related to the emotions, which can help us to understand the emotional mechanism better. This study has a high academic value and practical significance, so further investigation still needs to be done.
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combination of both solutions to improve the prediction accuracy was never explored before. We developed two algorithms, HH-fold and SVM-fold for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features are extracted from three complementary sequence profiles. These two algorithms are then combined, resulting to the ensemble approach TA-fold. We performed a comprehensive assessment for the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset that consists of proteins from 27 folds. This represents improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information. http://yanglab.nankai.edu.cn/TA-fold/. yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Terrain-Moisture Classification Using GPS Surface-Reflected Signals
NASA Technical Reports Server (NTRS)
Grant, Michael S.; Acton, Scott T.; Katzberg, Stephen J.
2006-01-01
In this study we present a novel method of land surface classification using surface-reflected GPS signals in combination with digital imagery. Two GPS-derived classification features are merged with visible image data to create terrain-moisture (TM) classes, defined here as visibly identifiable terrain or landcover classes containing a surface/soil moisture component. As compared to using surface imagery alone, classification accuracy is significantly improved for a number of visible classes when adding the GPS-based signal features. Since the strength of the reflected GPS signal is proportional to the amount of moisture in the surface, use of these GPS features provides information about the surface that is not obtainable using visible wavelengths alone. Application areas include hydrology, precision agriculture, and wetlands mapping.
Gas Classification Using Deep Convolutional Neural Networks.
Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin
2018-01-08
In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP).
Gas Classification Using Deep Convolutional Neural Networks
Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin
2018-01-01
In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP). PMID:29316723
A supervised learning rule for classification of spatiotemporal spike patterns.
Lilin Guo; Zhenzhong Wang; Adjouadi, Malek
2016-08-01
This study introduces a novel supervised algorithm for spiking neurons that take into consideration synapse delays and axonal delays associated with weights. It can be utilized for both classification and association and uses several biologically influenced properties, such as axonal and synaptic delays. This algorithm also takes into consideration spike-timing-dependent plasticity as in Remote Supervised Method (ReSuMe). This paper focuses on the classification aspect alone. Spiked neurons trained according to this proposed learning rule are capable of classifying different categories by the associated sequences of precisely timed spikes. Simulation results have shown that the proposed learning method greatly improves classification accuracy when compared to the Spike Pattern Association Neuron (SPAN) and the Tempotron learning rule.
Statistical sensor fusion of ECG data using automotive-grade sensors
NASA Astrophysics Data System (ADS)
Koenig, A.; Rehg, T.; Rasshofer, R.
2015-11-01
Driver states such as fatigue, stress, aggression, distraction or even medical emergencies continue to be yield to severe mistakes in driving and promote accidents. A pathway towards improving driver state assessment can be found in psycho-physiological measures to directly quantify the driver's state from physiological recordings. Although heart rate is a well-established physiological variable that reflects cognitive stress, obtaining heart rate contactless and reliably is a challenging task in an automotive environment. Our aim was to investigate, how sensory fusion of two automotive grade sensors would influence the accuracy of automatic classification of cognitive stress levels. We induced cognitive stress in subjects and estimated levels from their heart rate signals, acquired from automotive ready ECG sensors. Using signal quality indices and Kalman filters, we were able to decrease Root Mean Squared Error (RMSE) of heart rate recordings by 10 beats per minute. We then trained a neural network to classify the cognitive workload state of subjects from heart rate and compared classification performance for ground truth, the individual sensors and the fused heart rate signal. We obtained an increase of 5 % higher correct classification by fusing signals as compared to individual sensors, staying only 4 % below the maximally possible classification accuracy from ground truth. These results are a first step towards real world applications of psycho-physiological measurements in vehicle settings. Future implementations of driver state modeling will be able to draw from a larger pool of data sources, such as additional physiological values or vehicle related data, which can be expected to drive classification to significantly higher values.
Montgomery, Valencia; Harris, Katie; Stabler, Anthony; Lu, Lisa H
2017-05-01
To examine how the duration of time delay between Wechsler Memory Scale (WMS) Logical Memory I and Logical Memory II (LM) affected participants' recall performance. There are 46,146 total Logical Memory administrations to participants diagnosed with either Alzheimer's disease (AD), vascular dementia (VaD), or normal cognition in the National Alzheimer's Disease Coordinating Center's Uniform Data Set. Only 50% of the sample was administered the standard 20-35 min of delay as specified by WMS-R and WMS-III. We found a significant effect of delay time duration on proportion of information retained for the VaD group compared to its control group, which remained after adding LMI raw score as a covariate. There was poorer retention of information with longer delay for this group. This association was not as strong for the AD and cognitively normal groups. A 24.5-min delay was most optimal for differentiating AD from VaD participants (47.7% classification accuracy), an 18.5-min delay was most optimal for differentiating AD versus normal participants (51.7% classification accuracy), and a 22.5-min delay was most optimal for differentiating VaD versus normal participants (52.9% classification accuracy). Considering diagnostic implications, our findings suggest that test administration should incorporate precise tracking of delay periods. We recommend a 20-min delay with 18-25-min range. Poor classification accuracy based on LM data alone is a reminder that story memory performance is only one piece of data that contributes to complex clinical decisions. However, strict adherence to the recommended range yields optimal data for diagnostic decisions. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston
2014-11-01
Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-01-01
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification. PMID:28025525
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-12-22
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification.
A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification
Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.
2015-01-01
In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier; the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6 % compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, Q- and the Hotelling’s \\documentclass[12pt]{minimal} \\usepackage{amsmath} \\usepackage{wasysym} \\usepackage{amsfonts} \\usepackage{amssymb} \\usepackage{amsbsy} \\usepackage{upgreek} \\usepackage{mathrsfs} \\setlength{\\oddsidemargin}{-69pt} \\begin{document} }{}$T^{2}$ \\end{document} statistics of the transformed EEG and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898
Mehrang, Saeed; Pietilä, Julia; Korhonen, Ilkka
2018-02-22
Wrist-worn sensors have better compliance for activity monitoring compared to hip, waist, ankle or chest positions. However, wrist-worn activity monitoring is challenging due to the wide degree of freedom for the hand movements, as well as similarity of hand movements in different activities such as varying intensities of cycling. To strengthen the ability of wrist-worn sensors in detecting human activities more accurately, motion signals can be complemented by physiological signals such as optical heart rate (HR) based on photoplethysmography. In this paper, an activity monitoring framework using an optical HR sensor and a triaxial wrist-worn accelerometer is presented. We investigated a range of daily life activities including sitting, standing, household activities and stationary cycling with two intensities. A random forest (RF) classifier was exploited to detect these activities based on the wrist motions and optical HR. The highest overall accuracy of 89.6 ± 3.9% was achieved with a forest of a size of 64 trees and 13-s signal segments with 90% overlap. Removing the HR-derived features decreased the classification accuracy of high-intensity cycling by almost 7%, but did not affect the classification accuracies of other activities. A feature reduction utilizing the feature importance scores of RF was also carried out and resulted in a shrunken feature set of only 21 features. The overall accuracy of the classification utilizing the shrunken feature set was 89.4 ± 4.2%, which is almost equivalent to the above-mentioned peak overall accuracy.
Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.
2014-01-01
Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592
Land cover mapping in Latvia using hyperspectral airborne and simulated Sentinel-2 data
NASA Astrophysics Data System (ADS)
Jakovels, Dainis; Filipovs, Jevgenijs; Brauns, Agris; Taskovs, Juris; Erins, Gatis
2016-08-01
Land cover mapping in Latvia is performed as part of the Corine Land Cover (CLC) initiative every six years. The advantage of CLC is the creation of a standardized nomenclature and mapping protocol comparable across all European countries, thereby making it a valuable information source at the European level. However, low spatial resolution and accuracy, infrequent updates and expensive manual production has limited its use at the national level. As of now, there is no remote sensing based high resolution land cover and land use services designed specifically for Latvia which would account for the country's natural and land use specifics and end-user interests. The European Space Agency launched the Sentinel-2 satellite in 2015 aiming to provide continuity of free high resolution multispectral satellite data thereby presenting an opportunity to develop and adapted land cover and land use algorithm which accounts for national enduser needs. In this study, land cover mapping scheme according to national end-user needs was developed and tested in two pilot territories (Cesis and Burtnieki). Hyperspectral airborne data covering spectral range 400-2500 nm was acquired in summer 2015 using Airborne Surveillance and Environmental Monitoring System (ARSENAL). The gathered data was tested for land cover classification of seven general classes (urban/artificial, bare, forest, shrubland, agricultural/grassland, wetlands, water) and sub-classes specific for Latvia as well as simulation of Sentinel-2 satellite data. Hyperspectral data sets consist of 122 spectral bands in visible to near infrared spectral range (356-950 nm) and 100 bands in short wave infrared (950-2500 nm). Classification of land cover was tested separately for each sensor data and fused cross-sensor data. The best overall classification accuracy 84.2% and satisfactory classification accuracy (more than 80%) for 9 of 13 classes was obtained using Support Vector Machine (SVM) classifier with 109 band hyperspectral data. Grassland and agriculture land demonstrated lowest classification accuracy in pixel based approach, but result significantly improved by looking at agriculture polygons registered in Rural Support Service data as objects. The test of simulated Sentinel-2 bands for land cover mapping using SVM classifier showed 82.8% overall accuracy and satisfactory separation of 7 classes. SVM provided highest overall accuracy 84.2% in comparison to 75.9% for k-Nearest Neighbor and 79.2% Linear Discriminant Analysis classifiers.
Classification accuracy for stratification with remotely sensed data
Raymond L. Czaplewski; Paul L. Patterson
2003-01-01
Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an âerror matrix,â which is...
Sosenko, Jay M; Skyler, Jay S; Mahon, Jeffrey; Krischer, Jeffrey P; Greenbaum, Carla J; Rafkin, Lisa E; Beam, Craig A; Boulware, David C; Matheson, Della; Cuthbertson, David; Herold, Kevan C; Eisenbarth, George; Palmer, Jerry P
2014-04-01
OBJECTIVE We studied the utility of the Diabetes Prevention Trial-Type 1 Risk Score (DPTRS) for improving the accuracy of type 1 diabetes (T1D) risk classification in TrialNet Natural History Study (TNNHS) participants. RESEARCH DESIGN AND METHODS The cumulative incidence of T1D was compared between normoglycemic individuals with DPTRS values >7.00 and dysglycemic individuals in the TNNHS (n = 991). It was also compared between individuals with DPTRS values <7.00 or >7.00 among those with dysglycemia and those with multiple autoantibodies in the TNNHS. DPTRS values >7.00 were compared with dysglycemia for characterizing risk in Diabetes Prevention Trial-Type 1 (DPT-1) (n = 670) and TNNHS participants. The reliability of DPTRS values >7.00 was compared with dysglycemia in the TNNHS. RESULTS The cumulative incidence of T1D for normoglycemic TNNHS participants with DPTRS values >7.00 was comparable to those with dysglycemia. Among those with dysglycemia, the cumulative incidence was much higher (P < 0.001) for those with DPTRS values >7.00 than for those with values <7.00 (3-year risks: 0.16 for <7.00 and 0.46 for >7.00). Dysglycemic individuals in DPT-1 were at much higher risk for T1D than those with dysglycemia in the TNNHS (P < 0.001); there was no significant difference in risk between the studies among those with DPTRS values >7.00. The proportion in the TNNHS reverting from dysglycemia to normoglycemia at the next visit was higher than the proportion reverting from DPTRS values >7.00 to values <7.00 (36 vs. 23%). CONCLUSIONS DPTRS thresholds can improve T1D risk classification accuracy by identifying high-risk normoglycemic and low-risk dysglycemic individuals. The 7.00 DPTRS threshold characterizes risk more consistently between populations and has greater reliability than dysglycemia.
Kruskal-Wallis-based computationally efficient feature selection for face recognition.
Ali Khan, Sajid; Hussain, Ayyaz; Basit, Abdul; Akram, Sheeraz
2014-01-01
Face recognition in today's technological world, and face recognition applications attain much more importance. Most of the existing work used frontal face images to classify face image. However these techniques fail when applied on real world face images. The proposed technique effectively extracts the prominent facial features. Most of the features are redundant and do not contribute to representing face. In order to eliminate those redundant features, computationally efficient algorithm is used to select the more discriminative face features. Extracted features are then passed to classification step. In the classification step, different classifiers are ensemble to enhance the recognition accuracy rate as single classifier is unable to achieve the high accuracy. Experiments are performed on standard face database images and results are compared with existing techniques.
Cooperative Learning for Distributed In-Network Traffic Classification
NASA Astrophysics Data System (ADS)
Joseph, S. B.; Loo, H. R.; Ismail, I.; Andromeda, T.; Marsono, M. N.
2017-04-01
Inspired by the concept of autonomic distributed/decentralized network management schemes, we consider the issue of information exchange among distributed network nodes to network performance and promote scalability for in-network monitoring. In this paper, we propose a cooperative learning algorithm for propagation and synchronization of network information among autonomic distributed network nodes for online traffic classification. The results show that network nodes with sharing capability perform better with a higher average accuracy of 89.21% (sharing data) and 88.37% (sharing clusters) compared to 88.06% for nodes without cooperative learning capability. The overall performance indicates that cooperative learning is promising for distributed in-network traffic classification.
NASA Astrophysics Data System (ADS)
Hramov, Alexander E.; Frolov, Nikita S.; Musatov, Vyachaslav Yu.
2018-02-01
In present work we studied features of the human brain states classification, corresponding to the real movements of hands and legs. For this purpose we used supervised learning algorithm based on feed-forward artificial neural networks (ANNs) with error back-propagation along with the support vector machine (SVM) method. We compared the quality of operator movements classification by means of EEG signals obtained experimentally in the absence of preliminary processing and after filtration in different ranges up to 25 Hz. It was shown that low-frequency filtering of multichannel EEG data significantly improved accuracy of operator movements classification.
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on Sleep-EDF dataset with various classification settings. The overall accuracy for Awake/Sleep and 4-class classification setting are 98.32% and 94.49% respectively. Furthermore, the superior accuracy is achieved by performing classification on a low-dimensional feature space derived from time and frequency domains and without the need for artifact removal as a preprocessing step.
NASA Astrophysics Data System (ADS)
Hoffmeister, Dirk; Kramm, Tanja; Curdt, Constanze; Maleki, Sedigheh; Khormali, Farhad; Kehl, Martin
2016-04-01
The Iranian loess plateau is covered by loess deposits, up to 70 m thick. Tectonic uplift triggered deep erosion and valley incision into the loess and underlying marine deposits. Soil development strongly relates to the aspect of these incised slopes, because on northern slopes vegetation protects the soil surface against erosion and facilitates formation and preservation of a Cambisol, whereas on south-facing slopes soils were probably eroded and weakly developed Entisols formed. While the whole area is intensively stocked with sheep and goat, rain-fed cropping of winter wheat is practiced on the valley floors. Most time of the year, the soil surface is unprotected against rainfall, which is one of the factors promoting soil erosion and serious flooding. However, little information is available on soil distribution, plant cover and the geomorphological evolution of the plateau, as well as on potentials and problems in land use. Thus, digital landform and soil mapping is needed. As a requirement of digital landform and soil mapping, four different landform classification methods were compared and evaluated. These geomorphometric classifications were run on two different scales. On the whole area an ASTER GDEM and SRTM dataset (30 m pixel resolution) was used. Likewise, two high-resolution digital elevation models were derived from Pléiades satellite stereo-imagery (< 1m pixel resolution, 10 by 10 km). The high-resolution information of this dataset was aggregated to datasets of 5 and 10 m scale. The applied classification methods are the Geomorphons approach, an object-based image approach, the topographical position index and a mainly slope based approach. The accuracy of the classification was checked with a location related image dataset obtained in a field survey (n ~ 150) in September 2015. The accuracy of the DEMs was compared to measured DGPS trenches and map-based elevation data. The overall derived accuracy of the landform classification based on the high-resolution DEM with a resolution of 5 m is approximately 70% and on a 10 m resolution >58%. For the 30 m resolution datasets is the achieved accuracy approximately 40%, as several small scale features are not recognizable in this resolution. Thus, for an accurate differentiation between different important landform types, high-resolution datasets are necessary for this strongly shaped area. One major problem of this approach are the different classes derived by each method and the various class annotations. The result of this evaluation will be regarded for the derivation of landform and soil maps.
NASA Astrophysics Data System (ADS)
Zhu, Zhe; Gallant, Alisa L.; Woodcock, Curtis E.; Pengra, Bruce; Olofsson, Pontus; Loveland, Thomas R.; Jin, Suming; Dahal, Devendra; Yang, Limin; Auch, Roger F.
2016-12-01
The U.S. Geological Survey's Land Change Monitoring, Assessment, and Projection (LCMAP) initiative is a new end-to-end capability to continuously track and characterize changes in land cover, use, and condition to better support research and applications relevant to resource management and environmental change. Among the LCMAP product suite are annual land cover maps that will be available to the public. This paper describes an approach to optimize the selection of training and auxiliary data for deriving the thematic land cover maps based on all available clear observations from Landsats 4-8. Training data were selected from map products of the U.S. Geological Survey's Land Cover Trends project. The Random Forest classifier was applied for different classification scenarios based on the Continuous Change Detection and Classification (CCDC) algorithm. We found that extracting training data proportionally to the occurrence of land cover classes was superior to an equal distribution of training data per class, and suggest using a total of 20,000 training pixels to classify an area about the size of a Landsat scene. The problem of unbalanced training data was alleviated by extracting a minimum of 600 training pixels and a maximum of 8000 training pixels per class. We additionally explored removing outliers contained within the training data based on their spectral and spatial criteria, but observed no significant improvement in classification results. We also tested the importance of different types of auxiliary data that were available for the conterminous United States, including: (a) five variables used by the National Land Cover Database, (b) three variables from the cloud screening "Function of mask" (Fmask) statistics, and (c) two variables from the change detection results of CCDC. We found that auxiliary variables such as a Digital Elevation Model and its derivatives (aspect, position index, and slope), potential wetland index, water probability, snow probability, and cloud probability improved the accuracy of land cover classification. Compared to the original strategy of the CCDC algorithm (500 pixels per class), the use of the optimal strategy improved the classification accuracies substantially (15-percentage point increase in overall accuracy and 4-percentage point increase in minimum accuracy).
ERIC Educational Resources Information Center
Bramley, Tom
2010-01-01
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Thematic accuracy of the National Land Cover Database (NLCD) 2001 land cover for Alaska
Selkowitz, D.J.; Stehman, S.V.
2011-01-01
The National Land Cover Database (NLCD) 2001 Alaska land cover classification is the first 30-m resolution land cover product available covering the entire state of Alaska. The accuracy assessment of the NLCD 2001 Alaska land cover classification employed a geographically stratified three-stage sampling design to select the reference sample of pixels. Reference land cover class labels were determined via fixed wing aircraft, as the high resolution imagery used for determining the reference land cover classification in the conterminous U.S. was not available for most of Alaska. Overall thematic accuracy for the Alaska NLCD was 76.2% (s.e. 2.8%) at Level II (12 classes evaluated) and 83.9% (s.e. 2.1%) at Level I (6 classes evaluated) when agreement was defined as a match between the map class and either the primary or alternate reference class label. When agreement was defined as a match between the map class and primary reference label only, overall accuracy was 59.4% at Level II and 69.3% at Level I. The majority of classification errors occurred at Level I of the classification hierarchy (i.e., misclassifications were generally to a different Level I class, not to a Level II class within the same Level I class). Classification accuracy was higher for more abundant land cover classes and for pixels located in the interior of homogeneous land cover patches. ?? 2011.
NASA Astrophysics Data System (ADS)
Kurniawan, Dian; Suparti; Sugito
2018-05-01
Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia has reached 237.6 million people. Therefore, to control the population growth rate, the government hold Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to manifest prosperous society by controlling births while ensuring control of population growth. The data used in this study is the updated family data of Semarang city in 2016 that conducted by National Family Planning Coordinating Board (BKKBN). From these data, classifiers with kernel discriminant analysis will be obtained, and also classification accuracy will be obtained from that method. The result of the analysis showed that normal kernel discriminant analysis gives 71.05 % classification accuracy with 28.95 % classification error. Whereas triweight kernel discriminant analysis gives 73.68 % classification accuracy with 26.32 % classification error. Using triweight kernel discriminant for data preprocessing of family planning participation of childbearing age couples in Semarang City of 2016 can be stated better than with normal kernel discriminant.
Zmiri, Dror; Shahar, Yuval; Taieb-Maimon, Meirav
2012-04-01
To test the feasibility of classifying emergency department patients into severity grades using data mining methods. Emergency department records of 402 patients were classified into five severity grades by two expert physicians. The Naïve Bayes and C4.5 algorithms were applied to produce classifiers from patient data into severity grades. The classifiers' results over several subsets of the data were compared with the physicians' assessments, with a random classifier, and with a classifier that selects the maximal-prevalence class. Positive predictive value, multiple-class extensions of sensitivity and specificity combinations, and entropy change. The mean accuracy of the data mining classifiers was 52.94 ± 5.89%, significantly better (P < 0.05) than the mean accuracy of a random classifier (34.60 ± 2.40%). The entropy of the input data sets was reduced through classification by a mean of 10.1%. Allowing for classification deviations of one severity grade led to mean accuracy of 85.42 ± 1.42%. The classifiers' accuracy in that case was similar to the physicians' consensus rate. Learning from consensus records led to better performance. Reducing the number of severity grades improved results in certain cases. The performance of the Naïve Bayes and C4.5 algorithms was similar; in unbalanced data sets, Naïve Bayes performed better. It is possible to produce a computerized classification model for the severity grade of triage patients, using data mining methods. Learning from patient records regarding which there is a consensus of several physicians is preferable to learning from each physician's patients. Either Naïve Bayes or C4.5 can be used; Naïve Bayes is preferable for unbalanced data sets. An ambiguity in the intermediate severity grades seems to hamper both the physicians' agreement and the classifiers' accuracy. © 2010 Blackwell Publishing Ltd.
The study of vehicle classification equipment with solutions to improve accuracy in Oklahoma.
DOT National Transportation Integrated Search
2014-12-01
The accuracy of vehicle counting and classification data is vital for appropriate future highway and road : design, including determining pavement characteristics, eliminating traffic jams, and improving safety. : Organizations relying on vehicle cla...
Performance Analysis of Classification Methods for Indoor Localization in Vlc Networks
NASA Astrophysics Data System (ADS)
Sánchez-Rodríguez, D.; Alonso-González, I.; Sánchez-Medina, J.; Ley-Bosch, C.; Díaz-Vilariño, L.
2017-09-01
Indoor localization has gained considerable attention over the past decade because of the emergence of numerous location-aware services. Research works have been proposed on solving this problem by using wireless networks. Nevertheless, there is still much room for improvement in the quality of the proposed classification models. In the last years, the emergence of Visible Light Communication (VLC) brings a brand new approach to high quality indoor positioning. Among its advantages, this new technology is immune to electromagnetic interference and has the advantage of having a smaller variance of received signal power compared to RF based technologies. In this paper, a performance analysis of seventeen machine leaning classifiers for indoor localization in VLC networks is carried out. The analysis is accomplished in terms of accuracy, average distance error, computational cost, training size, precision and recall measurements. Results show that most of classifiers harvest an accuracy above 90 %. The best tested classifier yielded a 99.0 % accuracy, with an average error distance of 0.3 centimetres.
Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G
2014-09-30
This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.
Genome-Wide Comparative Gene Family Classification
Frech, Christian; Chen, Nansheng
2010-01-01
Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221
Dai, Shengfa; Wei, Qingguo
2017-01-01
Common spatial pattern algorithm is widely used to estimate spatial filters in motor imagery based brain-computer interfaces. However, use of a large number of channels will make common spatial pattern tend to over-fitting and the classification of electroencephalographic signals time-consuming. To overcome these problems, it is necessary to choose an optimal subset of the whole channels to save computational time and improve the classification accuracy. In this paper, a novel method named backtracking search optimization algorithm is proposed to automatically select the optimal channel set for common spatial pattern. Each individual in the population is a N-dimensional vector, with each component representing one channel. A population of binary codes generate randomly in the beginning, and then channels are selected according to the evolution of these codes. The number and positions of 1's in the code denote the number and positions of chosen channels. The objective function of backtracking search optimization algorithm is defined as the combination of classification error rate and relative number of channels. Experimental results suggest that higher classification accuracy can be achieved with much fewer channels compared to standard common spatial pattern with whole channels.
Informal settlement classification using point-cloud and image-based features from UAV data
NASA Astrophysics Data System (ADS)
Gevaert, C. M.; Persello, C.; Sliuzas, R.; Vosselman, G.
2017-03-01
Unmanned Aerial Vehicles (UAVs) are capable of providing very high resolution and up-to-date information to support informal settlement upgrading projects. In order to provide accurate basemaps, urban scene understanding through the identification and classification of buildings and terrain is imperative. However, common characteristics of informal settlements such as small, irregular buildings with heterogeneous roof material and large presence of clutter challenge state-of-the-art algorithms. Furthermore, it is of interest to analyse which fundamental attributes are suitable for describing these objects in different geographic locations. This work investigates how 2D radiometric and textural features, 2.5D topographic features, and 3D geometric features obtained from UAV imagery can be integrated to obtain a high classification accuracy in challenging classification problems for the analysis of informal settlements. UAV datasets from informal settlements in two different countries are compared in order to identify salient features for specific objects in heterogeneous urban environments. Findings show that the integration of 2D and 3D features leads to an overall accuracy of 91.6% and 95.2% respectively for informal settlements in Kigali, Rwanda and Maldonado, Uruguay.
Clemans, Katherine H.; Musci, Rashelle J.; Leoutsakos, Jeannie-Marie S.; Ialongo, Nicholas S.
2014-01-01
Objective This study compared the ability of teacher, parent, and peer reports of aggressive behavior in early childhood to accurately classify cases of maladaptive outcomes in late adolescence and early adulthood. Method Weighted kappa analyses determined optimal cut points and relative classification accuracy among teacher, parent, and peer reports of aggression assessed for 691 students (54% male; 84% African American, 13% White) in the fall of first grade. Outcomes included antisocial personality, substance use, incarceration history, risky sexual behavior, and failure to graduate from high school on time. Results Peer reports were the most accurate classifier of all outcomes in the full sample. For most outcomes, the addition of teacher or parent reports did not improve overall classification accuracy once peer reports were accounted for. Additional gender-specific and adjusted kappa analyses supported the superior classification utility of the peer report measure. Conclusion The results suggest that peer reports provided the most useful classification information of the three aggression measures. Implications for targeted intervention efforts which use screening measures to identify at-risk children are discussed. PMID:24512126
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Sevel, Landrew S; Boissoneault, Jeff; Letzen, Janelle E; Robinson, Michael E; Staud, Roland
2018-05-30
Chronic fatigue syndrome (CFS) is a disorder associated with fatigue, pain, and structural/functional abnormalities seen during magnetic resonance brain imaging (MRI). Therefore, we evaluated the performance of structural MRI (sMRI) abnormalities in the classification of CFS patients versus healthy controls and compared it to machine learning (ML) classification based upon self-report (SR). Participants included 18 CFS patients and 15 healthy controls (HC). All subjects underwent T1-weighted sMRI and provided visual analogue-scale ratings of fatigue, pain intensity, anxiety, depression, anger, and sleep quality. sMRI data were segmented using FreeSurfer and 61 regions based on functional and structural abnormalities previously reported in patients with CFS. Classification was performed in RapidMiner using a linear support vector machine and bootstrap optimism correction. We compared ML classifiers based on (1) 61 a priori sMRI regional estimates and (2) SR ratings. The sMRI model achieved 79.58% classification accuracy. The SR (accuracy = 95.95%) outperformed both sMRI models. Estimates from multiple brain areas related to cognition, emotion, and memory contributed strongly to group classification. This is the first ML-based group classification of CFS. Our findings suggest that sMRI abnormalities are useful for discriminating CFS patients from HC, but SR ratings remain most effective in classification tasks.
Feature Selection Methods for Zero-Shot Learning of Neural Activity
Caceres, Carlos A.; Roos, Matthew J.; Rupp, Kyle M.; Milsap, Griffin; Crone, Nathan E.; Wolmetz, Michael E.; Ratto, Christopher R.
2017-01-01
Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy. PMID:28690513
Liu, Aiming; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi
2017-01-01
Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain–computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain–computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain–computer interface systems. PMID:29117100
Liu, Aiming; Chen, Kun; Liu, Quan; Ai, Qingsong; Xie, Yi; Chen, Anqi
2017-11-08
Motor Imagery (MI) electroencephalography (EEG) is widely studied for its non-invasiveness, easy availability, portability, and high temporal resolution. As for MI EEG signal processing, the high dimensions of features represent a research challenge. It is necessary to eliminate redundant features, which not only create an additional overhead of managing the space complexity, but also might include outliers, thereby reducing classification accuracy. The firefly algorithm (FA) can adaptively select the best subset of features, and improve classification accuracy. However, the FA is easily entrapped in a local optimum. To solve this problem, this paper proposes a method of combining the firefly algorithm and learning automata (LA) to optimize feature selection for motor imagery EEG. We employed a method of combining common spatial pattern (CSP) and local characteristic-scale decomposition (LCD) algorithms to obtain a high dimensional feature set, and classified it by using the spectral regression discriminant analysis (SRDA) classifier. Both the fourth brain-computer interface competition data and real-time data acquired in our designed experiments were used to verify the validation of the proposed method. Compared with genetic and adaptive weight particle swarm optimization algorithms, the experimental results show that our proposed method effectively eliminates redundant features, and improves the classification accuracy of MI EEG signals. In addition, a real-time brain-computer interface system was implemented to verify the feasibility of our proposed methods being applied in practical brain-computer interface systems.
NASA Astrophysics Data System (ADS)
Legara, Erika Fille; Monterola, Christopher; Abundo, Cheryl
2011-01-01
We demonstrate an accurate procedure based on linear discriminant analysis that allows automatic authorship classification of opinion column articles. First, we extract the following stylometric features of 157 column articles from four authors: statistics on high frequency words, number of words per sentence, and number of sentences per paragraph. Then, by systematically ranking these features based on an effect size criterion, we show that we can achieve an average classification accuracy of 93% for the test set. In comparison, frequency size based ranking has an average accuracy of 80%. The highest possible average classification accuracy of our data merely relying on chance is ∼31%. By carrying out sensitivity analysis, we show that the effect size criterion is superior than frequency ranking because there exist low frequency words that significantly contribute to successful author discrimination. Consistent results are seen when the procedure is applied in classifying the undisputed Federalist papers of Alexander Hamilton and James Madison. To the best of our knowledge, the work is the first attempt in classifying opinion column articles, that by virtue of being shorter in length (as compared to novels or short stories), are more prone to over-fitting issues. The near perfect classification for the longer papers supports this claim. Our results provide an important insight on authorship attribution that has been overlooked in previous studies: that ranking discriminant variables based on word frequency counts is not necessarily an optimal procedure.
A robust data scaling algorithm to improve classification accuracies in biomedical data.
Cao, Xi Hang; Stojkovic, Ivan; Obradovic, Zoran
2016-09-09
Machine learning models have been adapted in biomedical research and practice for knowledge discovery and decision support. While mainstream biomedical informatics research focuses on developing more accurate models, the importance of data preprocessing draws less attention. We propose the Generalized Logistic (GL) algorithm that scales data uniformly to an appropriate interval by learning a generalized logistic function to fit the empirical cumulative distribution function of the data. The GL algorithm is simple yet effective; it is intrinsically robust to outliers, so it is particularly suitable for diagnostic/classification models in clinical/medical applications where the number of samples is usually small; it scales the data in a nonlinear fashion, which leads to potential improvement in accuracy. To evaluate the effectiveness of the proposed algorithm, we conducted experiments on 16 binary classification tasks with different variable types and cover a wide range of applications. The resultant performance in terms of area under the receiver operation characteristic curve (AUROC) and percentage of correct classification showed that models learned using data scaled by the GL algorithm outperform the ones using data scaled by the Min-max and the Z-score algorithm, which are the most commonly used data scaling algorithms. The proposed GL algorithm is simple and effective. It is robust to outliers, so no additional denoising or outlier detection step is needed in data preprocessing. Empirical results also show models learned from data scaled by the GL algorithm have higher accuracy compared to the commonly used data scaling algorithms.
Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei
2015-02-01
We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and room mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst training time of FSVM much faster than FRVM. The results indicate that FRVM classifier trained using sufficient samples can achieve comparable generalization capability as FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.
A Dirichlet process model for classifying and forecasting epidemic curves.
Nsoesie, Elaine O; Leman, Scotland C; Marathe, Madhav V
2014-01-09
A forecast can be defined as an endeavor to quantitatively estimate a future event or probabilities assigned to a future occurrence. Forecasting stochastic processes such as epidemics is challenging since there are several biological, behavioral, and environmental factors that influence the number of cases observed at each point during an epidemic. However, accurate forecasts of epidemics would impact timely and effective implementation of public health interventions. In this study, we introduce a Dirichlet process (DP) model for classifying and forecasting influenza epidemic curves. The DP model is a nonparametric Bayesian approach that enables the matching of current influenza activity to simulated and historical patterns, identifies epidemic curves different from those observed in the past and enables prediction of the expected epidemic peak time. The method was validated using simulated influenza epidemics from an individual-based model and the accuracy was compared to that of the tree-based classification technique, Random Forest (RF), which has been shown to achieve high accuracy in the early prediction of epidemic curves using a classification approach. We also applied the method to forecasting influenza outbreaks in the United States from 1997-2013 using influenza-like illness (ILI) data from the Centers for Disease Control and Prevention (CDC). We made the following observations. First, the DP model performed as well as RF in identifying several of the simulated epidemics. Second, the DP model correctly forecasted the peak time several days in advance for most of the simulated epidemics. Third, the accuracy of identifying epidemics different from those already observed improved with additional data, as expected. Fourth, both methods correctly classified epidemics with higher reproduction numbers (R) with a higher accuracy compared to epidemics with lower R values. Lastly, in the classification of seasonal influenza epidemics based on ILI data from the CDC, the methods' performance was comparable. Although RF requires less computational time compared to the DP model, the algorithm is fully supervised implying that epidemic curves different from those previously observed will always be misclassified. In contrast, the DP model can be unsupervised, semi-supervised or fully supervised. Since both methods have their relative merits, an approach that uses both RF and the DP model could be beneficial.
Byun, Wonwoo; Lee, Jung-Min; Kim, Youngwon; Brusseau, Timothy A
2018-03-26
This study examined the accuracy of the Fitbit activity tracker (FF) for quantifying sedentary behavior (SB) and varying intensities of physical activity (PA) in 3-5-year-old children. Twenty-eight healthy preschool-aged children (Girls: 46%, Mean age: 4.8 ± 1.0 years) wore the FF and were directly observed while performing a set of various unstructured and structured free-living activities from sedentary to vigorous intensity. The classification accuracy of the FF for measuring SB, light PA (LPA), moderate-to-vigorous PA (MVPA), and total PA (TPA) was examined calculating Pearson correlation coefficients (r), mean absolute percent error (MAPE), Cohen's kappa ( k ), sensitivity (Se), specificity (Sp), and area under the receiver operating curve (ROC-AUC). The classification accuracies of the FF (ROC-AUC) were 0.92, 0.63, 0.77 and 0.92 for SB, LPA, MVPA and TPA, respectively. Similarly, values of kappa, Se, Sp and percentage of correct classification were consistently high for SB and TPA, but low for LPA and MVPA. The FF demonstrated excellent classification accuracy for assessing SB and TPA, but lower accuracy for classifying LPA and MVPA. Our findings suggest that the FF should be considered as a valid instrument for assessing time spent sedentary and overall physical activity in preschool-aged children.
Feature Selection and Effective Classifiers.
ERIC Educational Resources Information Center
Deogun, Jitender S.; Choubey, Suresh K.; Raghavan, Vijay V.; Sever, Hayri
1998-01-01
Develops and analyzes four algorithms for feature selection in the context of rough set methodology. Experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. When compared, results of upper classifiers perform better than lower…
Application of Hyperspectral Imaging to Detect Sclerotinia sclerotiorum on Oilseed Rape Stems
Kong, Wenwen; Zhang, Chu; Huang, Weihao
2018-01-01
Hyperspectral imaging covering the spectral range of 384–1034 nm combined with chemometric methods was used to detect Sclerotinia sclerotiorum (SS) on oilseed rape stems by two sample sets (60 healthy and 60 infected stems for each set). Second derivative spectra and PCA loadings were used to select the optimal wavelengths. Discriminant models were built and compared to detect SS on oilseed rape stems, including partial least squares-discriminant analysis, radial basis function neural network, support vector machine and extreme learning machine. The discriminant models using full spectra and optimal wavelengths showed good performance with classification accuracies of over 80% for the calibration and prediction set. Comparing all developed models, the optimal classification accuracies of the calibration and prediction set were over 90%. The similarity of selected optimal wavelengths also indicated the feasibility of using hyperspectral imaging to detect SS on oilseed rape stems. The results indicated that hyperspectral imaging could be used as a fast, non-destructive and reliable technique to detect plant diseases on stems. PMID:29300315
NASA Astrophysics Data System (ADS)
Niazmardi, S.; Safari, A.; Homayouni, S.
2017-09-01
Crop mapping through classification of Satellite Image Time-Series (SITS) data can provide very valuable information for several agricultural applications, such as crop monitoring, yield estimation, and crop inventory. However, the SITS data classification is not straightforward. Because different images of a SITS data have different levels of information regarding the classification problems. Moreover, the SITS data is a four-dimensional data that cannot be classified using the conventional classification algorithms. To address these issues in this paper, we presented a classification strategy based on Multiple Kernel Learning (MKL) algorithms for SITS data classification. In this strategy, initially different kernels are constructed from different images of the SITS data and then they are combined into a composite kernel using the MKL algorithms. The composite kernel, once constructed, can be used for the classification of the data using the kernel-based classification algorithms. We compared the computational time and the classification performances of the proposed classification strategy using different MKL algorithms for the purpose of crop mapping. The considered MKL algorithms are: MKL-Sum, SimpleMKL, LPMKL and Group-Lasso MKL algorithms. The experimental tests of the proposed strategy on two SITS data sets, acquired by SPOT satellite sensors, showed that this strategy was able to provide better performances when compared to the standard classification algorithm. The results also showed that the optimization method of the used MKL algorithms affects both the computational time and classification accuracy of this strategy.
Ground Truth Sampling and LANDSAT Accuracy Assessment
NASA Technical Reports Server (NTRS)
Robinson, J. W.; Gunther, F. J.; Campbell, W. J.
1982-01-01
It is noted that the key factor in any accuracy assessment of remote sensing data is the method used for determining the ground truth, independent of the remote sensing data itself. The sampling and accuracy procedures developed for nuclear power plant siting study are described. The purpose of the sampling procedure was to provide data for developing supervised classifications for two study sites and for assessing the accuracy of that and the other procedures used. The purpose of the accuracy assessment was to allow the comparison of the cost and accuracy of various classification procedures as applied to various data types.
Multi-Temporal Classification and Change Detection Using Uav Images
NASA Astrophysics Data System (ADS)
Makuti, S.; Nex, F.; Yang, M. Y.
2018-05-01
In this paper different methodologies for the classification and change detection of UAV image blocks are explored. UAV is not only the cheapest platform for image acquisition but it is also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state of the art features have been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was analysed using the overall accuracy, where by post classification have an accuracy of up to 62.6 % and the pre classification change detection have an accuracy of 46.5 %. These results represent a first useful indication for future works and developments.
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Toward an Attention-Based Diagnostic Tool for Patients With Locked-in Syndrome.
Lesenfants, Damien; Habbal, Dina; Chatelle, Camille; Soddu, Andrea; Laureys, Steven; Noirhomme, Quentin
2018-03-01
Electroencephalography (EEG) has been proposed as a supplemental tool for reducing clinical misdiagnosis in severely brain-injured populations helping to distinguish conscious from unconscious patients. We studied the use of spectral entropy as a measure of focal attention in order to develop a motor-independent, portable, and objective diagnostic tool for patients with locked-in syndrome (LIS), answering the issues of accuracy and training requirement. Data from 20 healthy volunteers, 6 LIS patients, and 10 patients with a vegetative state/unresponsive wakefulness syndrome (VS/UWS) were included. Spectral entropy was computed during a gaze-independent 2-class (attention vs rest) paradigm, and compared with EEG rhythms (delta, theta, alpha, and beta) classification. Spectral entropy classification during the attention-rest paradigm showed 93% and 91% accuracy in healthy volunteers and LIS patients respectively. VS/UWS patients were at chance level. EEG rhythms classification reached a lower accuracy than spectral entropy. Resting-state EEG spectral entropy could not distinguish individual VS/UWS patients from LIS patients. The present study provides evidence that an EEG-based measure of attention could detect command-following in patients with severe motor disabilities. The entropy system could detect a response to command in all healthy subjects and LIS patients, while none of the VS/UWS patients showed a response to command using this system.
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa
2018-07-01
Automatic text classification techniques are useful for classifying plaintext medical documents. This study aims to automatically predict the cause of death from free text forensic autopsy reports by comparing various schemes for feature extraction, term weighing or feature value representation, text classification, and feature reduction. For experiments, the autopsy reports belonging to eight different causes of death were collected, preprocessed and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. The six different text classification techniques were applied on these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures i.e. overall accuracy, macro precision, macro-F-measure, and macro recall. From experiments, it was found that that unigram features obtained the highest performance compared to bigram, trigram, and hybrid-gram features. Furthermore, in feature representation schemes, term frequency, and term frequency with inverse document frequency obtained similar and better results when compared with binary frequency, and normalized term frequency with inverse document frequency. Furthermore, the chi-square feature reduction approach outperformed Pearson correlation, and information gain approaches. Finally, in text classification algorithms, support vector machine classifier outperforms random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifier. Our results and comparisons hold practical importance and serve as references for future works. Moreover, the comparison outputs will act as state-of-art techniques to compare future proposals with existing automated text classification techniques. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Performance of Activity Classification Algorithms in Free-living Older Adults
Sasaki, Jeffer Eidi; Hickey, Amanda; Staudenmayer, John; John, Dinesh; Kent, Jane A.; Freedson, Patty S.
2015-01-01
Purpose To compare activity type classification rates of machine learning algorithms trained on laboratory versus free-living accelerometer data in older adults. Methods Thirty-five older adults (21F and 14M ; 70.8 ± 4.9 y) performed selected activities in the laboratory while wearing three ActiGraph GT3X+ activity monitors (dominant hip, wrist, and ankle). Monitors were initialized to collect raw acceleration data at a sampling rate of 80 Hz. Fifteen of the participants also wore the GT3X+ in free-living settings and were directly observed for 2-3 hours. Time- and frequency- domain features from acceleration signals of each monitor were used to train Random Forest (RF) and Support Vector Machine (SVM) models to classify five activity types: sedentary, standing, household, locomotion, and recreational activities. All algorithms were trained on lab data (RFLab and SVMLab) and free-living data (RFFL and SVMFL) using 20 s signal sampling windows. Classification accuracy rates of both types of algorithms were tested on free-living data using a leave-one-out technique. Results Overall classification accuracy rates for the algorithms developed from lab data were between 49% (wrist) to 55% (ankle) for the SVMLab algorithms, and 49% (wrist) to 54% (ankle) for RFLab algorithms. The classification accuracy rates for SVMFL and RFFL algorithms ranged from 58% (wrist) to 69% (ankle) and from 61% (wrist) to 67% (ankle), respectively. Conclusion Our algorithms developed on free-living accelerometer data were more accurate in classifying activity type in free-living older adults than our algorithms developed on laboratory accelerometer data. Future studies should consider using free-living accelerometer data to train machine-learning algorithms in older adults. PMID:26673129