NASA Astrophysics Data System (ADS)
Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude
2010-02-01
Diabetic retinopathy (DR) is a chronic eye disease for which early detection is highly essential to avoid any fatal results. Image processing of retinal images emerge as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification which is a significant technique to detect the abnormality in the eye. Various automated classification systems have been developed in the recent years but most of them lack high classification accuracy. Artificial neural networks are the widely preferred artificial intelligence technique since it yields superior results in terms of classification accuracy. In this work, Radial Basis function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR Images and normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed with the results of the probabilistic classifier namely Bayesian classifier to show the superior nature of neural classifier. Experimental results show promising results for the neural classifier in terms of the performance measures.
An alternative respiratory sounds classification system utilizing artificial neural networks.
Oweis, Rami J; Abdulhay, Enas W; Khayal, Amer; Awad, Areen
2015-01-01
Computerized lung sound analysis involves recording lung sound via an electronic device, followed by computer analysis and classification based on specific signal characteristics as non-linearity and nonstationarity caused by air turbulence. An automatic analysis is necessary to avoid dependence on expert skills. This work revolves around exploiting autocorrelation in the feature extraction stage. All process stages were implemented in MATLAB. The classification process was performed comparatively using both artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFIS) toolboxes. The methods have been applied to 10 different respiratory sounds for classification. The ANN was superior to the ANFIS system and returned superior performance parameters. Its accuracy, specificity, and sensitivity were 98.6%, 100%, and 97.8%, respectively. The obtained parameters showed superiority to many recent approaches. The promising proposed method is an efficient fast tool for the intended purpose as manifested in the performance parameters, specifically, accuracy, specificity, and sensitivity. Furthermore, it may be added that utilizing the autocorrelation function in the feature extraction in such applications results in enhanced performance and avoids undesired computation complexities compared to other techniques.
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on Sleep-EDF dataset with various classification settings. The overall accuracy for Awake/Sleep and 4-class classification setting are 98.32% and 94.49% respectively. Furthermore, the superior accuracy is achieved by performing classification on a low-dimensional feature space derived from time and frequency domains and without the need for artifact removal as a preprocessing step.
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Human Epithelial-2 (HEp-2) cell images staining patterns classification have been widely used to identify autoimmune diseases by the anti-Nuclear antibodies (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because manual test is time consuming, subjective and labor intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are developing. However, methods proposed recently are mostly manual features extraction with low accuracy. Besides, the scale of available benchmark datasets is small, which does not exactly suitable for using deep learning methods. This issue will influence the accuracy of cell classification directly even after data augmentation. To address these issues, this paper presents a high accuracy automatic HEp-2 cell classification method with small datasets, by utilizing very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results over two benchmark datasets demonstrate that the proposed method achieves superior performance in terms of accuracy compared with existing methods.
An accuracy assessment of forest disturbance mapping in the western Great Lakes
P.L. Zimmerman; I.W. Housman; C.H. Perry; R.A. Chastain; J.B. Webb; M.V. Finco
2013-01-01
The increasing availability of satellite imagery has spurred the production of thematic land cover maps based on satellite data. These maps are more valuable to the scientific community and land managers when the accuracy of their classifications has been assessed. Here, we assessed the accuracy of a map of forest disturbance in the watersheds of Lake Superior and Lake...
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction through remote sensing data is important for policy and decision makers as extracted information provide base layers for many application of real world. Classification of remotely sensed data is the one of the most common methods of extracting information however it is still a challenging issue because several factors are affecting the accuracy of the classification. Resolution of the imagery, number and homogeneity of land cover classes, purity of training data and characteristic of adopted classifiers are just some of these challenging factors. Object based image classification has some superiority than pixel based classification for high resolution images since it uses geometry and structure information besides spectral information. Vegetation indices are also commonly used for the classification process since it provides additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increase the classification accuracy from 79,96% to 86,80% as overall accuracy, however NDVI decrease the classification accuracy from 79,96% to 78,90%. Moreover it is proven than object based classification with RapidEye data give promising results for crop type mapping and analysis.
Tahmasian, Masoud; Jamalabadi, Hamidreza; Abedini, Mina; Ghadami, Mohammad R; Sepehry, Amir A; Knight, David C; Khazaie, Habibolah
2017-05-22
Sleep disturbance is common in chronic post-traumatic stress disorder (PTSD). However, prior work has demonstrated that there are inconsistencies between subjective and objective assessments of sleep disturbance in PTSD. Therefore, we investigated whether subjective or objective sleep assessment has greater clinical utility to differentiate PTSD patients from healthy subjects. Further, we evaluated whether the combination of subjective and objective methods improves the accuracy of classification into patient versus healthy groups, which has important diagnostic implications. We recruited 32 chronic war-induced PTSD patients and 32 age- and gender-matched healthy subjects to participate in this study. Subjective (i.e. from three self-reported sleep questionnaires) and objective sleep-related data (i.e. from actigraphy scores) were collected from each participant. Subjective, objective, and combined (subjective and objective) sleep data were then analyzed using support vector machine classification. The classification accuracy, sensitivity, and specificity for subjective variables were 89.2%, 89.3%, and 89%, respectively. The classification accuracy, sensitivity, and specificity for objective variables were 65%, 62.3%, and 67.8%, respectively. The classification accuracy, sensitivity, and specificity for the aggregate variables (combination of subjective and objective variables) were 91.6%, 93.0%, and 90.3%, respectively. Our findings indicate that classification accuracy using subjective measurements is superior to objective measurements and the combination of both assessments appears to improve the classification accuracy for differentiating PTSD patients from healthy individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Mexican Hat Wavelet Kernel ELM for Multiclass Classification.
Wang, Jie; Song, Yi-Fan; Ma, Tian-Lei
2017-01-01
Kernel extreme learning machine (KELM) is a novel feedforward neural network, which is widely used in classification problems. To some extent, it solves the existing problems of the invalid nodes and the large computational complexity in ELM. However, the traditional KELM classifier usually has a low test accuracy when it faces multiclass classification problems. In order to solve the above problem, a new classifier, Mexican Hat wavelet KELM classifier, is proposed in this paper. The proposed classifier successfully improves the training accuracy and reduces the training time in the multiclass classification problems. Moreover, the validity of the Mexican Hat wavelet as a kernel function of ELM is rigorously proved. Experimental results on different data sets show that the performance of the proposed classifier is significantly superior to the compared classifiers.
Multiple confidence estimates as indices of eyewitness memory.
Sauer, James D; Brewer, Neil; Weber, Nathan
2008-08-01
Eyewitness identification decisions are vulnerable to various influences on witnesses' decision criteria that contribute to false identifications of innocent suspects and failures to choose perpetrators. An alternative procedure using confidence estimates to assess the degree of match between novel and previously viewed faces was investigated. Classification algorithms were applied to participants' confidence data to determine when a confidence value or pattern of confidence values indicated a positive response. Experiment 1 compared confidence group classification accuracy with a binary decision control group's accuracy on a standard old-new face recognition task and found superior accuracy for the confidence group for target-absent trials but not for target-present trials. Experiment 2 used a face mini-lineup task and found reduced target-present accuracy offset by large gains in target-absent accuracy. Using a standard lineup paradigm, Experiments 3 and 4 also found improved classification accuracy for target-absent lineups and, with a more sophisticated algorithm, for target-present lineups. This demonstrates the accessibility of evidence for recognition memory decisions and points to a more sensitive index of memory quality than is afforded by binary decisions.
Fuzzy membership functions for analysis of high-resolution CT images of diffuse pulmonary diseases.
Almeida, Eliana; Rangayyan, Rangaraj M; Azevedo-Marques, Paulo M
2015-08-01
We propose the use of fuzzy membership functions to analyze images of diffuse pulmonary diseases (DPDs) based on fractal and texture features. The features were extracted from preprocessed regions of interest (ROIs) selected from high-resolution computed tomography images. The ROIs represent five different patterns of DPDs and normal lung tissue. A Gaussian mixture model (GMM) was constructed for each feature, with six Gaussians modeling the six patterns. Feature selection was performed and the GMMs of the five significant features were used. From the GMMs, fuzzy membership functions were obtained by a probability-possibility transformation and further statistical analysis was performed. An average classification accuracy of 63.5% was obtained for the six classes. For four of the six classes, the classification accuracy was superior to 65%, and the best classification accuracy was 75.5% for one class. The use of fuzzy membership functions to assist in pattern classification is an alternative to deterministic approaches to explore strategies for medical diagnosis.
Pathological brain detection based on wavelet entropy and Hu moment invariants.
Zhang, Yudong; Wang, Shuihua; Sun, Ping; Phillips, Preetha
2015-01-01
With the aim of developing an accurate pathological brain detection system, we proposed a novel automatic computer-aided diagnosis (CAD) to detect pathological brains from normal brains obtained by magnetic resonance imaging (MRI) scanning. The problem still remained a challenge for technicians and clinicians, since MR imaging generated an exceptionally large information dataset. A new two-step approach was proposed in this study. We used wavelet entropy (WE) and Hu moment invariants (HMI) for feature extraction, and the generalized eigenvalue proximal support vector machine (GEPSVM) for classification. To further enhance classification accuracy, the popular radial basis function (RBF) kernel was employed. The 10 runs of k-fold stratified cross validation result showed that the proposed "WE + HMI + GEPSVM + RBF" method was superior to existing methods w.r.t. classification accuracy. It obtained the average classification accuracies of 100%, 100%, and 99.45% over Dataset-66, Dataset-160, and Dataset-255, respectively. The proposed method is effective and can be applied to realistic use.
NASA Astrophysics Data System (ADS)
Tamimi, E.; Ebadi, H.; Kiani, A.
2017-09-01
Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using other features will lead to improve accuracy. By adding these features, the presence probability of dependent features will be increased, which leads to accuracy reduction. In addition, some parameters should be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. Optimization algorithm is an efficient method to solve this problem. On the other hand, pixel-based classification faces several challenges such as producing salt-paper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are relatively high automation level, independency of image scene and type, post processing reduction for building edge reconstruction and accuracy improvement. The proposed method was evaluated by pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, in the proposed method, Kappa coefficient was improved by 6% rather than RF classification. Time processing of the proposed method was relatively low because of unit of image analysis (image object). These showed the superiority of the proposed method in terms of time and accuracy.
Identifying Wrist Fracture Patients with High Accuracy by Automatic Categorization of X-ray Reports
de Bruijn, Berry; Cranney, Ann; O’Donnell, Siobhan; Martin, Joel D.; Forster, Alan J.
2006-01-01
The authors performed this study to determine the accuracy of several text classification methods to categorize wrist x-ray reports. We randomly sampled 751 textual wrist x-ray reports. Two expert reviewers rated the presence (n = 301) or absence (n = 450) of an acute fracture of wrist. We developed two information retrieval (IR) text classification methods and a machine learning method using a support vector machine (TC-1). In cross-validation on the derivation set (n = 493), TC-1 outperformed the two IR based methods and six benchmark classifiers, including Naive Bayes and a Neural Network. In the validation set (n = 258), TC-1 demonstrated consistent performance with 93.8% accuracy; 95.5% sensitivity; 92.9% specificity; and 87.5% positive predictive value. TC-1 was easy to implement and superior in performance to the other classification methods. PMID:16929046
Classification of urban features using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Ganesh Babu, Bharath
Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) - were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.
Subasi, Abdulhamit
2013-06-01
Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on the basis of EMG signal to classify into normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into the frequency sub-bands using discrete wavelet transform (DWT) and a set of statistical features were extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results obviously validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.
Fusion and Gaussian mixture based classifiers for SONAR data
NASA Astrophysics Data System (ADS)
Kotari, Vikas; Chang, KC
2011-06-01
Underwater mines are inexpensive and highly effective weapons. They are difficult to detect and classify. Hence detection and classification of underwater mines is essential for the safety of naval vessels. This necessitates a formulation of highly efficient classifiers and detection techniques. Current techniques primarily focus on signals from one source. Data fusion is known to increase the accuracy of detection and classification. In this paper, we formulated a fusion-based classifier and a Gaussian mixture model (GMM) based classifier for classification of underwater mines. The emphasis has been on sound navigation and ranging (SONAR) signals due to their extensive use in current naval operations. The classifiers have been tested on real SONAR data obtained from University of California Irvine (UCI) repository. The performance of both GMM based classifier and fusion based classifier clearly demonstrate their superior classification accuracy over conventional single source cases and validate our approach.
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan; Liu, Jingjing
2018-01-18
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33-100%, and ELM, with an accuracy rate of 98.01-100%. For level assessment, the R² related to the training set was above 0.97 and the R² related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016-0.3494, lower than the error of 0.5-1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.
Couple Graph Based Label Propagation Method for Hyperspectral Remote Sensing Data Classification
NASA Astrophysics Data System (ADS)
Wang, X. P.; Hu, Y.; Chen, J.
2018-04-01
Graph based semi-supervised classification method are widely used for hyperspectral image classification. We present a couple graph based label propagation method, which contains both the adjacency graph and the similar graph. We propose to construct the similar graph by using the similar probability, which utilize the label similarity among examples probably. The adjacency graph was utilized by a common manifold learning method, which has effective improve the classification accuracy of hyperspectral data. The experiments indicate that the couple graph Laplacian which unite both the adjacency graph and the similar graph, produce superior classification results than other manifold Learning based graph Laplacian and Sparse representation based graph Laplacian in label propagation framework.
Wen, Tingxi; Zhang, Zhongnan
2017-01-01
Abstract In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789
Wen, Tingxi; Zhang, Zhongnan
2017-05-01
In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.
Online clustering algorithms for radar emitter classification.
Liu, Jun; Lee, Jim P Y; Senior; Li, Lingjie; Luo, Zhi-Quan; Wong, K Max
2005-08-01
Radar emitter classification is a special application of data clustering for classifying unknown radar emitters from received radar pulse samples. The main challenges of this task are the high dimensionality of radar pulse samples, small sample group size, and closely located radar pulse clusters. In this paper, two new online clustering algorithms are developed for radar emitter classification: One is model-based using the Minimum Description Length (MDL) criterion and the other is based on competitive learning. Computational complexity is analyzed for each algorithm and then compared. Simulation results show the superior performance of the model-based algorithm over competitive learning in terms of better classification accuracy, flexibility, and stability.
Farrell, Todd R.; Weir, Richard F. ff.
2011-01-01
The use of surface versus intramuscular electrodes as well as the effect of electrode targeting on pattern-recognition-based multifunctional prosthesis control was explored. Surface electrodes are touted for their ability to record activity from relatively large portions of muscle tissue. Intramuscular electromyograms (EMGs) can provide focal recordings from deep muscles of the forearm and independent signals relatively free of crosstalk. However, little work has been done to compare the two. Additionally, while previous investigations have either targeted electrodes to specific muscles or used untargeted (symmetric) electrode arrays, no work has compared these approaches to determine if one is superior. The classification accuracies of pattern-recognition-based classifiers utilizing surface and intramuscular as well as targeted and untargeted electrodes were compared across 11 subjects. A repeated-measures analysis of variance revealed that when only EMG amplitude information was used from all available EMG channels, the targeted surface, targeted intramuscular, and untargeted surface electrodes produced similar classification accuracies while the untargeted intramuscular electrodes produced significantly lower accuracies. However, no statistical differences were observed between any of the electrode conditions when additional features were extracted from the EMG signal. It was concluded that the choice of electrode should be driven by clinical factors, such as signal robustness/stability, cost, etc., instead of by classification accuracy. PMID:18713689
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan
2018-01-01
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level. PMID:29346328
Selection-Fusion Approach for Classification of Datasets with Missing Values
Ghannad-Rezaie, Mostafa; Soltanian-Zadeh, Hamid; Ying, Hao; Dong, Ming
2010-01-01
This paper proposes a new approach based on missing value pattern discovery for classifying incomplete data. This approach is particularly designed for classification of datasets with a small number of samples and a high percentage of missing values where available missing value treatment approaches do not usually work well. Based on the pattern of the missing values, the proposed approach finds subsets of samples for which most of the features are available and trains a classifier for each subset. Then, it combines the outputs of the classifiers. Subset selection is translated into a clustering problem, allowing derivation of a mathematical framework for it. A trade off is established between the computational complexity (number of subsets) and the accuracy of the overall classifier. To deal with this trade off, a numerical criterion is proposed for the prediction of the overall performance. The proposed method is applied to seven datasets from the popular University of California, Irvine data mining archive and an epilepsy dataset from Henry Ford Hospital, Detroit, Michigan (total of eight datasets). Experimental results show that classification accuracy of the proposed method is superior to those of the widely used multiple imputations method and four other methods. They also show that the level of superiority depends on the pattern and percentage of missing values. PMID:20212921
Lin, Kuan-Cheng; Hsieh, Yi-Hsiu
2015-10-01
The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a problem of feature subset selection, such as combination optimization problems. Evolutionary algorithms using random search methods have proven highly effective in obtaining solutions to problems of optimization in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms in conjunction with a support vector machine (SVM) for the selection of optimal feature subsets for the classification of datasets. The results of experiments using specific UCI medical datasets demonstrate that the accuracy of the proposed hybrid evolutionary algorithm is superior to that of basic PSO, EPSO and ABC algorithms, with regard to classification accuracy using subsets with a reduced number of features.
A new blood vessel extraction technique using edge enhancement and object classification.
Badsha, Shahriar; Reza, Ahmed Wasif; Tan, Kim Geok; Dimyati, Kaharudin
2013-12-01
Diabetic retinopathy (DR) is increasing progressively pushing the demand of automatic extraction and classification of severity of diseases. Blood vessel extraction from the fundus image is a vital and challenging task. Therefore, this paper presents a new, computationally simple, and automatic method to extract the retinal blood vessel. The proposed method comprises several basic image processing techniques, namely edge enhancement by standard template, noise removal, thresholding, morphological operation, and object classification. The proposed method has been tested on a set of retinal images. The retinal images were collected from the DRIVE database and we have employed robust performance analysis to evaluate the accuracy. The results obtained from this study reveal that the proposed method offers an average accuracy of about 97 %, sensitivity of 99 %, specificity of 86 %, and predictive value of 98 %, which is superior to various well-known techniques.
Shermeyer, Jacob S.; Haack, Barry N.
2015-01-01
Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.
Al Ajmi, Eiman; Forghani, Behzad; Reinhold, Caroline; Bayat, Maryam; Forghani, Reza
2018-06-01
There is a rich amount of quantitative information in spectral datasets generated from dual-energy CT (DECT). In this study, we compare the performance of texture analysis performed on multi-energy datasets to that of virtual monochromatic images (VMIs) at 65 keV only, using classification of the two most common benign parotid neoplasms as a testing paradigm. Forty-two patients with pathologically proven Warthin tumour (n = 25) or pleomorphic adenoma (n = 17) were evaluated. Texture analysis was performed on VMIs ranging from 40 to 140 keV in 5-keV increments (multi-energy analysis) or 65-keV VMIs only, which is typically considered equivalent to single-energy CT. Random forest (RF) models were constructed for outcome prediction using separate randomly selected training and testing sets or the entire patient set. Using multi-energy texture analysis, tumour classification in the independent testing set had accuracy, sensitivity, specificity, positive predictive value, and negative predictive value of 92%, 86%, 100%, 100%, and 83%, compared to 75%, 57%, 100%, 100%, and 63%, respectively, for single-energy analysis. Multi-energy texture analysis demonstrates superior performance compared to single-energy texture analysis of VMIs at 65 keV for classification of benign parotid tumours. • We present and validate a paradigm for texture analysis of DECT scans. • Multi-energy dataset texture analysis is superior to single-energy dataset texture analysis. • DECT texture analysis has high accura\\cy for diagnosis of benign parotid tumours. • DECT texture analysis with machine learning can enhance non-invasive diagnostic tumour evaluation.
Clemans, Katherine H; Musci, Rashelle J; Leoutsakos, Jeannie-Marie S; Ialongo, Nicholas S
2014-04-01
This study compared the ability of teacher, parent, and peer reports of aggressive behavior in early childhood to accurately classify cases of maladaptive outcomes in late adolescence and early adulthood. Weighted kappa analyses determined optimal cut points and relative classification accuracy among teacher, parent, and peer reports of aggression assessed for 691 students (54% male; 84% African American and 13% White) in the fall of first grade. Outcomes included antisocial personality, substance use, incarceration history, risky sexual behavior, and failure to graduate from high school on time. Peer reports were the most accurate classifier of all outcomes in the full sample. For most outcomes, the addition of teacher or parent reports did not improve overall classification accuracy once peer reports were accounted for. Additional gender-specific and adjusted kappa analyses supported the superior classification utility of the peer report measure. The results suggest that peer reports provided the most useful classification information of the 3 aggression measures. Implications for targeted intervention efforts in which screening measures are used to identify at-risk children are discussed.
Thyroid Nodule Classification in Ultrasound Images by Fine-Tuning Deep Convolutional Neural Network.
Chi, Jianning; Walia, Ekta; Babyn, Paul; Wang, Jimmy; Groot, Gary; Eramian, Mark
2017-08-01
With many thyroid nodules being incidentally detected, it is important to identify as many malignant nodules as possible while excluding those that are highly likely to be benign from fine needle aspiration (FNA) biopsies or surgeries. This paper presents a computer-aided diagnosis (CAD) system for classifying thyroid nodules in ultrasound images. We use deep learning approach to extract features from thyroid ultrasound images. Ultrasound images are pre-processed to calibrate their scale and remove the artifacts. A pre-trained GoogLeNet model is then fine-tuned using the pre-processed image samples which leads to superior feature extraction. The extracted features of the thyroid ultrasound images are sent to a Cost-sensitive Random Forest classifier to classify the images into "malignant" and "benign" cases. The experimental results show the proposed fine-tuned GoogLeNet model achieves excellent classification performance, attaining 98.29% classification accuracy, 99.10% sensitivity and 93.90% specificity for the images in an open access database (Pedraza et al. 16), while 96.34% classification accuracy, 86% sensitivity and 99% specificity for the images in our local health region database.
NASA Astrophysics Data System (ADS)
Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk
2016-07-01
Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to obtain a lower accuracy due to the broken objects at fine scales. In contrast to some previous studies, the RF classifiers yielded the best results and the k-nearest neighbor classifier were the worst results, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of training sample analyses indicated that the RF and adaboost. M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.
Findeisen, Peter; Peccerella, Teresa; Post, Stefan; Wenz, Frederik; Neumaier, Michael
2008-04-01
Serum is a difficult matrix for the identification of biomarkers by mass spectrometry (MS). This is due to high-abundance proteins and their complex processing by a multitude of endogenous proteases making rigorous standardisation difficult. Here, we have investigated the use of defined exogenous reporter peptides as substrates for disease-specific proteases with respect to improved standardisation and disease classification accuracy. A recombinant N-terminal fragment of the Adenomatous Polyposis Coli (APC) protein was digested with trypsin to yield a peptide mixture for subsequent Reporter Peptide Spiking (RPS) of serum. Different preanalytical handling of serum samples was simulated by storage of serum samples for up to 6 h at ambient temperature, followed by RPS, further incubation under standardised conditions and testing for stability of protease-generated MS profiles. To demonstrate the superior classification accuracy achieved by RPS, a pilot profiling experiment was performed using serum specimens from pancreatic cancer patients (n = 50) and healthy controls (n = 50). After RPS six different peak categories could be defined, two of which (categories C and D) are modulated by endogenous proteases. These latter are relevant for improved classification accuracy as shown by enhanced disease-specific classification from 78% to 87% in unspiked and spiked samples, respectively. Peaks of these categories presented with unchanged signal intensities regardless of preanalytical conditions. The use of RPS generally improved the signal intensities of protease-generated peptide peaks. RPS circumvents preanalytical variabilities and improves classification accuracies. Our approach will be helpful to introduce MS-based proteomic profiling into routine laboratory testing.
Mapping Winter Wheat with Multi-Temporal SAR and Optical Images in an Urban Agricultural Region
Zhou, Tao; Pan, Jianjun; Zhang, Peiyu; Wei, Shanbao; Han, Tao
2017-01-01
Winter wheat is the second largest food crop in China. It is important to obtain reliable winter wheat acreage to guarantee the food security for the most populous country in the world. This paper focuses on assessing the feasibility of in-season winter wheat mapping and investigating potential classification improvement by using SAR (Synthetic Aperture Radar) images, optical images, and the integration of both types of data in urban agricultural regions with complex planting structures in Southern China. Both SAR (Sentinel-1A) and optical (Landsat-8) data were acquired, and classification using different combinations of Sentinel-1A-derived information and optical images was performed using a support vector machine (SVM) and a random forest (RF) method. The interference coherence and texture images were obtained and used to assess the effect of adding them to the backscatter intensity images on the classification accuracy. The results showed that the use of four Sentinel-1A images acquired before the jointing period of winter wheat can provide satisfactory winter wheat classification accuracy, with an F1 measure of 87.89%. The combination of SAR and optical images for winter wheat mapping achieved the best F1 measure–up to 98.06%. The SVM was superior to RF in terms of the overall accuracy and the kappa coefficient, and was faster than RF, while the RF classifier was slightly better than SVM in terms of the F1 measure. In addition, the classification accuracy can be effectively improved by adding the texture and coherence images to the backscatter intensity data. PMID:28587066
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images are one of the most difficult studies in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied on dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-01-01
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-09-24
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure.
Xu, Fangzhou; Zhou, Weidong; Zhen, Yilin; Yuan, Qi; Wu, Qi
2016-09-01
The feature extraction and classification of brain signal is very significant in brain-computer interface (BCI). In this study, we describe an algorithm for motor imagery (MI) classification of electrocorticogram (ECoG)-based BCI. The proposed approach employs multi-resolution fractal measures and local binary pattern (LBP) operators to form a combined feature for characterizing an ECoG epoch recording from the right hemisphere of the brain. A classifier is trained by using the gradient boosting in conjunction with ordinary least squares (OLS) method. The fractal intercept, lacunarity and LBP features are extracted to classify imagined movements of either the left small finger or the tongue. Experimental results on dataset I of BCI competition III demonstrate the superior performance of our method. The cross-validation accuracy and accuracy is 90.6% and 95%, respectively. Furthermore, the low computational burden of this method makes it a promising candidate for real-time BCI systems.
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex. PMID:27500640
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex.
NASA Astrophysics Data System (ADS)
Shupe, Scott Marshall
2000-10-01
Vegetation mapping in and regions facilitates ecological studies, land management, and provides a record to which future land changes can be compared. Accurate and representative mapping of desert vegetation requires a sound field sampling program and a methodology to transform the data collected into a representative classification system. Time and cost constraints require that a remote sensing approach be used if such a classification system is to be applied on a regional scale. However, desert vegetation may be sparse and thus difficult to sense at typical satellite resolutions, especially given the problem of soil reflectance. This study was designed to address these concerns by conducting vegetation mapping research using field and satellite data from the US Army Yuma Proving Ground (USYPG) in Southwest Arizona. Line and belt transect data from the Army's Land Condition Trend Analysis (LCTA) Program were transformed into relative cover and relative density classification schemes using cluster analysis. Ordination analysis of the same data produced two and three-dimensional graphs on which the homogeneity of each vegetation class could be examined. It was found that the use of correspondence analysis (CA), detrended correspondence analysis (DCA), and non-metric multidimensional scaling (NMS) ordination methods was superior to the use of any single ordination method for helping to clarify between-class and within-class relationships in vegetation composition. Analysis of these between-class and within-class relationships were of key importance in examining how well relative cover and relative density schemes characterize the USYPG vegetation. Using these two classification schemes as reference data, maximum likelihood and artificial neural net classifications were then performed on a coregistered dataset consisting of a summer Landsat Thematic Mapper (TM) image, one spring and one summer ERS-1 microwave image, and elevation, slope, and aspect layers. Classifications using a combination of ERS-1 imagery and elevation, slope, and aspect data were superior to classifications carried out using Landsat TM data alone. In all classification iterations it was consistently found that the highest classification accuracy was obtained by using a combination of Landsat TM, ERS-1, and elevation, slope, and aspect data. Maximum likelihood classification accuracy was found to be higher than artificial neural net classification in all cases.
Multispectral scanner system parameter study and analysis software system description, volume 2
NASA Technical Reports Server (NTRS)
Landgrebe, D. A. (Principal Investigator); Mobasseri, B. G.; Wiersma, D. J.; Wiswell, E. R.; Mcgillem, C. D.; Anuta, P. E.
1978-01-01
The author has identified the following significant results. The integration of the available methods provided the analyst with the unified scanner analysis package (USAP), the flexibility and versatility of which was superior to many previous integrated techniques. The USAP consisted of three main subsystems; (1) a spatial path, (2) a spectral path, and (3) a set of analytic classification accuracy estimators which evaluated the system performance. The spatial path consisted of satellite and/or aircraft data, data correlation analyzer, scanner IFOV, and random noise model. The output of the spatial path was fed into the analytic classification and accuracy predictor. The spectral path consisted of laboratory and/or field spectral data, EXOSYS data retrieval, optimum spectral function calculation, data transformation, and statistics calculation. The output of the spectral path was fended into the stratified posterior performance estimator.
NASA Astrophysics Data System (ADS)
Liu, F.; Chen, T.; He, J.; Wen, Q.; Yu, F.; Gu, X.; Wang, Z.
2018-04-01
In recent years, the quick upgrading and improvement of SAR sensors provide beneficial complements for the traditional optical remote sensing in the aspects of theory, technology and data. In this paper, Sentinel-1A SAR data and GF-1 optical data were selected for image fusion, and more emphases were put on the dryland crop classification under a complex crop planting structure, regarding corn and cotton as the research objects. Considering the differences among various data fusion methods, the principal component analysis (PCA), Gram-Schmidt (GS), Brovey and wavelet transform (WT) methods were compared with each other, and the GS and Brovey methods were proved to be more applicable in the study area. Then, the classification was conducted based on the object-oriented technique process. And for the GS, Brovey fusion images and GF-1 optical image, the nearest neighbour algorithm was adopted to realize the supervised classification with the same training samples. Based on the sample plots in the study area, the accuracy assessment was conducted subsequently. The values of overall accuracy and kappa coefficient of fusion images were all higher than those of GF-1 optical image, and GS method performed better than Brovey method. In particular, the overall accuracy of GS fusion image was 79.8 %, and the Kappa coefficient was 0.644. Thus, the results showed that GS and Brovey fusion images were superior to optical images for dryland crop classification. This study suggests that the fusion of SAR and optical images is reliable for dryland crop classification under a complex crop planting structure.
Kos, Gregor; Sieger, Markus; McMullin, David; Zahradnik, Celine; Sulyok, Michael; Öner, Tuba; Mizaikoff, Boris; Krska, Rudolf
2016-10-01
The rapid identification of mycotoxins such as deoxynivalenol and aflatoxin B 1 in agricultural commodities is an ongoing concern for food importers and processors. While sophisticated chromatography-based methods are well established for regulatory testing by food safety authorities, few techniques exist to provide a rapid assessment for traders. This study advances the development of a mid-infrared spectroscopic method, recording spectra with little sample preparation. Spectral data were classified using a bootstrap-aggregated (bagged) decision tree method, evaluating the protein and carbohydrate absorption regions of the spectrum. The method was able to classify 79% of 110 maize samples at the European Union regulatory limit for deoxynivalenol of 1750 µg kg -1 and, for the first time, 77% of 92 peanut samples at 8 µg kg -1 of aflatoxin B 1 . A subset model revealed a dependency on variety and type of fungal infection. The employed CRC and SBL maize varieties could be pooled in the model with a reduction of classification accuracy from 90% to 79%. Samples infected with Fusarium verticillioides were removed, leaving samples infected with F. graminearum and F. culmorum in the dataset improving classification accuracy from 73% to 79%. A 500 µg kg -1 classification threshold for deoxynivalenol in maize performed even better with 85% accuracy. This is assumed to be due to a larger number of samples around the threshold increasing representativity. Comparison with established principal component analysis classification, which consistently showed overlapping clusters, confirmed the superior performance of bagged decision tree classification.
Xu, Kele; Feng, Dawei; Mi, Haibo
2017-11-23
The automatic detection of diabetic retinopathy is of vital importance, as it is the main cause of irreversible vision loss in the working-age population in the developed world. The early detection of diabetic retinopathy occurrence can be very helpful for clinical treatment; although several different feature extraction approaches have been proposed, the classification task for retinal images is still tedious even for those trained clinicians. Recently, deep convolutional neural networks have manifested superior performance in image classification compared to previous handcrafted feature-based image classification methods. Thus, in this paper, we explored the use of deep convolutional neural network methodology for the automatic classification of diabetic retinopathy using color fundus image, and obtained an accuracy of 94.5% on our dataset, outperforming the results obtained by using classical approaches.
A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification
Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong
2016-01-01
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs). PMID:26985826
A Directed Acyclic Graph-Large Margin Distribution Machine Model for Music Symbol Classification.
Wen, Cuihong; Zhang, Jing; Rebelo, Ana; Cheng, Fanyong
2016-01-01
Optical Music Recognition (OMR) has received increasing attention in recent years. In this paper, we propose a classifier based on a new method named Directed Acyclic Graph-Large margin Distribution Machine (DAG-LDM). The DAG-LDM is an improvement of the Large margin Distribution Machine (LDM), which is a binary classifier that optimizes the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. We modify the LDM to the DAG-LDM to solve the multi-class music symbol classification problem. Tests are conducted on more than 10000 music symbol images, obtained from handwritten and printed images of music scores. The proposed method provides superior classification capability and achieves much higher classification accuracy than the state-of-the-art algorithms such as Support Vector Machines (SVMs) and Neural Networks (NNs).
Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.
2014-01-01
Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592
Clemans, Katherine H.; Musci, Rashelle J.; Leoutsakos, Jeannie-Marie S.; Ialongo, Nicholas S.
2014-01-01
Objective This study compared the ability of teacher, parent, and peer reports of aggressive behavior in early childhood to accurately classify cases of maladaptive outcomes in late adolescence and early adulthood. Method Weighted kappa analyses determined optimal cut points and relative classification accuracy among teacher, parent, and peer reports of aggression assessed for 691 students (54% male; 84% African American, 13% White) in the fall of first grade. Outcomes included antisocial personality, substance use, incarceration history, risky sexual behavior, and failure to graduate from high school on time. Results Peer reports were the most accurate classifier of all outcomes in the full sample. For most outcomes, the addition of teacher or parent reports did not improve overall classification accuracy once peer reports were accounted for. Additional gender-specific and adjusted kappa analyses supported the superior classification utility of the peer report measure. Conclusion The results suggest that peer reports provided the most useful classification information of the three aggression measures. Implications for targeted intervention efforts which use screening measures to identify at-risk children are discussed. PMID:24512126
Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes
Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.
2012-01-01
Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
A lung sound classification system based on the rational dilation wavelet transform.
Ulukaya, Sezer; Serbes, Gorkem; Sen, Ipek; Kahya, Yasemin P
2016-08-01
In this work, a wavelet based classification system that aims to discriminate crackle, normal and wheeze lung sounds is presented. While the previous works related with this problem use constant low Q-factor wavelets, which have limited frequency resolution and can not cope with oscillatory signals, in the proposed system, the Rational Dilation Wavelet Transform, whose Q-factors can be tuned, is employed. Proposed system yields an accuracy of 95 % for crackle, 97 % for wheeze, 93.50 % for normal and 95.17 % for total sound signal types using energy feature subset and proposed approach is superior to conventional low Q-factor wavelet analysis.
Yuan, Yuan; Lin, Jianzhe; Wang, Qi
2016-12-01
Hyperspectral image (HSI) classification is a crucial issue in remote sensing. Accurate classification benefits a large number of applications such as land use analysis and marine resource utilization. But high data correlation brings difficulty to reliable classification, especially for HSI with abundant spectral information. Furthermore, the traditional methods often fail to well consider the spatial coherency of HSI that also limits the classification performance. To address these inherent obstacles, a novel spectral-spatial classification scheme is proposed in this paper. The proposed method mainly focuses on multitask joint sparse representation (MJSR) and a stepwise Markov random filed framework, which are claimed to be two main contributions in this procedure. First, the MJSR not only reduces the spectral redundancy, but also retains necessary correlation in spectral field during classification. Second, the stepwise optimization further explores the spatial correlation that significantly enhances the classification accuracy and robustness. As far as several universal quality evaluation indexes are concerned, the experimental results on Indian Pines and Pavia University demonstrate the superiority of our method compared with the state-of-the-art competitors.
2018-01-01
Background and Objective. Needle electromyography can be used to detect the number of changes and morphological changes in motor unit potentials of patients with axonal neuropathy. General mathematical methods of pattern recognition and signal analysis were applied to recognize neuropathic changes. This study validates the possibility of extending and refining turns-amplitude analysis using permutation entropy and signal energy. Methods. In this study, we examined needle electromyography in 40 neuropathic individuals and 40 controls. The number of turns, amplitude between turns, signal energy, and “permutation entropy” were used as features for support vector machine classification. Results. The obtained results proved the superior classification performance of the combinations of all of the above-mentioned features compared to the combinations of fewer features. The lowest accuracy from the tested combinations of features had peak-ratio analysis. Conclusion. Using the combination of permutation entropy with signal energy, number of turns and mean amplitude in SVM classification can be used to refine the diagnosis of polyneuropathies examined by needle electromyography. PMID:29606959
Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Jamaluddin; Siringoringo, Rimbun
2017-12-01
Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification issues. The main drawbackof FKNN is that it is difficult to determine the parameters. These parameters are the number of neighbors (k) and fuzzy strength (m). Both parameters are very sensitive. This makes it difficult to determine the values of ‘m’ and ‘k’, thus making FKNN difficult to control because no theories or guides can deduce how proper ‘m’ and ‘k’ should be. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best value of ‘k’ and ‘m’. MPSO is focused on the Constriction Factor Method. Constriction Factor Method is an improvement of PSO in order to avoid local circumstances optima. The model proposed in this study was tested on the German Credit Dataset. The test of the data/The data test has been standardized by UCI Machine Learning Repository which is widely applied to classification problems. The application of MPSO to the determination of FKNN parameters is expected to increase the value of classification performance. Based on the experiments that have been done indicating that the model offered in this research results in a better classification performance compared to the Fk-NN model only. The model offered in this study has an accuracy rate of 81%, while. With using Fk-NN model, it has the accuracy of 70%. At the end is done comparison of research model superiority with 2 other classification models;such as Naive Bayes and Decision Tree. This research model has a better performance level, where Naive Bayes has accuracy 75%, and the decision tree model has 70%
Stimulus specificity of a steady-state visual-evoked potential-based brain-computer interface.
Ng, Kian B; Bradley, Andrew P; Cunnington, Ross
2012-06-01
The mechanisms of neural excitation and inhibition when given a visual stimulus are well studied. It has been established that changing stimulus specificity such as luminance contrast or spatial frequency can alter the neuronal activity and thus modulate the visual-evoked response. In this paper, we study the effect that stimulus specificity has on the classification performance of a steady-state visual-evoked potential-based brain-computer interface (SSVEP-BCI). For example, we investigate how closely two visual stimuli can be placed before they compete for neural representation in the cortex and thus influence BCI classification accuracy. We characterize stimulus specificity using the four stimulus parameters commonly encountered in SSVEP-BCI design: temporal frequency, spatial size, number of simultaneously displayed stimuli and their spatial proximity. By varying these quantities and measuring the SSVEP-BCI classification accuracy, we are able to determine the parameters that provide optimal performance. Our results show that superior SSVEP-BCI accuracy is attained when stimuli are placed spatially more than 5° apart, with size that subtends at least 2° of visual angle, when using a tagging frequency of between high alpha and beta band. These findings may assist in deciding the stimulus parameters for optimal SSVEP-BCI design.
Stimulus specificity of a steady-state visual-evoked potential-based brain-computer interface
NASA Astrophysics Data System (ADS)
Ng, Kian B.; Bradley, Andrew P.; Cunnington, Ross
2012-06-01
The mechanisms of neural excitation and inhibition when given a visual stimulus are well studied. It has been established that changing stimulus specificity such as luminance contrast or spatial frequency can alter the neuronal activity and thus modulate the visual-evoked response. In this paper, we study the effect that stimulus specificity has on the classification performance of a steady-state visual-evoked potential-based brain-computer interface (SSVEP-BCI). For example, we investigate how closely two visual stimuli can be placed before they compete for neural representation in the cortex and thus influence BCI classification accuracy. We characterize stimulus specificity using the four stimulus parameters commonly encountered in SSVEP-BCI design: temporal frequency, spatial size, number of simultaneously displayed stimuli and their spatial proximity. By varying these quantities and measuring the SSVEP-BCI classification accuracy, we are able to determine the parameters that provide optimal performance. Our results show that superior SSVEP-BCI accuracy is attained when stimuli are placed spatially more than 5° apart, with size that subtends at least 2° of visual angle, when using a tagging frequency of between high alpha and beta band. These findings may assist in deciding the stimulus parameters for optimal SSVEP-BCI design.
NASA Astrophysics Data System (ADS)
Majumder, S. K.; Krishna, H.; Sidramesh, M.; Chaturvedi, P.; Gupta, P. K.
2011-08-01
We report the results of a comparative evaluation of in vivo fluorescence and Raman spectroscopy for diagnosis of oral neoplasia. The study carried out at Tata Memorial Hospital, Mumbai, involved 26 healthy volunteers and 138 patients being screened for neoplasm of oral cavity. Spectral measurements were taken from multiple sites of abnormal as well as apparently uninvolved contra-lateral regions of the oral cavity in each patient. The different tissue sites investigated belonged to one of the four histopathology categories: 1) squamous cell carcinoma (SCC), 2) oral sub-mucous fibrosis (OSMF), 3) leukoplakia (LP) and 4) normal squamous tissue. A probability based multivariate statistical algorithm utilizing nonlinear Maximum Representation and Discrimination Feature for feature extraction and Sparse Multinomial Logistic Regression for classification was developed for direct multi-class classification in a leave-one-patient-out cross validation mode. The results reveal that the performance of Raman spectroscopy is considerably superior to that of fluorescence in stratifying the oral tissues into respective histopathologic categories. The best classification accuracy was observed to be 90%, 93%, 94%, and 89% for SCC, SMF, leukoplakia, and normal oral tissues, respectively, on the basis of leave-one-patient-out cross-validation, with an overall accuracy of 91%. However, when a binary classification was employed to distinguish spectra from all the SCC, SMF and leukoplakik tissue sites together from normal, fluorescence and Raman spectroscopy were seen to have almost comparable performances with Raman yielding marginally better classification accuracy of 98.5% as compared to 94% of fluorescence.
Shirahata, Mitsuaki; Iwao-Koizumi, Kyoko; Saito, Sakae; Ueno, Noriko; Oda, Masashi; Hashimoto, Nobuo; Takahashi, Jun A; Kato, Kikuya
2007-12-15
Current morphology-based glioma classification methods do not adequately reflect the complex biology of gliomas, thus limiting their prognostic ability. In this study, we focused on anaplastic oligodendroglioma and glioblastoma, which typically follow distinct clinical courses. Our goal was to construct a clinically useful molecular diagnostic system based on gene expression profiling. The expression of 3,456 genes in 32 patients, 12 and 20 of whom had prognostically distinct anaplastic oligodendroglioma and glioblastoma, respectively, was measured by PCR array. Next to unsupervised methods, we did supervised analysis using a weighted voting algorithm to construct a diagnostic system discriminating anaplastic oligodendroglioma from glioblastoma. The diagnostic accuracy of this system was evaluated by leave-one-out cross-validation. The clinical utility was tested on a microarray-based data set of 50 malignant gliomas from a previous study. Unsupervised analysis showed divergent global gene expression patterns between the two tumor classes. A supervised binary classification model showed 100% (95% confidence interval, 89.4-100%) diagnostic accuracy by leave-one-out cross-validation using 168 diagnostic genes. Applied to a gene expression data set from a previous study, our model correlated better with outcome than histologic diagnosis, and also displayed 96.6% (28 of 29) consistency with the molecular classification scheme used for these histologically controversial gliomas in the original article. Furthermore, we observed that histologically diagnosed glioblastoma samples that shared anaplastic oligodendroglioma molecular characteristics tended to be associated with longer survival. Our molecular diagnostic system showed reproducible clinical utility and prognostic ability superior to traditional histopathologic diagnosis for malignant glioma.
NASA Astrophysics Data System (ADS)
Legara, Erika Fille; Monterola, Christopher; Abundo, Cheryl
2011-01-01
We demonstrate an accurate procedure based on linear discriminant analysis that allows automatic authorship classification of opinion column articles. First, we extract the following stylometric features of 157 column articles from four authors: statistics on high frequency words, number of words per sentence, and number of sentences per paragraph. Then, by systematically ranking these features based on an effect size criterion, we show that we can achieve an average classification accuracy of 93% for the test set. In comparison, frequency size based ranking has an average accuracy of 80%. The highest possible average classification accuracy of our data merely relying on chance is ∼31%. By carrying out sensitivity analysis, we show that the effect size criterion is superior than frequency ranking because there exist low frequency words that significantly contribute to successful author discrimination. Consistent results are seen when the procedure is applied in classifying the undisputed Federalist papers of Alexander Hamilton and James Madison. To the best of our knowledge, the work is the first attempt in classifying opinion column articles, that by virtue of being shorter in length (as compared to novels or short stories), are more prone to over-fitting issues. The near perfect classification for the longer papers supports this claim. Our results provide an important insight on authorship attribution that has been overlooked in previous studies: that ranking discriminant variables based on word frequency counts is not necessarily an optimal procedure.
Computational approaches for the classification of seed storage proteins.
Radhika, V; Rao, V Sree Hari
2015-07-01
Seed storage proteins comprise a major part of the protein content of the seed and have an important role on the quality of the seed. These storage proteins are important because they determine the total protein content and have an effect on the nutritional quality and functional properties for food processing. Transgenic plants are being used to develop improved lines for incorporation into plant breeding programs and the nutrient composition of seeds is a major target of molecular breeding programs. Hence, classification of these proteins is crucial for the development of superior varieties with improved nutritional quality. In this study we have applied machine learning algorithms for classification of seed storage proteins. We have presented an algorithm based on nearest neighbor approach for classification of seed storage proteins and compared its performance with decision tree J48, multilayer perceptron neural (MLP) network and support vector machine (SVM) libSVM. The model based on our algorithm has been able to give higher classification accuracy in comparison to the other methods.
A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method.
Liu, Xiao; Wang, Xiaoli; Su, Qiang; Zhang, Mo; Zhu, Yanhong; Wang, Qiugen; Wang, Qian
2017-01-01
Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for experiments. A maximum classification accuracy of 92.59% was achieved according to a jackknife cross-validation scheme. The results demonstrate that the performance of the proposed system is superior to the performances of previously reported classification techniques.
Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston
2014-11-01
Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.
NASA Astrophysics Data System (ADS)
Dronova, I.; Gong, P.; Wang, L.; Clinton, N.; Fu, W.; Qi, S.
2011-12-01
Remote sensing-based vegetation classifications representing plant function such as photosynthesis and productivity are challenging in wetlands with complex cover and difficult field access. Recent advances in object-based image analysis (OBIA) and machine-learning algorithms offer new classification tools; however, few comparisons of different algorithms and spatial scales have been discussed to date. We applied OBIA to delineate wetland plant functional types (PFTs) for Poyang Lake, the largest freshwater lake in China and Ramsar wetland conservation site, from 30-m Landsat TM scene at the peak of spring growing season. We targeted major PFTs (C3 grasses, C3 forbs and different types of C4 grasses and aquatic vegetation) that are both key players in system's biogeochemical cycles and critical providers of waterbird habitat. Classification results were compared among: a) several object segmentation scales (with average object sizes 900-9000 m2); b) several families of statistical classifiers (including Bayesian, Logistic, Neural Network, Decision Trees and Support Vector Machines) and c) two hierarchical levels of vegetation classification, a generalized 3-class set and more detailed 6-class set. We found that classification benefited from object-based approach which allowed including object shape, texture and context descriptors in classification. While a number of classifiers achieved high accuracy at the finest pixel-equivalent segmentation scale, the highest accuracies and best agreement among algorithms occurred at coarser object scales. No single classifier was consistently superior across all scales, although selected algorithms of Neural Network, Logistic and K-Nearest Neighbors families frequently provided the best discrimination of classes at different scales. The choice of vegetation categories also affected classification accuracy. The 6-class set allowed for higher individual class accuracies but lower overall accuracies than the 3-class set because individual classes differed in scales at which they were best discriminated from others. Main classification challenges included a) presence of C3 grasses in C4-grass areas, particularly following harvesting of C4 reeds and b) mixtures of emergent, floating and submerged aquatic plants at sub-object and sub-pixel scales. We conclude that OBIA with advanced statistical classifiers offers useful instruments for landscape vegetation analyses, and that spatial scale considerations are critical in mapping PFTs, while multi-scale comparisons can be used to guide class selection. Future work will further apply fuzzy classification and field-collected spectral data for PFT analysis and compare results with MODIS PFT products.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Masci, Frank J.; Grillmair, Carl J.; Cutri, Roc M.
2014-07-01
We describe a methodology to classify periodic variable stars identified using photometric time-series measurements constructed from the Wide-field Infrared Survey Explorer (WISE) full-mission single-exposure Source Databases. This will assist in the future construction of a WISE Variable Source Database that assigns variables to specific science classes as constrained by the WISE observing cadence with statistically meaningful classification probabilities. We have analyzed the WISE light curves of 8273 variable stars identified in previous optical variability surveys (MACHO, GCVS, and ASAS) and show that Fourier decomposition techniques can be extended into the mid-IR to assist with their classification. Combined with other periodicmore » light-curve features, this sample is then used to train a machine-learned classifier based on the random forest (RF) method. Consistent with previous classification studies of variable stars in general, the RF machine-learned classifier is superior to other methods in terms of accuracy, robustness against outliers, and relative immunity to features that carry little or redundant class information. For the three most common classes identified by WISE: Algols, RR Lyrae, and W Ursae Majoris type variables, we obtain classification efficiencies of 80.7%, 82.7%, and 84.5% respectively using cross-validation analyses, with 95% confidence intervals of approximately ±2%. These accuracies are achieved at purity (or reliability) levels of 88.5%, 96.2%, and 87.8% respectively, similar to that achieved in previous automated classification studies of periodic variable stars.« less
Estimation of the Age and Amount of Brown Rice Plant Hoppers Based on Bionic Electronic Nose Use
Xu, Sai; Zhou, Zhiyan; Lu, Huazhong; Luo, Xiwen; Lan, Yubin; Zhang, Yang; Li, Yanfang
2014-01-01
The brown rice plant hopper (BRPH), Nilaparvata lugens (Stal), is one of the most important insect pests affecting rice and causes serious damage to the yield and quality of rice plants in Asia. This study used bionic electronic nose technology to sample BRPH volatiles, which vary in age and amount. Principal component analysis (PCA), linear discrimination analysis (LDA), probabilistic neural network (PNN), BP neural network (BPNN) and loading analysis (Loadings) techniques were used to analyze the sampling data. The results indicate that the PCA and LDA classification ability is poor, but the LDA classification displays superior performance relative to PCA. When a PNN was used to evaluate the BRPH age and amount, the classification rates of the training set were 100% and 96.67%, respectively, and the classification rates of the test set were 90.67% and 64.67%, respectively. When BPNN was used for the evaluation of the BRPH age and amount, the classification accuracies of the training set were 100% and 48.93%, respectively, and the classification accuracies of the test set were 96.67% and 47.33%, respectively. Loadings for BRPH volatiles indicate that the main elements of BRPHs' volatiles are sulfur-containing organics, aromatics, sulfur- and chlorine-containing organics and nitrogen oxides, which provide a reference for sensors chosen when exploited in specialized BRPH identification devices. This research proves the feasibility and broad application prospects of bionic electronic noses for BRPH recognition. PMID:25268913
An assessment of support vector machines for land cover classification
Huang, C.; Davis, L.S.; Townshend, J.R.G.
2002-01-01
The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.
Uddin, M B; Chow, C M; Su, S W
2018-03-26
Sleep apnea (SA), a common sleep disorder, can significantly decrease the quality of life, and is closely associated with major health risks such as cardiovascular disease, sudden death, depression, and hypertension. The normal diagnostic process of SA using polysomnography is costly and time consuming. In addition, the accuracy of different classification methods to detect SA varies with the use of different physiological signals. If an effective, reliable, and accurate classification method is developed, then the diagnosis of SA and its associated treatment will be time-efficient and economical. This study aims to systematically review the literature and present an overview of classification methods to detect SA using respiratory and oximetry signals and address the automated detection approach. Sixty-two included studies revealed the application of single and multiple signals (respiratory and oximetry) for the diagnosis of SA. Both airflow and oxygen saturation signals alone were effective in detecting SA in the case of binary decision-making, whereas multiple signals were good for multi-class detection. In addition, some machine learning methods were superior to the other classification methods for SA detection using respiratory and oximetry signals. To deal with the respiratory and oximetry signals, a good choice of classification method as well as the consideration of associated factors would result in high accuracy in the detection of SA. An accurate classification method should provide a high detection rate with an automated (independent of human action) analysis of respiratory and oximetry signals. Future high-quality automated studies using large samples of data from multiple patient groups or record batches are recommended.
Noise tolerant dendritic lattice associative memories
NASA Astrophysics Data System (ADS)
Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric; Tucker, Marc
2011-09-01
Linear classifiers based on computation over the real numbers R (e.g., with operations of addition and multiplication) denoted by (R, +, x), have been represented extensively in the literature of pattern recognition. However, a different approach to pattern classification involves the use of addition, maximum, and minimum operations over the reals in the algebra (R, +, maximum, minimum) These pattern classifiers, based on lattice algebra, have been shown to exhibit superior information storage capacity, fast training and short convergence times, high pattern classification accuracy, and low computational cost. Such attributes are not always found, for example, in classical neural nets based on the linear inner product. In a special type of lattice associative memory (LAM), called a dendritic LAM or DLAM, it is possible to achieve noise-tolerant pattern classification by varying the design of noise or error acceptance bounds. This paper presents theory and algorithmic approaches for the computation of noise-tolerant lattice associative memories (LAMs) under a variety of input constraints. Of particular interest are the classification of nonergodic data in noise regimes with time-varying statistics. DLAMs, which are a specialization of LAMs derived from concepts of biological neural networks, have successfully been applied to pattern classification from hyperspectral remote sensing data, as well as spatial object recognition from digital imagery. The authors' recent research in the development of DLAMs is overviewed, with experimental results that show utility for a wide variety of pattern classification applications. Performance results are presented in terms of measured computational cost, noise tolerance, classification accuracy, and throughput for a variety of input data and noise levels.
Evans, A; Whelehan, P; Thomson, K; Brauer, K; Jordan, L; Purdie, C; McLean, D; Baker, L; Vinnicombe, S; Thompson, A
2012-07-10
The aim of this study was to assess the performance of shear wave elastography combined with BI-RADS classification of greyscale ultrasound images for benign/malignant differentiation in a large group of patients. One hundred and seventy-five consecutive patients with solid breast masses on routine ultrasonography undergoing percutaneous biopsy had the greyscale findings classified according to the American College of Radiology BI-RADS. The mean elasticity values from four shear wave images were obtained. For mean elasticity vs greyscale BI-RADS, the performance results against histology were sensitivity: 95% vs 95%, specificity: 77% vs 69%, Positive Predictive Value (PPV): 88% vs 84%, Negative Predictive Value (NPV): 90% vs 91%, and accuracy: 89% vs 86% (all P>0.05). The results for the combination (positive result from either modality counted as malignant) were sensitivity 100%, specificity 61%, PPV 82%, NPV 100%, and accuracy 86%. The combination of BI-RADS greyscale and shear wave elastography yielded superior sensitivity to BI-RADS alone (P=0.03) or shear wave alone (P=0.03). The NPV was superior in combination compared with either alone (BI-RADS P=0.01 and shear wave P=0.02). Together, BI-RADS assessment of greyscale ultrasound images and shear wave ultrasound elastography are extremely sensitive for detection of malignancy.
Evans, A; Whelehan, P; Thomson, K; Brauer, K; Jordan, L; Purdie, C; McLean, D; Baker, L; Vinnicombe, S; Thompson, A
2012-01-01
Background: The aim of this study was to assess the performance of shear wave elastography combined with BI-RADS classification of greyscale ultrasound images for benign/malignant differentiation in a large group of patients. Methods: One hundred and seventy-five consecutive patients with solid breast masses on routine ultrasonography undergoing percutaneous biopsy had the greyscale findings classified according to the American College of Radiology BI-RADS. The mean elasticity values from four shear wave images were obtained. Results: For mean elasticity vs greyscale BI-RADS, the performance results against histology were sensitivity: 95% vs 95%, specificity: 77% vs 69%, Positive Predictive Value (PPV): 88% vs 84%, Negative Predictive Value (NPV): 90% vs 91%, and accuracy: 89% vs 86% (all P>0.05). The results for the combination (positive result from either modality counted as malignant) were sensitivity 100%, specificity 61%, PPV 82%, NPV 100%, and accuracy 86%. The combination of BI-RADS greyscale and shear wave elastography yielded superior sensitivity to BI-RADS alone (P=0.03) or shear wave alone (P=0.03). The NPV was superior in combination compared with either alone (BI-RADS P=0.01 and shear wave P=0.02). Conclusion: Together, BI-RADS assessment of greyscale ultrasound images and shear wave ultrasound elastography are extremely sensitive for detection of malignancy. PMID:22691969
NASA Astrophysics Data System (ADS)
Liu, Chih-Hao; Du, Yong; Singh, Manmohan; Wu, Chen; Han, Zhaolong; Li, Jiasong; Mohammadzai, Qais; Raghunathan, Raksha; Hsu, Thomas; Noorani, Shezaan; Chang, Anthony; Mohan, Chandra; Larin, Kirill V.
2016-03-01
Acute Glomerulonephritis caused by anti-glomerular basement membrane disease has a high mortality due to delayed diagnosis. Thus, an accurate and early diagnosis is critical for preserving renal function. Currently, blood, urine, and tissue-based diagnoses can be time consuming, while ultrasound and CT imaging have relatively low spatial resolution. Optical coherence tomography (OCT) is a noninvasive imaging technique that provides superior spatial resolution (micron scale) as compared to ultrasound and CT. Pathological changes in tissue properties can be detected based on the optical metrics analyzed from the OCT signal, such as optical attenuation and speckle variance. Moreover, OCT does not rely on ionizing radiation as with CT imaging. In addition to structural changes, the elasticity of the kidney can significantly change due to nephritis. In this work, we utilized OCT to detect the difference in tissue properties between healthy and nephritic murine kidneys. Although OCT imaging could identify the diseased tissue, classification accuracy using only optical metrics was clinically inadequate. By combining optical metrics with elasticity, the classification accuracy improved from 76% to 95%. These results show that OCT combined with OCE can be potentially useful for nephritis detection.
Chen, C L; Kaber, D B; Dempsey, P G
2000-06-01
A new and improved method to feedforward neural network (FNN) development for application to data classification problems, such as the prediction of levels of low-back disorder (LBD) risk associated with industrial jobs, is presented. Background on FNN development for data classification is provided along with discussions of previous research and neighborhood (local) solution search methods for hard combinatorial problems. An analytical study is presented which compared prediction accuracy of a FNN based on an error-back propagation (EBP) algorithm with the accuracy of a FNN developed by considering results of local solution search (simulated annealing) for classifying industrial jobs as posing low or high risk for LBDs. The comparison demonstrated superior performance of the FNN generated using the new method. The architecture of this FNN included fewer input (predictor) variables and hidden neurons than the FNN developed based on the EBP algorithm. Independent variable selection methods and the phenomenon of 'overfitting' in FNN (and statistical model) generation for data classification are discussed. The results are supportive of the use of the new approach to FNN development for applications to musculoskeletal disorders and risk forecasting in other domains.
Montreal Cognitive Assessment (MoCA): validation study for frontotemporal dementia.
Freitas, Sandra; Simões, Mário R; Alves, Lara; Duro, Diana; Santana, Isabel
2012-09-01
The Montreal Cognitive Assessment (MoCA) is a brief instrument developed for the screening of milder forms of cognitive impairment, having surpassed the well-known limitations of the Mini-Mental State Examination (MMSE). The aim of the present study was to validate the MoCA as a cognitive screening test for behavioral-variant frontotemporal dementia (bv-FTD) by examining its psychometric properties and diagnostic accuracy. Three matched subgroups of participants were considered: bv-FTD (n = 50), Alzheimer disease (n = 50), and a control group of healthy adults (n = 50). Compared with the MMSE, the MoCA demonstrated consistently superior psychometric properties and discriminant capacity, providing comprehensive information about the patients' cognitive profiles. The diagnostic accuracy of MoCA for bv-FTD was extremely high (area under the curve AUC [MoCA] = 0.934, 95% confidence interval [CI] = 0.866-.974; AUC [MMSE] = 0.772, 95% CI = 0.677-0.850). With a cutoff below 17 points, the MoCA results for sensitivity, specificity, positive predictive value, negative predictive value, and classification accuracy were significantly superior to those of the MMSE. The MoCA is a sensitive and accurate instrument for screening the patients with bv-FTD and represents a better option than the MMSE.
Estimating local scaling properties for the classification of interstitial lung disease patterns
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel
2011-03-01
Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
NASA Astrophysics Data System (ADS)
Zhu, Zhe; Gallant, Alisa L.; Woodcock, Curtis E.; Pengra, Bruce; Olofsson, Pontus; Loveland, Thomas R.; Jin, Suming; Dahal, Devendra; Yang, Limin; Auch, Roger F.
2016-12-01
The U.S. Geological Survey's Land Change Monitoring, Assessment, and Projection (LCMAP) initiative is a new end-to-end capability to continuously track and characterize changes in land cover, use, and condition to better support research and applications relevant to resource management and environmental change. Among the LCMAP product suite are annual land cover maps that will be available to the public. This paper describes an approach to optimize the selection of training and auxiliary data for deriving the thematic land cover maps based on all available clear observations from Landsats 4-8. Training data were selected from map products of the U.S. Geological Survey's Land Cover Trends project. The Random Forest classifier was applied for different classification scenarios based on the Continuous Change Detection and Classification (CCDC) algorithm. We found that extracting training data proportionally to the occurrence of land cover classes was superior to an equal distribution of training data per class, and suggest using a total of 20,000 training pixels to classify an area about the size of a Landsat scene. The problem of unbalanced training data was alleviated by extracting a minimum of 600 training pixels and a maximum of 8000 training pixels per class. We additionally explored removing outliers contained within the training data based on their spectral and spatial criteria, but observed no significant improvement in classification results. We also tested the importance of different types of auxiliary data that were available for the conterminous United States, including: (a) five variables used by the National Land Cover Database, (b) three variables from the cloud screening "Function of mask" (Fmask) statistics, and (c) two variables from the change detection results of CCDC. We found that auxiliary variables such as a Digital Elevation Model and its derivatives (aspect, position index, and slope), potential wetland index, water probability, snow probability, and cloud probability improved the accuracy of land cover classification. Compared to the original strategy of the CCDC algorithm (500 pixels per class), the use of the optimal strategy improved the classification accuracies substantially (15-percentage point increase in overall accuracy and 4-percentage point increase in minimum accuracy).
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
NASA Astrophysics Data System (ADS)
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-12-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification.
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-12-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-01-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520
A Pruning Neural Network Model in Credit Classification Analysis
Tang, Yajiao; Ji, Junkai; Dai, Hongwei; Yu, Yang; Todo, Yuki
2018-01-01
Nowadays, credit classification models are widely applied because they can help financial decision-makers to handle credit classification issues. Among them, artificial neural networks (ANNs) have been widely accepted as the convincing methods in the credit industry. In this paper, we propose a pruning neural network (PNN) and apply it to solve credit classification problem by adopting the well-known Australian and Japanese credit datasets. The model is inspired by synaptic nonlinearity of a dendritic tree in a biological neural model. And it is trained by an error back-propagation algorithm. The model is capable of realizing a neuronal pruning function by removing the superfluous synapses and useless dendrites and forms a tidy dendritic morphology at the end of learning. Furthermore, we utilize logic circuits (LCs) to simulate the dendritic structures successfully which makes PNN be implemented on the hardware effectively. The statistical results of our experiments have verified that PNN obtains superior performance in comparison with other classical algorithms in terms of accuracy and computational efficiency. PMID:29606961
Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann
2003-01-01
Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa
2015-11-03
Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed the a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.
AUCTSP: an improved biomarker gene pair class predictor.
Kagaris, Dimitri; Khamesipour, Alireza; Yiannoutsos, Constantin T
2018-06-26
The Top Scoring Pair (TSP) classifier, based on the concept of relative ranking reversals in the expressions of pairs of genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. The idea that differences in gene expression ranking are associated with presence or absence of disease is compelling and has strong biological plausibility. Nevertheless, the TSP formulation ignores significant available information which can improve classification accuracy and is vulnerable to selecting genes which do not have differential expression in the two conditions ("pivot" genes). We introduce the AUCTSP classifier as an alternative rank-based estimator of the magnitude of the ranking reversals involved in the original TSP. The proposed estimator is based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) and as such, takes into account the separation of the entire distribution of gene expression levels in gene pairs under the conditions considered, as opposed to comparing gene rankings within individual subjects as in the original TSP formulation. Through extensive simulations and case studies involving classification in ovarian, leukemia, colon, breast and prostate cancers and diffuse large b-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative (pivot) genes. The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects as opposed to each individual subject results in significant performance gains in classification accuracy. In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair.
Lesion classification using clinical and visual data fusion by multiple kernel learning
NASA Astrophysics Data System (ADS)
Kisilev, Pavel; Hashoul, Sharbell; Walach, Eugene; Tzadok, Asaf
2014-03-01
To overcome operator dependency and to increase diagnosis accuracy in breast ultrasound (US), a lot of effort has been devoted to developing computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Unfortunately, the efficacy of such CAD systems is limited since they rely on correct automatic lesions detection and localization, and on robustness of features computed based on the detected areas. In this paper we propose a new approach to boost the performance of a Machine Learning based CAD system, by combining visual and clinical data from patient files. We compute a set of visual features from breast ultrasound images, and construct the textual descriptor of patients by extracting relevant keywords from patients' clinical data files. We then use the Multiple Kernel Learning (MKL) framework to train SVM based classifier to discriminate between benign and malignant cases. We investigate different types of data fusion methods, namely, early, late, and intermediate (MKL-based) fusion. Our database consists of 408 patient cases, each containing US images, textual description of complaints and symptoms filled by physicians, and confirmed diagnoses. We show experimentally that the proposed MKL-based approach is superior to other classification methods. Even though the clinical data is very sparse and noisy, its MKL-based fusion with visual features yields significant improvement of the classification accuracy, as compared to the image features only based classifier.
[Hyperspectral remote sensing image classification based on SVM optimized by clonal selection].
Liu, Qing-Jie; Jing, Lin-Hai; Wang, Meng-Fei; Lin, Qi-Zhong
2013-03-01
Model selection for support vector machine (SVM) involving kernel and the margin parameter values selection is usually time-consuming, impacts training efficiency of SVM model and final classification accuracies of SVM hyperspectral remote sensing image classifier greatly. Firstly, based on combinatorial optimization theory and cross-validation method, artificial immune clonal selection algorithm is introduced to the optimal selection of SVM (CSSVM) kernel parameter a and margin parameter C to improve the training efficiency of SVM model. Then an experiment of classifying AVIRIS in India Pine site of USA was performed for testing the novel CSSVM, as well as a traditional SVM classifier with general Grid Searching cross-validation method (GSSVM) for comparison. And then, evaluation indexes including SVM model training time, classification overall accuracy (OA) and Kappa index of both CSSVM and GSSVM were all analyzed quantitatively. It is demonstrated that OA of CSSVM on test samples and whole image are 85.1% and 81.58, the differences from that of GSSVM are both within 0.08% respectively; And Kappa indexes reach 0.8213 and 0.7728, the differences from that of GSSVM are both within 0.001; While the ratio of model training time of CSSVM and GSSVM is between 1/6 and 1/10. Therefore, CSSVM is fast and accurate algorithm for hyperspectral image classification and is superior to GSSVM.
NASA Astrophysics Data System (ADS)
Dennison, Andrew G.
Classification of the seafloor substrate can be done with a variety of methods. These methods include Visual (dives, drop cameras); mechanical (cores, grab samples); acoustic (statistical analysis of echosounder returns). Acoustic methods offer a more powerful and efficient means of collecting useful information about the bottom type. Due to the nature of an acoustic survey, larger areas can be sampled, and by combining the collected data with visual and mechanical survey methods provide greater confidence in the classification of a mapped region. During a multibeam sonar survey, both bathymetric and backscatter data is collected. It is well documented that the statistical characteristic of a sonar backscatter mosaic is dependent on bottom type. While classifying the bottom-type on the basis on backscatter alone can accurately predict and map bottom-type, i.e a muddy area from a rocky area, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing of high-resolution multibeam data can capture the pertinent details about the bottom-type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features, and provide the basis for an accurate classification scheme. The development of a new classification method is described here. It is based upon the analysis of textural features in conjunction with ground truth sampling. The processing and classification result of two geologically distinct areas in nearshore regions of Lake Superior; off the Lester River,MN and Amnicon River, WI are presented here, using the Minnesota Supercomputer Institute's Mesabi computing cluster for initial processing. Processed data is then calibrated using ground truth samples to conduct an accuracy assessment of the surveyed areas. From analysis of high-resolution bathymetry data collected at both survey sites is was possible to successfully calculate a series of measures that describe textural information about the lake floor. Further processing suggests that the features calculated capture a significant amount of statistical information about the lake floor terrain as well. Two sources of error, an anomalous heave and refraction error significantly deteriorated the quality of the processed data and resulting validate results. Ground truth samples used to validate the classification methods utilized for both survey sites, however, resulted in accuracy values ranging from 5 -30 percent at the Amnicon River, and between 60-70 percent for the Lester River. The final results suggest that this new processing methodology does adequately capture textural information about the lake floor and does provide an acceptable classification in the absence of significant data quality issues.
Evaluation of airborne image data for mapping riparian vegetation within the Grand Canyon
Davis, Philip A.; Staid, Matthew I.; Plescia, Jeffrey B.; Johnson, Jeffrey R.
2002-01-01
This study examined various types of remote-sensing data that have been acquired during a 12-month period over a portion of the Colorado River corridor to determine the type of data and conditions for data acquisition that provide the optimum classification results for mapping riparian vegetation. Issues related to vegetation mapping included time of year, number and positions of wavelength bands, and spatial resolution for data acquisition to produce accurate vegetation maps versus cost of data. Image data considered in the study consisted of scanned color-infrared (CIR) film, digital CIR, and digital multispectral data, whose resolutions from 11 cm (photographic film) to 100 cm (multispectral), that were acquired during the Spring, Summer, and Fall seasons in 2000 for five long-term monitoring sites containing riparian vegetation. Results show that digitally acquired data produce higher and more consistent classification accuracies for mapping vegetation units than do film products. The highest accuracies were obtained from nine-band multispectral data; however, a four-band subset of these data, that did not include short-wave infrared bands, produced comparable mapping results. The four-band subset consisted of the wavelength bands 0.52-0.59 µm, 0.59-0.62 µm, 0.67-0.72 µm, and 0.73-0.85 µm. Use of only three of these bands that simulate digital CIR sensors produced accuracies for several vegetation units that were 10% lower than those obtained using the full multispectral data set. Classification tests using band ratios produced lower accuracies than those using band reflectance for scanned film data; a result attributed to the relatively poor radiometric fidelity maintained by the film scanning process, whereas calibrated multispectral data produced similar classification accuracies using band reflectance and band ratios. This suggests that the intrinsic band reflectance of the vegetation is more important than inter-band reflectance differences in attaining high mapping accuracies. These results also indicate that radiometrically calibrated sensors that record a wide range of radiance produce superior results and that such sensors should be used for monitoring purposes. When texture (spatial variance) at near-infrared wavelength is combined with spectral data in classification, accuracy increased most markedly (20-30%) for the highest resolution (11-cm) CIR film data, but decreased in its effect on accuracy in lower-resolution multi-spectral image data; a result observed in previous studies (Franklin and McDermid 1993, Franklin et al. 2000, 2001). While many classification unit accuracies obtained from the 11-cm film CIR band with texture data were in fact higher than those produced using the 100-cm, nine-band multispectral data with texture, the 11-cm film CIR data produced much lower accuracies than the 100-cm multispectral data for the more sparsely populated vegetation units due to saturation of picture elements during the film scanning process in vegetation units with a high proportion of alluvium. Overall classification accuracies obtained from spectral band and texture data range from 36% to 78% for all databases considered, from 57% to 71% for the 11-cm film CIR data, and from 54% to 78% for the 100-cm multispectral data. Classification results obtained from 20-cm film CIR band and texture data, which were produced by applying a Gaussian filter to the 11-cm film CIR data, showed increases in accuracy due to texture that were similar to those observed using the original 11-cm film CIR data. This suggests that data can be collected at the lower resolution and still retain the added power of vegetation texture. Classification accuracies for the riparian vegetation units examined in this study do not appear to be influenced by season of data acquisition, although data acquired under direct sunlight produced higher overall accuracies than data acquired under overcast conditions. The latter observation, in addition to the importance of band reflectance for classification, implies that data should be acquired near summer solstice when sun elevation and reflectance is highest and when shadows cast by steep canyon walls are minimized.
Ground-based cloud classification by learning stable local binary patterns
NASA Astrophysics Data System (ADS)
Wang, Yu; Shi, Cunzhao; Wang, Chunheng; Xiao, Baihua
2018-07-01
Feature selection and extraction is the first step in implementing pattern classification. The same is true for ground-based cloud classification. Histogram features based on local binary patterns (LBPs) are widely used to classify texture images. However, the conventional uniform LBP approach cannot capture all the dominant patterns in cloud texture images, thereby resulting in low classification performance. In this study, a robust feature extraction method by learning stable LBPs is proposed based on the averaged ranks of the occurrence frequencies of all rotation invariant patterns defined in the LBPs of cloud images. The proposed method is validated with a ground-based cloud classification database comprising five cloud types. Experimental results demonstrate that the proposed method achieves significantly higher classification accuracy than the uniform LBP, local texture patterns (LTP), dominant LBP (DLBP), completed LBP (CLTP) and salient LBP (SaLBP) methods in this cloud image database and under different noise conditions. And the performance of the proposed method is comparable with that of the popular deep convolutional neural network (DCNN) method, but with less computation complexity. Furthermore, the proposed method also achieves superior performance on an independent test data set.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, S; Quon, H; McNutt, T
2016-06-15
Purpose: To determine if the accumulated parotid dosimetry using planning CT to daily CBCT deformation and dose re-calculation can predict for radiation-induced xerostomia. Methods: To track and dosimetrically account for the effects of anatomical changes on the parotid glands, we propagated physicians’ contours from planning CT to daily CBCT using a deformable registration with iterative CBCT intensity correction. A surface mesh for each OAR was created with the deformation applied to the mesh to obtain the deformed parotid volumes. Daily dose was computed on the deformed CT and accumulated to the last fraction. For both the accumulated and the plannedmore » parotid dosimetry, we tested the prediction power of different dosimetric parameters including D90, D50, D10, mean, standard deviation, min/max dose to the combined parotids and patient age to severe xerostomia (NCI-CTCAE grade≥2 at 6 mo follow-up). We also tested the dosimetry to parotid sub-volumes. Three classification algorithms, random tree, support vector machine, and logistic regression were tested to predict severe xerostomia using a leave-one-out validation approach. Results: We tested our prediction model on 35 HN IMRT cases. Parameters from the accumulated dosimetry model demonstrated an 89% accuracy for predicting severe xerostomia. Compared to the planning dosimetry, the accumulated dose consistently demonstrated higher prediction power with all three classification algorithms, including 11%, 5% and 30% higher accuracy, sensitivity and specificity, respectively. Geometric division of the combined parotid glands into superior-inferior regions demonstrated ∼5% increased accuracy than the whole volume. The most influential ranked features include age, mean accumulated dose of the submandibular glands and the accumulated D90 of the superior parotid glands. Conclusion: We demonstrated that the accumulated parotid dosimetry using CT-CBCT registration and dose re-calculation more accurately predicts for severe xerostomia and that the superior portion of the parotid glands may be particularly important in predicting for severe xerostomia. This work was supported in part by NIH/NCI under grant R42CA137886 and in part by Toshiba big data research project funds.« less
Classification of interstitial lung disease patterns with topological texture features
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel
2010-03-01
Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honey-combing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
NASA Astrophysics Data System (ADS)
Klomp, Sander; van der Sommen, Fons; Swager, Anne-Fré; Zinger, Svitlana; Schoon, Erik J.; Curvers, Wouter L.; Bergman, Jacques J.; de With, Peter H. N.
2017-03-01
Volumetric Laser Endomicroscopy (VLE) is a promising technique for the detection of early neoplasia in Barrett's Esophagus (BE). VLE generates hundreds of high resolution, grayscale, cross-sectional images of the esophagus. However, at present, classifying these images is a time consuming and cumbersome effort performed by an expert using a clinical prediction model. This paper explores the feasibility of using computer vision techniques to accurately predict the presence of dysplastic tissue in VLE BE images. Our contribution is threefold. First, a benchmarking is performed for widely applied machine learning techniques and feature extraction methods. Second, three new features based on the clinical detection model are proposed, having superior classification accuracy and speed, compared to earlier work. Third, we evaluate automated parameter tuning by applying simple grid search and feature selection methods. The results are evaluated on a clinically validated dataset of 30 dysplastic and 30 non-dysplastic VLE images. Optimal classification accuracy is obtained by applying a support vector machine and using our modified Haralick features and optimal image cropping, obtaining an area under the receiver operating characteristic of 0.95 compared to the clinical prediction model at 0.81. Optimal execution time is achieved using a proposed mean and median feature, which is extracted at least factor 2.5 faster than alternative features with comparable performance.
Na, Tong; Xie, Jianyang; Zhao, Yitian; Zhao, Yifan; Liu, Yue; Wang, Yongtian; Liu, Jiang
2018-05-09
Automatic methods of analyzing of retinal vascular networks, such as retinal blood vessel detection, vascular network topology estimation, and arteries/veins classification are of great assistance to the ophthalmologist in terms of diagnosis and treatment of a wide spectrum of diseases. We propose a new framework for precisely segmenting retinal vasculatures, constructing retinal vascular network topology, and separating the arteries and veins. A nonlocal total variation inspired Retinex model is employed to remove the image intensity inhomogeneities and relatively poor contrast. For better generalizability and segmentation performance, a superpixel-based line operator is proposed as to distinguish between lines and the edges, thus allowing more tolerance in the position of the respective contours. The concept of dominant sets clustering is adopted to estimate retinal vessel topology and classify the vessel network into arteries and veins. The proposed segmentation method yields competitive results on three public data sets (STARE, DRIVE, and IOSTAR), and it has superior performance when compared with unsupervised segmentation methods, with accuracy of 0.954, 0.957, and 0.964, respectively. The topology estimation approach has been applied to five public databases (DRIVE,STARE, INSPIRE, IOSTAR, and VICAVR) and achieved high accuracy of 0.830, 0.910, 0.915, 0.928, and 0.889, respectively. The accuracies of arteries/veins classification based on the estimated vascular topology on three public databases (INSPIRE, DRIVE and VICAVR) are 0.90.9, 0.910, and 0.907, respectively. The experimental results show that the proposed framework has effectively addressed crossover problem, a bottleneck issue in segmentation and vascular topology reconstruction. The vascular topology information significantly improves the accuracy on arteries/veins classification. © 2018 American Association of Physicists in Medicine.
Methods and statistics for combining motif match scores.
Bailey, T L; Gribskov, M
1998-01-01
Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http:/(/)www.sdsc.edu/MEME.
Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A
2015-06-01
Naturally inspired evolutionary algorithms prove effectiveness when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the used of a Genetic Algorithm (GA) along with Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are use, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Learning optimal embedded cascades.
Saberian, Mohammad Javad; Vasconcelos, Nuno
2012-10-01
The problem of automatic and optimal design of embedded object detector cascades is considered. Two main challenges are identified: optimization of the cascade configuration and optimization of individual cascade stages, so as to achieve the best tradeoff between classification accuracy and speed, under a detection rate constraint. Two novel boosting algorithms are proposed to address these problems. The first, RCBoost, formulates boosting as a constrained optimization problem which is solved with a barrier penalty method. The constraint is the target detection rate, which is met at all iterations of the boosting process. This enables the design of embedded cascades of known configuration without extensive cross validation or heuristics. The second, ECBoost, searches over cascade configurations to achieve the optimal tradeoff between classification risk and speed. The two algorithms are combined into an overall boosting procedure, RCECBoost, which optimizes both the cascade configuration and its stages under a detection rate constraint, in a fully automated manner. Extensive experiments in face, car, pedestrian, and panda detection show that the resulting detectors achieve an accuracy versus speed tradeoff superior to those of previous methods.
NASA Astrophysics Data System (ADS)
Li, Shao-Xin; Zeng, Qiu-Yao; Li, Lin-Fang; Zhang, Yan-Jiao; Wan, Ming-Ming; Liu, Zhi-Ming; Xiong, Hong-Lian; Guo, Zhou-Yi; Liu, Song-Hao
2013-02-01
The ability of combining serum surface-enhanced Raman spectroscopy (SERS) with support vector machine (SVM) for improving classification esophageal cancer patients from normal volunteers is investigated. Two groups of serum SERS spectra based on silver nanoparticles (AgNPs) are obtained: one group from patients with pathologically confirmed esophageal cancer (n=30) and the other group from healthy volunteers (n=31). Principal components analysis (PCA), conventional SVM (C-SVM) and conventional SVM combination with PCA (PCA-SVM) methods are implemented to classify the same spectral dataset. Results show that a diagnostic accuracy of 77.0% is acquired for PCA technique, while diagnostic accuracies of 83.6% and 85.2% are obtained for C-SVM and PCA-SVM methods based on radial basis functions (RBF) models. The results prove that RBF SVM models are superior to PCA algorithm in classification serum SERS spectra. The study demonstrates that serum SERS in combination with SVM technique has great potential to provide an effective and accurate diagnostic schema for noninvasive detection of esophageal cancer.
NASA Astrophysics Data System (ADS)
Ma, Weiwei; Gong, Cailan; Hu, Yong; Li, Long; Meng, Peng
2015-10-01
Remote sensing technology has been broadly recognized for its convenience and efficiency in mapping vegetation, particularly in high-altitude and inaccessible areas where there are lack of in-situ observations. In this study, Landsat Thematic Mapper (TM) images and Chinese environmental mitigation satellite CCD sensor (HJ-1 CCD) images, both of which are at 30m spatial resolution were employed for identifying and monitoring of vegetation types in a area of Western China——Qinghai Lake Watershed(QHLW). A decision classification tree (DCT) algorithm using multi-characteristic including seasonal TM/HJ-1 CCD time series data combined with digital elevation models (DEMs) dataset, and a supervised maximum likelihood classification (MLC) algorithm with single-data TM image were applied vegetation classification. Accuracy of the two algorithms was assessed using field observation data. Based on produced vegetation classification maps, it was found that the DCT using multi-season data and geomorphologic parameters was superior to the MLC algorithm using single-data image, improving the overall accuracy by 11.86% at second class level and significantly reducing the "salt and pepper" noise. The DCT algorithm applied to TM /HJ-1 CCD time series data geomorphologic parameters appeared as a valuable and reliable tool for monitoring vegetation at first class level (5 vegetation classes) and second class level(8 vegetation subclasses). The DCT algorithm using multi-characteristic might provide a theoretical basis and general approach to automatic extraction of vegetation types from remote sensing imagery over plateau areas.
NASA Astrophysics Data System (ADS)
Zhu, Ying; Tan, Tuck Lee
2016-04-01
An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet),using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using full wavelength. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects.
An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices
ERIC Educational Resources Information Center
Wyse, Adam E.; Hao, Shiqi
2012-01-01
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Spatial-temporal discriminant analysis for ERP-based brain-computer interface.
Zhang, Yu; Zhou, Guoxu; Zhao, Qibin; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
2013-03-01
Linear discriminant analysis (LDA) has been widely adopted to classify event-related potential (ERP) in brain-computer interface (BCI). Good classification performance of the ERP-based BCI usually requires sufficient data recordings for effective training of the LDA classifier, and hence a long system calibration time which however may depress the system practicability and cause the users resistance to the BCI system. In this study, we introduce a spatial-temporal discriminant analysis (STDA) to ERP classification. As a multiway extension of the LDA, the STDA method tries to maximize the discriminant information between target and nontarget classes through finding two projection matrices from spatial and temporal dimensions collaboratively, which reduces effectively the feature dimensionality in the discriminant analysis, and hence decreases significantly the number of required training samples. The proposed STDA method was validated with dataset II of the BCI Competition III and dataset recorded from our own experiments, and compared to the state-of-the-art algorithms for ERP classification. Online experiments were additionally implemented for the validation. The superior classification performance in using few training samples shows that the STDA is effective to reduce the system calibration time and improve the classification accuracy, thereby enhancing the practicability of ERP-based BCI.
Kim, Ki Wan; Hong, Hyung Gil; Nam, Gi Pyo; Park, Kang Ryoung
2017-06-30
The necessity for the classification of open and closed eyes is increasing in various fields, including analysis of eye fatigue in 3D TVs, analysis of the psychological states of test subjects, and eye status tracking-based driver drowsiness detection. Previous studies have used various methods to distinguish between open and closed eyes, such as classifiers based on the features obtained from image binarization, edge operators, or texture analysis. However, when it comes to eye images with different lighting conditions and resolutions, it can be difficult to find an optimal threshold for image binarization or optimal filters for edge and texture extraction. In order to address this issue, we propose a method to classify open and closed eye images with different conditions, acquired by a visible light camera, using a deep residual convolutional neural network. After conducting performance analysis on both self-collected and open databases, we have determined that the classification accuracy of the proposed method is superior to that of existing methods.
Gatos, Ilias; Tsantis, Stavros; Spiliopoulos, Stavros; Karnabatidis, Dimitris; Theotokas, Ioannis; Zoumpoulis, Pavlos; Loupas, Thanasis; Hazle, John D; Kagadis, George C
2016-03-01
Classify chronic liver disease (CLD) from ultrasound shear-wave elastography (SWE) imaging by means of a computer aided diagnosis (CAD) system. The proposed algorithm employs an inverse mapping technique (red-green-blue to stiffness) to quantify 85 SWE images (54 healthy and 31 with CLD). Texture analysis is then applied involving the automatic calculation of 330 first and second order textural features from every transformed stiffness value map to determine functional features that characterize liver elasticity and describe liver condition for all available stages. Consequently, a stepwise regression analysis feature selection procedure is utilized toward a reduced feature subset that is fed into the support vector machines (SVMs) classification algorithm in the design of the CAD system. With regard to the mapping procedure accuracy, the stiffness map values had an average difference of 0.01 ± 0.001 kPa compared to the quantification results derived from the color-box provided by the built-in software of the ultrasound system. Highest classification accuracy from the SVM model was 87.0% with sensitivity and specificity values of 83.3% and 89.1%, respectively. Receiver operating characteristic curves analysis gave an area under the curve value of 0.85 with [0.77-0.89] confidence interval. The proposed CAD system employing color to stiffness mapping and classification algorithms offered superior results, comparing the already published clinical studies. It could prove to be of value to physicians improving the diagnostic accuracy of CLD and can be employed as a second opinion tool for avoiding unnecessary invasive procedures.
Validity and reliability of the Paprosky acetabular defect classification.
Yu, Raymond; Hofstaetter, Jochen G; Sullivan, Thomas; Costi, Kerry; Howie, Donald W; Solomon, Lucian B
2013-07-01
The Paprosky acetabular defect classification is widely used but has not been appropriately validated. Reliability of the Paprosky system has not been evaluated in combination with standardized techniques of measurement and scoring. This study evaluated the reliability, teachability, and validity of the Paprosky acetabular defect classification. Preoperative radiographs from a random sample of 83 patients undergoing 85 acetabular revisions were classified by four observers, and their classifications were compared with quantitative intraoperative measurements. Teachability of the classification scheme was tested by dividing the four observers into two groups. The observers in Group 1 underwent three teaching sessions; those in Group 2 underwent one session and the influence of teaching on the accuracy of their classifications was ascertained. Radiographic evaluation showed statistically significant relationships with intraoperative measurements of anterior, medial, and superior acetabular defect sizes. Interobserver reliability improved substantially after teaching and did not improve without it. The weighted kappa coefficient went from 0.56 at Occasion 1 to 0.79 after three teaching sessions in Group 1 observers, and from 0.49 to 0.65 after one teaching session in Group 2 observers. The Paprosky system is valid and shows good reliability when combined with standardized definitions of radiographic landmarks and a structured analysis. Level II, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization could achieve a classification accuracy of 88.16 % in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performs better than SVM and ANN, it is more capable of handling multidimensional feature variables. The RF method combined with object-based analysis approach could highlight the classification accuracy further. The multiresolution segmentation approach on the basis of ESP scale parameter optimization was used for obtaining six scales to execute image segmentation, when the segmentation scale was 49, the classification accuracy reached the highest value of 89.58 %. The classification accuracy of object-based RF classification was 1.42 % higher than that of pixel-based classification (88.16 %), and the classification accuracy was further improved. Therefore, the RF classification method combined with object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and theoretical reference for remotely sensed monitoring land reclamation.
Nandi, Sutanu; Subramanian, Abhishek; Sarkar, Ram Rup
2017-07-25
Prediction of essential genes helps to identify a minimal set of genes that are absolutely required for the appropriate functioning and survival of a cell. The available machine learning techniques for essential gene prediction have inherent problems, like imbalanced provision of training datasets, biased choice of the best model for a given balanced dataset, choice of a complex machine learning algorithm, and data-based automated selection of biologically relevant features for classification. Here, we propose a simple support vector machine-based learning strategy for the prediction of essential genes in Escherichia coli K-12 MG1655 metabolism that integrates a non-conventional combination of an appropriate sample balanced training set, a unique organism-specific genotype, phenotype attributes that characterize essential genes, and optimal parameters of the learning algorithm to generate the best machine learning model (the model with the highest accuracy among all the models trained for different sample training sets). For the first time, we also introduce flux-coupled metabolic subnetwork-based features for enhancing the classification performance. Our strategy proves to be superior as compared to previous SVM-based strategies in obtaining a biologically relevant classification of genes with high sensitivity and specificity. This methodology was also trained with datasets of other recent supervised classification techniques for essential gene classification and tested using reported test datasets. The testing accuracy was always high as compared to the known techniques, proving that our method outperforms known methods. Observations from our study indicate that essential genes are conserved among homologous bacterial species, demonstrate high codon usage bias, GC content and gene expression, and predominantly possess a tendency to form physiological flux modules in metabolism.
Comparison of LSS-IV and LISS-III+LISS-IV merged data for classification of crops
NASA Astrophysics Data System (ADS)
Hebbar, R.; Sesha Sai, M. V. R.
2014-11-01
Resourcesat-1 satellite with its unique capability of simultaneous acquisition of multispectral images at different spatial resolutions (AWiFS, LISS-III and LISS-IV MX / Mono) has immense potential for crop inventory. The present study was carried for selection of suitable LISS-IV MX band for data fusion and its evaluation for delineation different crops in a multi-cropped area. Image fusion techniques namely intensity hue saturation (IHS), principal component analysis (PCA), brovey, high pass filter (HPF) and wavelet methods were used for merging LISS-III and LISS-IV Mono data. The merged products were evaluated visually and through universal image quality index, ERGAS and classification accuracy. The study revealed that red band of LISS-IV MX data was found to be optimal band for merging with LISS-III data in terms of maintaining both spectral and spatial information and thus, closely matching with multispectral LISS-IVMX data. Among the five data fusion techniques, wavelet method was found to be superior in retaining image quality and higher classification accuracy compared to commonly used methods of IHS, PCA and Brovey. The study indicated that LISS-IV data in mono mode with wider swath of 70 km could be exploited in place of 24km LISS-IVMX data by selection of appropriate fusion techniques by acquiring monochromatic data in the red band.
NASA Astrophysics Data System (ADS)
Schmalz, M.; Ritter, G.
Accurate multispectral or hyperspectral signature classification is key to the nonimaging detection and recognition of space objects. Additionally, signature classification accuracy depends on accurate spectral endmember determination [1]. Previous approaches to endmember computation and signature classification were based on linear operators or neural networks (NNs) expressed in terms of the algebra (R, +, x) [1,2]. Unfortunately, class separation in these methods tends to be suboptimal, and the number of signatures that can be accurately classified often depends linearly on the number of NN inputs. This can lead to poor endmember distinction, as well as potentially significant classification errors in the presence of noise or densely interleaved signatures. In contrast to traditional CNNs, autoassociative morphological memories (AMM) are a construct similar to Hopfield autoassociatived memories defined on the (R, +, ?,?) lattice algebra [3]. Unlimited storage and perfect recall of noiseless real valued patterns has been proven for AMMs [4]. However, AMMs suffer from sensitivity to specific noise models, that can be characterized as erosive and dilative noise. On the other hand, the prior definition of a set of endmembers corresponds to material spectra lying on vertices of the minimum convex region covering the image data. These vertices can be characterized as morphologically independent patterns. It has further been shown that AMMs can be based on dendritic computation [3,6]. These techniques yield improved accuracy and class segmentation/separation ability in the presence of highly interleaved signature data. In this paper, we present a procedure for endmember determination based on AMM noise sensitivity, which employs morphological dendritic computation. We show that detected endmembers can be exploited by AMM based classification techniques, to achieve accurate signature classification in the presence of noise, closely spaced or interleaved signatures, and simulated camera optical distortions. In particular, we examine two critical cases: (1) classification of multiple closely spaced signatures that are difficult to separate using distance measures, and (2) classification of materials in simulated hyperspectral images of spaceborne satellites. In each case, test data are derived from a NASA database of space material signatures. Additional analysis pertains to computational complexity and noise sensitivity, which are superior to classical NN based techniques.
Zhu, Ying; Tan, Tuck Lee
2016-04-15
An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet),using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using full wavelength. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects. Copyright © 2016 Elsevier B.V. All rights reserved.
Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672
A robust sparse-modeling framework for estimating schizophrenia biomarkers from fMRI.
Dillon, Keith; Calhoun, Vince; Wang, Yu-Ping
2017-01-30
Our goal is to identify the brain regions most relevant to mental illness using neuroimaging. State of the art machine learning methods commonly suffer from repeatability difficulties in this application, particularly when using large and heterogeneous populations for samples. We revisit both dimensionality reduction and sparse modeling, and recast them in a common optimization-based framework. This allows us to combine the benefits of both types of methods in an approach which we call unambiguous components. We use this to estimate the image component with a constrained variability, which is best correlated with the unknown disease mechanism. We apply the method to the estimation of neuroimaging biomarkers for schizophrenia, using task fMRI data from a large multi-site study. The proposed approach yields an improvement in both robustness of the estimate and classification accuracy. We find that unambiguous components incorporate roughly two thirds of the same brain regions as sparsity-based methods LASSO and elastic net, while roughly one third of the selected regions differ. Further, unambiguous components achieve superior classification accuracy in differentiating cases from controls. Unambiguous components provide a robust way to estimate important regions of imaging data. Copyright © 2016 Elsevier B.V. All rights reserved.
Multichannel interictal spike activity detection using time-frequency entropy measure.
Thanaraj, Palani; Parvathavarthini, B
2017-06-01
Localization of interictal spikes is an important clinical step in the pre-surgical assessment of pharmacoresistant epileptic patients. The manual selection of interictal spike periods is cumbersome and involves a considerable amount of analysis workload for the physician. The primary focus of this paper is to automate the detection of interictal spikes for clinical applications in epilepsy localization. The epilepsy localization procedure involves detection of spikes in a multichannel EEG epoch. Therefore, a multichannel Time-Frequency (T-F) entropy measure is proposed to extract features related to the interictal spike activity. Least squares support vector machine is used to train the proposed feature to classify the EEG epochs as either normal or interictal spike period. The proposed T-F entropy measure, when validated with epilepsy dataset of 15 patients, shows an interictal spike classification accuracy of 91.20%, sensitivity of 100% and specificity of 84.23%. Moreover, the area under the curve of Receiver Operating Characteristics plot of 0.9339 shows the superior classification performance of the proposed T-F entropy measure. The results of this paper show a good spike detection accuracy without any prior information about the spike morphology.
Predicting aged pork quality using a portable Raman device.
Santos, C C; Zhao, J; Dong, X; Lonergan, S M; Huff-Lonergan, E; Outhouse, A; Carlson, K B; Prusa, K J; Fedler, C A; Yu, C; Shackelford, S D; King, D A; Wheeler, T L
2018-05-29
The utility of Raman spectroscopic signatures of fresh pork loin (1 d & 15 d postmortem) in predicting fresh pork tenderness and slice shear force (SSF) was determined. Partial least square models showed that sensory tenderness and SSF are weakly correlated (R 2 = 0.2). Raman spectral data were collected in 6 s using a portable Raman spectrometer (RS). A PLS regression model was developed to predict quantitatively the tenderness scores and SSF values from Raman spectral data, with very limited success. It was discovered that the prediction accuracies for day 15 post mortem samples are significantly greater than that for day 1 postmortem samples. Classification models were developed to predict tenderness at two ends of sensory quality as "poor" vs. "good". The accuracies of classification into different quality categories (1st to 4th percentile) are also greater for the day 15 postmortem samples for sensory tenderness (93.5% vs 76.3%) and SSF (92.8% vs 76.1%). RS has the potential to become a rapid on-line screening tool for the pork producers to quickly select meats with superior quality and/or cull poor quality to meet market demand/expectations. Copyright © 2018 Elsevier Ltd. All rights reserved.
Shi, Jun; Liu, Xiao; Li, Yan; Zhang, Qi; Li, Yingjie; Ying, Shihui
2015-10-30
Electroencephalography (EEG) based sleep staging is commonly used in clinical routine. Feature extraction and representation plays a crucial role in EEG-based automatic classification of sleep stages. Sparse representation (SR) is a state-of-the-art unsupervised feature learning method suitable for EEG feature representation. Collaborative representation (CR) is an effective data coding method used as a classifier. Here we use CR as a data representation method to learn features from the EEG signal. A joint collaboration model is established to develop a multi-view learning algorithm, and generate joint CR (JCR) codes to fuse and represent multi-channel EEG signals. A two-stage multi-view learning-based sleep staging framework is then constructed, in which JCR and joint sparse representation (JSR) algorithms first fuse and learning the feature representation from multi-channel EEG signals, respectively. Multi-view JCR and JSR features are then integrated and sleep stages recognized by a multiple kernel extreme learning machine (MK-ELM) algorithm with grid search. The proposed two-stage multi-view learning algorithm achieves superior performance for sleep staging. With a K-means clustering based dictionary, the mean classification accuracy, sensitivity and specificity are 81.10 ± 0.15%, 71.42 ± 0.66% and 94.57 ± 0.07%, respectively; while with the dictionary learned using the submodular optimization method, they are 80.29 ± 0.22%, 71.26 ± 0.78% and 94.38 ± 0.10%, respectively. The two-stage multi-view learning based sleep staging framework outperforms all other classification methods compared in this work, while JCR is superior to JSR. The proposed multi-view learning framework has the potential for sleep staging based on multi-channel or multi-modality polysomnography signals. Copyright © 2015 Elsevier B.V. All rights reserved.
Aggregation of Sentinel-2 time series classifications as a solution for multitemporal analysis
NASA Astrophysics Data System (ADS)
Lewiński, Stanislaw; Nowakowski, Artur; Malinowski, Radek; Rybicki, Marcin; Kukawska, Ewa; Krupiński, Michał
2017-10-01
The general aim of this work was to elaborate efficient and reliable aggregation method that could be used for creating a land cover map at a global scale from multitemporal satellite imagery. The study described in this paper presents methods for combining results of land cover/land use classifications performed on single-date Sentinel-2 images acquired at different time periods. For that purpose different aggregation methods were proposed and tested on study sites spread on different continents. The initial classifications were performed with Random Forest classifier on individual Sentinel-2 images from a time series. In the following step the resulting land cover maps were aggregated pixel by pixel using three different combinations of information on the number of occurrences of a certain land cover class within a time series and the posterior probability of particular classes resulting from the Random Forest classification. From the proposed methods two are shown superior and in most cases were able to reach or outperform the accuracy of the best individual classifications of single-date images. Moreover, the aggregations results are very stable when used on data with varying cloudiness. They also enable to reduce considerably the number of cloudy pixels in the resulting land cover map what is significant advantage for mapping areas with frequent cloud coverage.
Visual brain activity patterns classification with simultaneous EEG-fMRI: A multimodal approach.
Ahmad, Rana Fayyaz; Malik, Aamir Saeed; Kamel, Nidal; Reza, Faruque; Amin, Hafeez Ullah; Hussain, Muhammad
2017-01-01
Classification of the visual information from the brain activity data is a challenging task. Many studies reported in the literature are based on the brain activity patterns using either fMRI or EEG/MEG only. EEG and fMRI considered as two complementary neuroimaging modalities in terms of their temporal and spatial resolution to map the brain activity. For getting a high spatial and temporal resolution of the brain at the same time, simultaneous EEG-fMRI seems to be fruitful. In this article, we propose a new method based on simultaneous EEG-fMRI data and machine learning approach to classify the visual brain activity patterns. We acquired EEG-fMRI data simultaneously on the ten healthy human participants by showing them visual stimuli. Data fusion approach is used to merge EEG and fMRI data. Machine learning classifier is used for the classification purposes. Results showed that superior classification performance has been achieved with simultaneous EEG-fMRI data as compared to the EEG and fMRI data standalone. This shows that multimodal approach improved the classification accuracy results as compared with other approaches reported in the literature. The proposed simultaneous EEG-fMRI approach for classifying the brain activity patterns can be helpful to predict or fully decode the brain activity patterns.
Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun
2016-04-01
This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR) classifier, k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB) as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. The six substances considered, although have similar optical properties, their complex insertion loss at the THz part of the spectrum is significantly different because of differences in both their frequency dependent THz extinction coefficient as well as differences in their refractive index and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms, these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. Firstly the measurements related to samples that had thicknesses of 2mm were classified, then samples at thicknesses of 4mm, and after that 3mm were classified and the success rate and consistency of each classifier was recorded. In addition, mixtures having thicknesses of 2 and 4mm as well as mixtures of 2, 3 and 4mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies. The work establishes a general methodology for assessing the performance of other hyperspectral dataset classifiers on the basis of 2-D cross-correlations in far-infrared spectroscopy or other parts of the electromagnetic spectrum. It also advances the wider proliferation of automated THz imaging systems across new application areas e.g., biomedical imaging, industrial processing and quality control where interpretation of hyperspectral images is still under development. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Wang, Shuaiqun; Aorigele; Kong, Wei; Zeng, Weiming; Hong, Xiaomin
2016-01-01
Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features over a large number of gene expression data. Lately, many researchers devote themselves to feature selection using diverse computational intelligence methods. However, in the progress of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm HICATS incorporating imperialist competition algorithm (ICA) which performs global search and tabu search (TS) that conducts fine-tuned search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works including the conventional version of binary optimization algorithm in terms of classification accuracy and the number of selected genes.
Aorigele; Zeng, Weiming; Hong, Xiaomin
2016-01-01
Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features over a large number of gene expression data. Lately, many researchers devote themselves to feature selection using diverse computational intelligence methods. However, in the progress of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm HICATS incorporating imperialist competition algorithm (ICA) which performs global search and tabu search (TS) that conducts fine-tuned search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works including the conventional version of binary optimization algorithm in terms of classification accuracy and the number of selected genes. PMID:27579323
2013-01-01
Background Gene expression data could likely be a momentous help in the progress of proficient cancer diagnoses and classification platforms. Lately, many researchers analyze gene expression data using diverse computational intelligence methods, for selecting a small subset of informative genes from the data for cancer classification. Many computational methods face difficulties in selecting small subsets due to the small number of samples compared to the huge number of genes (high-dimension), irrelevant genes, and noisy genes. Methods We propose an enhanced binary particle swarm optimization to perform the selection of small subsets of informative genes which is significant for cancer classification. Particle speed, rule, and modified sigmoid function are introduced in this proposed method to increase the probability of the bits in a particle’s position to be zero. The method was empirically applied to a suite of ten well-known benchmark gene expression data sets. Results The performance of the proposed method proved to be superior to other previous related works, including the conventional version of binary particle swarm optimization (BPSO) in terms of classification accuracy and the number of selected genes. The proposed method also requires lower computational time compared to BPSO. PMID:23617960
Ma, Xiang; Schonfeld, Dan; Khokhar, Ashfaq A
2009-06-01
In this paper, we propose a novel solution to an arbitrary noncausal, multidimensional hidden Markov model (HMM) for image and video classification. First, we show that the noncausal model can be solved by splitting it into multiple causal HMMs and simultaneously solving each causal HMM using a fully synchronous distributed computing framework, therefore referred to as distributed HMMs. Next we present an approximate solution to the multiple causal HMMs that is based on an alternating updating scheme and assumes a realistic sequential computing framework. The parameters of the distributed causal HMMs are estimated by extending the classical 1-D training and classification algorithms to multiple dimensions. The proposed extension to arbitrary causal, multidimensional HMMs allows state transitions that are dependent on all causal neighbors. We, thus, extend three fundamental algorithms to multidimensional causal systems, i.e., 1) expectation-maximization (EM), 2) general forward-backward (GFB), and 3) Viterbi algorithms. In the simulations, we choose to limit ourselves to a noncausal 2-D model whose noncausality is along a single dimension, in order to significantly reduce the computational complexity. Simulation results demonstrate the superior performance, higher accuracy rate, and applicability of the proposed noncausal HMM framework to image and video classification.
A novel single neuron perceptron with universal approximation and XOR computation properties.
Lotfi, Ehsan; Akbarzadeh-T, M-R
2014-01-01
We propose a biologically motivated brain-inspired single neuron perceptron (SNP) with universal approximation and XOR computation properties. This computational model extends the input pattern and is based on the excitatory and inhibitory learning rules inspired from neural connections in the human brain's nervous system. The resulting architecture of SNP can be trained by supervised excitatory and inhibitory online learning rules. The main features of proposed single layer perceptron are universal approximation property and low computational complexity. The method is tested on 6 UCI (University of California, Irvine) pattern recognition and classification datasets. Various comparisons with multilayer perceptron (MLP) with gradient decent backpropagation (GDBP) learning algorithm indicate the superiority of the approach in terms of higher accuracy, lower time, and spatial complexity, as well as faster training. Hence, we believe the proposed approach can be generally applicable to various problems such as in pattern recognition and classification.
New Dandelion Algorithm Optimizes Extreme Learning Machine for Biomedical Classification Problems
Li, Xiguang; Zhao, Liang; Gong, Changqing; Liu, Xiaojing
2017-01-01
Inspired by the behavior of dandelion sowing, a new novel swarm intelligence algorithm, namely, dandelion algorithm (DA), is proposed for global optimization of complex functions in this paper. In DA, the dandelion population will be divided into two subpopulations, and different subpopulations will undergo different sowing behaviors. Moreover, another sowing method is designed to jump out of local optimum. In order to demonstrate the validation of DA, we compare the proposed algorithm with other existing algorithms, including bat algorithm, particle swarm optimization, and enhanced fireworks algorithm. Simulations show that the proposed algorithm seems much superior to other algorithms. At the same time, the proposed algorithm can be applied to optimize extreme learning machine (ELM) for biomedical classification problems, and the effect is considerable. At last, we use different fusion methods to form different fusion classifiers, and the fusion classifiers can achieve higher accuracy and better stability to some extent. PMID:29085425
Jongin Kim; Boreom Lee
2017-07-01
The classification of neuroimaging data for the diagnosis of Alzheimer's Disease (AD) is one of the main research goals of the neuroscience and clinical fields. In this study, we performed extreme learning machine (ELM) classifier to discriminate the AD, mild cognitive impairment (MCI) from normal control (NC). We compared the performance of ELM with that of a linear kernel support vector machine (SVM) for 718 structural MRI images from Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The data consisted of normal control, MCI converter (MCI-C), MCI non-converter (MCI-NC), and AD. We employed SVM-based recursive feature elimination (RFE-SVM) algorithm to find the optimal subset of features. In this study, we found that the RFE-SVM feature selection approach in combination with ELM shows the superior classification accuracy to that of linear kernel SVM for structural T1 MRI data.
Image quality classification for DR screening using deep learning.
FengLi Yu; Jing Sun; Annan Li; Jun Cheng; Cheng Wan; Jiang Liu
2017-07-01
The quality of input images significantly affects the outcome of automated diabetic retinopathy (DR) screening systems. Unlike the previous methods that only consider simple low-level features such as hand-crafted geometric and structural features, in this paper we propose a novel method for retinal image quality classification (IQC) that performs computational algorithms imitating the working of the human visual system. The proposed algorithm combines unsupervised features from saliency map and supervised features coming from convolutional neural networks (CNN), which are fed to an SVM to automatically detect high quality vs poor quality retinal fundus images. We demonstrate the superior performance of our proposed algorithm on a large retinal fundus image dataset and the method could achieve higher accuracy than other methods. Although retinal images are used in this study, the methodology is applicable to the image quality assessment and enhancement of other types of medical images.
Variance approximations for assessments of classification accuracy
R. L. Czaplewski
1994-01-01
Variance approximations are derived for the weighted and unweighted kappa statistics, the conditional kappa statistic, and conditional probabilities. These statistics are useful to assess classification accuracy, such as accuracy of remotely sensed classifications in thematic maps when compared to a sample of reference classifications made in the field. Published...
NASA Astrophysics Data System (ADS)
Ye, Su; Chen, Dongmei; Yu, Jie
2016-04-01
In remote sensing, conventional supervised change-detection methods usually require effective training data for multiple change types. This paper introduces a more flexible and efficient procedure that seeks to identify only the changes that users are interested in, here after referred to as "targeted change detection". Based on a one-class classifier "Support Vector Domain Description (SVDD)", a novel algorithm named "Three-layer SVDD Fusion (TLSF)" is developed specially for targeted change detection. The proposed algorithm combines one-class classification generated from change vector maps, as well as before- and after-change images in order to get a more reliable detecting result. In addition, this paper introduces a detailed workflow for implementing this algorithm. This workflow has been applied to two case studies with different practical monitoring objectives: urban expansion and forest fire assessment. The experiment results of these two case studies show that the overall accuracy of our proposed algorithm is superior (Kappa statistics are 86.3% and 87.8% for Case 1 and 2, respectively), compared to applying SVDD to change vector analysis and post-classification comparison.
A Study of Deep CNN-Based Classification of Open and Closed Eyes Using a Visible Light Camera Sensor
Kim, Ki Wan; Hong, Hyung Gil; Nam, Gi Pyo; Park, Kang Ryoung
2017-01-01
The necessity for the classification of open and closed eyes is increasing in various fields, including analysis of eye fatigue in 3D TVs, analysis of the psychological states of test subjects, and eye status tracking-based driver drowsiness detection. Previous studies have used various methods to distinguish between open and closed eyes, such as classifiers based on the features obtained from image binarization, edge operators, or texture analysis. However, when it comes to eye images with different lighting conditions and resolutions, it can be difficult to find an optimal threshold for image binarization or optimal filters for edge and texture extraction. In order to address this issue, we propose a method to classify open and closed eye images with different conditions, acquired by a visible light camera, using a deep residual convolutional neural network. After conducting performance analysis on both self-collected and open databases, we have determined that the classification accuracy of the proposed method is superior to that of existing methods. PMID:28665361
Cai, Suxian; Yang, Shanshan; Zheng, Fang; Lu, Meng; Wu, Yunfeng; Krishnan, Sridhar
2013-01-01
Analysis of knee joint vibration (VAG) signals can provide quantitative indices for detection of knee joint pathology at an early stage. In addition to the statistical features developed in the related previous studies, we extracted two separable features, that is, the number of atoms derived from the wavelet matching pursuit decomposition and the number of significant signal turns detected with the fixed threshold in the time domain. To perform a better classification over the data set of 89 VAG signals, we applied a novel classifier fusion system based on the dynamic weighted fusion (DWF) method to ameliorate the classification performance. For comparison, a single leastsquares support vector machine (LS-SVM) and the Bagging ensemble were used for the classification task as well. The results in terms of overall accuracy in percentage and area under the receiver operating characteristic curve obtained with the DWF-based classifier fusion method reached 88.76% and 0.9515, respectively, which demonstrated the effectiveness and superiority of the DWF method with two distinct features for the VAG signal analysis. PMID:23573175
Cates, Benjamin; Sim, Taeyong; Heo, Hyun Mu; Kim, Bori; Kim, Hyunggun; Mun, Joung Hwan
2018-01-01
In order to overcome the current limitations in current threshold-based and machine learning-based fall detectors, an insole system and novel fall classification model were created. Because high-acceleration activities have a high risk for falls, and because of the potential damage that is associated with falls during high-acceleration activities, four low-acceleration activities, four high-acceleration activities, and eight types of high-acceleration falls were performed by twenty young male subjects. Encompassing a total of 800 falls and 320 min of activities of daily life (ADLs), the created Support Vector Machine model’s Leave-One-Out cross-validation provides a fall detection sensitivity (0.996), specificity (1.000), and accuracy (0.999). These classification results are similar or superior to other fall detection models in the literature, while also including high-acceleration ADLs to challenge the classification model, and simultaneously reducing the burden that is associated with wearable sensors and increasing user comfort by inserting the insole system into the shoe. PMID:29673165
Ramsey, Elijah W.; Nelson, Gene A.; Sapkota, Sijan
1998-01-01
A progressive classification of a marsh and forest system using Landsat Thematic Mapper (TM), color infrared (CIR) photograph, and ERS-1 synthetic aperture radar (SAR) data improved classification accuracy when compared to classification using solely TM reflective band data. The classification resulted in a detailed identification of differences within a nearly monotypic black needlerush marsh. Accuracy percentages of these classes were surprisingly high given the complexities of classification. The detailed classification resulted in a more accurate portrayal of the marsh transgressive sequence than was obtainable with TM data alone. Individual sensor contribution to the improved classification was compared to that using only the six reflective TM bands. Individually, the green reflective CIR and SAR data identified broad categories of water, marsh, and forest. In combination with TM, SAR and the green CIR band each improved overall accuracy by about 3% and 15% respectively. The SAR data improved the TM classification accuracy mostly in the marsh classes. The green CIR data also improved the marsh classification accuracy and accuracies in some water classes. The final combination of all sensor data improved almost all class accuracies from 2% to 70% with an overall improvement of about 20% over TM data alone. Not only was the identification of vegetation types improved, but the spatial detail of the classification approached 10 m in some areas.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
ERIC Educational Resources Information Center
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang
2015-01-01
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
NASA Astrophysics Data System (ADS)
Lemma, Hanibal; Frankl, Amaury; Poesen, Jean; Adgo, Enyew; Nyssen, Jan
2017-04-01
Object-oriented image classification has been gaining prominence in the field of remote sensing and provides a valid alternative to the 'traditional' pixel based methods. Recent studies have proven the superiority of the object-based approach. So far, object-oriented land cover classifications have been applied either at limited spatial coverages (ranging 2 to 1091 km2) or by using very high resolution (0.5-16 m) imageries. The main aim of this study is to drive land cover information for large area from Landsat 8 OLI surface reflectance using the Estimation of Scale Parameter (ESP) tool and the object oriented software eCognition. The available land cover map of Lake Tana Basin (Ethiopia) is about 20 years old with a courser spatial scale (1:250,000) and has limited use for environmental modelling and monitoring studies. Up-to-date and basin wide land cover maps are essential to overcome haphazard natural resources management, land degradation and reduced agricultural production. Indeed, object-oriented approach involves image segmentation prior to classification, i.e. adjacent similar pixels are aggregated into segments as long as the heterogeneity in the spectral and spatial domains is minimized. For each segmented object, different attributes (spectral, textural and shape) were calculated and used for in subsequent classification analysis. Moreover, the commonly used error matrix is employed to determine the quality of the land cover map. As a result, the multiresolution segmentation (with parameters of scale=30, shape=0.3 and Compactness=0.7) produces highly homogeneous image objects as it is observed in different sample locations in google earth. Out of the 15,089 km2 area of the basin, cultivated land is dominant (69%) followed by water bodies (21%), grassland (4.8%), forest (3.7%) and shrubs (1.1%). Wetlands, artificial surfaces and bare land cover only about 1% of the basin. The overall classification accuracy is 80% with a Kappa coefficient of 0.75. With regard to individual classes, the classification show higher Producer's and User's accuracy (above 84%) for cultivated land, water bodies and forest, but lower (less than 70%) for shrubs, bare land and grassland. Key words: accuracy assessment, eCognition, Estimation of Scale Parameter, land cover, Landsat 8, remote sensing
Burneo-Garcés, Carlos; Marín-Morales, Agar; Pérez-García, Miguel
2018-01-01
The ability of a wide range of psychological and actuarial measures to characterize crimes in the prison population has not yet been compared in a single study. Our main objective was to determine if the discriminant capacity of psychological measures (PM) and actuarial data (AD) varies according to the crime. An Ecuadorian sample of 576 men convicted of Robbery, Murder, Rape and Drug Possession crimes was evaluated through an ad hoc questionnaire, prison files and the Spanish adaptation of the Personality Assessment Inventory. Discriminant analysis was used to establish, for each crime, the discriminant capacity and the classification accuracy of a model composed of AD (socio-demographic and judicial measures) and a second model incorporating PM. The AD showed a superior discriminant capacity, whilst the contribution of both types of measures varied according to the crime. The PM generated some increase in the correct classification percentages for Murder, Rape and Drug Possession, but their contribution was zero for the crime of Robbery. Specific profiles of each crime were obtained from the strongest significant correlations between the value of each explanatory variable and the probability of belonging to the crime. The AD model is more robust when these four crimes are characterized. The contribution of AD and PM depends on the crime, and the inclusion of PM in actuarial models moderately optimizes the classification accuracy of Murder, Rape, and Drug Possession crimes. PMID:29874264
Decision Making Based on Fuzzy Aggregation Operators for Medical Diagnosis from Dental X-ray images.
Ngan, Tran Thi; Tuan, Tran Manh; Son, Le Hoang; Minh, Nguyen Hai; Dey, Nilanjan
2016-12-01
Medical diagnosis is considered as an important step in dentistry treatment which assists clinicians to give their decision about diseases of a patient. It has been affirmed that the accuracy of medical diagnosis, which is much influenced by the clinicians' experience and knowledge, plays an important role to effective treatment therapies. In this paper, we propose a novel decision making method based on fuzzy aggregation operators for medical diagnosis from dental X-Ray images. It firstly divides a dental X-Ray image into some segments and identified equivalent diseases by a classification method called Affinity Propagation Clustering (APC+). Lastly, the most potential disease is found using fuzzy aggregation operators. The experimental validation on real dental datasets of Hanoi Medical University Hospital, Vietnam showed the superiority of the proposed method against the relevant ones in terms of accuracy.
[Accuracy improvement of spectral classification of crop using microwave backscatter data].
Jia, Kun; Li, Qiang-Zi; Tian, Yi-Chen; Wu, Bing-Fang; Zhang, Fei-Fei; Meng, Ji-Hua
2011-02-01
In the present study, VV polarization microwave backscatter data used for improving accuracies of spectral classification of crop is investigated. Classification accuracy using different classifiers based on the fusion data of HJ satellite multi-spectral and Envisat ASAR VV backscatter data are compared. The results indicate that fusion data can take full advantage of spectral information of HJ multi-spectral data and the structure sensitivity feature of ASAR VV polarization data. The fusion data enlarges the spectral difference among different classifications and improves crop classification accuracy. The classification accuracy using fusion data can be increased by 5 percent compared to the single HJ data. Furthermore, ASAR VV polarization data is sensitive to non-agrarian area of planted field, and VV polarization data joined classification can effectively distinguish the field border. VV polarization data associating with multi-spectral data used in crop classification enlarges the application of satellite data and has the potential of spread in the domain of agriculture.
Raghu, S; Sriraam, N; Kumar, G Pradeep
2017-02-01
Electroencephalogram shortly termed as EEG is considered as the fundamental segment for the assessment of the neural activities in the brain. In cognitive neuroscience domain, EEG-based assessment method is found to be superior due to its non-invasive ability to detect deep brain structure while exhibiting superior spatial resolutions. Especially for studying the neurodynamic behavior of epileptic seizures, EEG recordings reflect the neuronal activity of the brain and thus provide required clinical diagnostic information for the neurologist. This specific proposed study makes use of wavelet packet based log and norm entropies with a recurrent Elman neural network (REN) for the automated detection of epileptic seizures. Three conditions, normal, pre-ictal and epileptic EEG recordings were considered for the proposed study. An adaptive Weiner filter was initially applied to remove the power line noise of 50 Hz from raw EEG recordings. Raw EEGs were segmented into 1 s patterns to ensure stationarity of the signal. Then wavelet packet using Haar wavelet with a five level decomposition was introduced and two entropies, log and norm were estimated and were applied to REN classifier to perform binary classification. The non-linear Wilcoxon statistical test was applied to observe the variation in the features under these conditions. The effect of log energy entropy (without wavelets) was also studied. It was found from the simulation results that the wavelet packet log entropy with REN classifier yielded a classification accuracy of 99.70 % for normal-pre-ictal, 99.70 % for normal-epileptic and 99.85 % for pre-ictal-epileptic.
NASA Technical Reports Server (NTRS)
Justice, C.; Townshend, J. (Principal Investigator)
1981-01-01
Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. By on procedure the entire area is clustered and by the other a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation is provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data and the monocluster block approach gave higher accuracies than clustering the entire area. The moncluster block approach was additionally the most economical in terms of computing time.
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively.
Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?
NASA Astrophysics Data System (ADS)
Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof
2016-10-01
It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempts to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability definition of segmentation evaluation metrics as indicators of land cover classification accuracy was based on the linear correlation between the two. All unsupervised metrics that are not based on number of segments have a very strong correlation with all classification measures and are therefore reliable as indicators of land cover classification accuracy. On the other hand, correlation at supervised metrics is dependent on so many factors that it cannot be trusted as a reliable classification quality indicator. Algorithms for land cover classification studied in this paper are widely used; therefore, presented results are applicable to a wider area.
Leighton, Angela; Weinborn, Michael; Maybery, Murray
2014-10-01
Bigler (2012) and Larrabee (2012) recently addressed the state of the science surrounding performance validity tests (PVTs) in a dialogue highlighting evidence for the valid and increased use of PVTs, but also for unresolved problems. Specifically, Bigler criticized the lack of guidance from neurocognitive processing theory in the PVT literature. For example, individual PVTs have applied the simultaneous forced-choice methodology using a variety of test characteristics (e.g., word vs. picture stimuli) with known neurocognitive processing implications (e.g., the "picture superiority effect"). However, the influence of such variations on classification accuracy has been inadequately evaluated, particularly among cognitively impaired individuals. The current review places the PVT literature in the context of neurocognitive processing theory, and identifies potential methodological factors to account for the significant variability we identified in classification accuracy across current PVTs. We subsequently evaluated the utility of a well-known cognitive manipulation to provide a Clinical Analogue Methodology (CAM), that is, to alter the PVT performance of healthy individuals to be similar to that of a cognitively impaired group. Initial support was found, suggesting the CAM may be useful alongside other approaches (analogue malingering methodology) for the systematic evaluation of PVTs, particularly the influence of specific neurocognitive processing components on performance.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery.
Li, Guiying; Lu, Dengsheng; Moran, Emilio; Hetrick, Scott
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms - maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes.
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery
LI, GUIYING; LU, DENGSHENG; MORAN, EMILIO; HETRICK, SCOTT
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms – maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes. PMID:22368311
Sedykh, Alexander; Zhu, Hao; Tang, Hao; Zhang, Liying; Richard, Ann; Rusyn, Ivan; Tropsha, Alexander
2011-01-01
Background Quantitative high-throughput screening (qHTS) assays are increasingly being used to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the National Institutes of Health Chemical Genomics Center. Objectives Our goal was to test a hypothesis that dose–response data points of the qHTS assays can serve as biological descriptors of assayed chemicals and, when combined with conventional chemical descriptors, improve the accuracy of quantitative structure–activity relationship (QSAR) models applied to prediction of in vivo toxicity end points. Methods We obtained cell viability qHTS concentration–response data for 1,408 substances assayed in 13 cell lines from PubChem; for a subset of these compounds, rodent acute toxicity half-maximal lethal dose (LD50) data were also available. We used the k nearest neighbor classification and random forest QSAR methods to model LD50 data using chemical descriptors either alone (conventional models) or combined with biological descriptors derived from the concentration–response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat qHTS data. Results Both the external classification accuracy and coverage (i.e., fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models were superior to conventional models. Conclusions Concentration–response qHTS data may serve as informative biological descriptors of molecules that, when combined with conventional chemical descriptors, may considerably improve the accuracy and utility of computational approaches for predicting in vivo animal toxicity end points. PMID:20980217
An adaptive deep Q-learning strategy for handwritten digit recognition.
Qiao, Junfei; Wang, Gongming; Li, Wenjing; Chen, Min
2018-02-22
Handwritten digits recognition is a challenging problem in recent years. Although many deep learning-based classification algorithms are studied for handwritten digits recognition, the recognition accuracy and running time still need to be further improved. In this paper, an adaptive deep Q-learning strategy is proposed to improve accuracy and shorten running time for handwritten digit recognition. The adaptive deep Q-learning strategy combines the feature-extracting capability of deep learning and the decision-making of reinforcement learning to form an adaptive Q-learning deep belief network (Q-ADBN). First, Q-ADBN extracts the features of original images using an adaptive deep auto-encoder (ADAE), and the extracted features are considered as the current states of Q-learning algorithm. Second, Q-ADBN receives Q-function (reward signal) during recognition of the current states, and the final handwritten digits recognition is implemented by maximizing the Q-function using Q-learning algorithm. Finally, experimental results from the well-known MNIST dataset show that the proposed Q-ADBN has a superiority to other similar methods in terms of accuracy and running time. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Aizhu; Sun, Genyun; Wang, Zhenjie
2015-12-01
The serious information redundancy in hyperspectral images (HIs) cannot contribute to the data analysis accuracy, instead it require expensive computational resources. Consequently, to identify the most useful and valuable information from the HIs, thereby improve the accuracy of data analysis, this paper proposed a novel hyperspectral band selection method using the hybrid genetic algorithm and gravitational search algorithm (GA-GSA). In the proposed method, the GA-GSA is mapped to the binary space at first. Then, the accuracy of the support vector machine (SVM) classifier and the number of selected spectral bands are utilized to measure the discriminative capability of the band subset. Finally, the band subset with the smallest number of spectral bands as well as covers the most useful and valuable information is obtained. To verify the effectiveness of the proposed method, studies conducted on an AVIRIS image against two recently proposed state-of-the-art GSA variants are presented. The experimental results revealed the superiority of the proposed method and indicated that the method can indeed considerably reduce data storage costs and efficiently identify the band subset with stable and high classification precision.
Classification of right-hand grasp movement based on EMOTIV Epoc+
NASA Astrophysics Data System (ADS)
Tobing, T. A. M. L.; Prawito, Wijaya, S. K.
2017-07-01
Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for best classification accuracy of right-hand grasp movement based on EEG headset, EMOTIV Epoc+. There are three movement classifications: grasping hand, relax, and opening hand. These classifications take advantage of Event-Related Desynchronization (ERD) phenomenon that makes it possible to differ relaxation, imagery, and movement state from each other. The combinations of elements are the usage of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequency as features, and also classifier Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average values of classification accuracy are ± 83% for training and ± 57% for testing. To have a better understanding of the signal quality recorded by EMOTIV Epoc+, the result of classification accuracy of left or right-hand grasping movement EEG signal (provided by Physionet) also be given, i.e.± 85% for training and ± 70% for testing. The comparison of accuracy value from each combination, experiment condition, and external EEG data are provided for the purpose of value analysis of classification accuracy.
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
Feature weight estimation for gene selection: a local hyperlinear learning approach
2014-01-01
Background Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments, nevertheless,it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes, buried in high-dimensional irrelevant noises. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers. Results We propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust in terms of degradation of noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB). Conclusion Experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability using to various classification algorithms. PMID:24625071
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment
ERIC Educational Resources Information Center
Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua
2012-01-01
This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
Classification of arrhythmia using hybrid networks.
Haseena, Hassan H; Joseph, Paul K; Mathew, Abraham T
2011-12-01
Reliable detection of arrhythmias based on digital processing of Electrocardiogram (ECG) signals is vital in providing suitable and timely treatment to a cardiac patient. Due to corruption of ECG signals with multiple frequency noise and presence of multiple arrhythmic events in a cardiac rhythm, computerized interpretation of abnormal ECG rhythms is a challenging task. This paper focuses a Fuzzy C- Mean (FCM) clustered Probabilistic Neural Network (PNN) and Multi Layered Feed Forward Network (MLFFN) for the discrimination of eight types of ECG beats. Parameters such as fourth order Auto Regressive (AR) coefficients along with Spectral Entropy (SE) are extracted from each ECG beat and feature reduction has been carried out using FCM clustering. The cluster centers form the input of neural network classifiers. The extensive analysis of Massachusetts Institute of Technology- Beth Israel Hospital (MIT-BIH) arrhythmia database shows that FCM clustered PNNs is superior in cardiac arrhythmia classification than FCM clustered MLFFN with an overall accuracy of 99.05%, 97.14%, respectively.
Support-vector-based emergent self-organising approach for emotional understanding
NASA Astrophysics Data System (ADS)
Nguwi, Yok-Yen; Cho, Siu-Yeung
2010-12-01
This study discusses the computational analysis of general emotion understanding from questionnaires methodology. The questionnaires method approaches the subject by investigating the real experience that accompanied the emotions, whereas the other laboratory approaches are generally associated with exaggerated elements. We adopted a connectionist model called support-vector-based emergent self-organising map (SVESOM) to analyse the emotion profiling from the questionnaires method. The SVESOM first identifies the important variables by giving discriminative features with high ranking. The classifier then performs the classification based on the selected features. Experimental results show that the top rank features are in line with the work of Scherer and Wallbott [(1994), 'Evidence for Universality and Cultural Variation of Differential Emotion Response Patterning', Journal of Personality and Social Psychology, 66, 310-328], which approached the emotions physiologically. While the performance measures show that using the full features for classifications can degrade the performance, the selected features provide superior results in terms of accuracy and generalisation.
Gender Classification Based on Eye Movements: A Processing Effect During Passive Face Viewing
Sammaknejad, Negar; Pouretemad, Hamidreza; Eslahchi, Changiz; Salahirad, Alireza; Alinejad, Ashkan
2017-01-01
Studies have revealed superior face recognition skills in females, partially due to their different eye movement strategies when encoding faces. In the current study, we utilized these slight but important differences and proposed a model that estimates the gender of the viewers and classifies them into two subgroups, males and females. An eye tracker recorded participant’s eye movements while they viewed images of faces. Regions of interest (ROIs) were defined for each face. Results showed that the gender dissimilarity in eye movements was not due to differences in frequency of fixations in the ROI s per se. Instead, it was caused by dissimilarity in saccade paths between the ROIs. The difference enhanced when saccades were towards the eyes. Females showed significant increase in transitions from other ROI s to the eyes. Consequently, the extraction of temporal transient information of saccade paths through a transition probability matrix, similar to a first order Markov chain model, significantly improved the accuracy of the gender classification results. PMID:29071007
Gender Classification Based on Eye Movements: A Processing Effect During Passive Face Viewing.
Sammaknejad, Negar; Pouretemad, Hamidreza; Eslahchi, Changiz; Salahirad, Alireza; Alinejad, Ashkan
2017-01-01
Studies have revealed superior face recognition skills in females, partially due to their different eye movement strategies when encoding faces. In the current study, we utilized these slight but important differences and proposed a model that estimates the gender of the viewers and classifies them into two subgroups, males and females. An eye tracker recorded participant's eye movements while they viewed images of faces. Regions of interest (ROIs) were defined for each face. Results showed that the gender dissimilarity in eye movements was not due to differences in frequency of fixations in the ROI s per se. Instead, it was caused by dissimilarity in saccade paths between the ROIs. The difference enhanced when saccades were towards the eyes. Females showed significant increase in transitions from other ROI s to the eyes. Consequently, the extraction of temporal transient information of saccade paths through a transition probability matrix, similar to a first order Markov chain model, significantly improved the accuracy of the gender classification results.
The effect of finite field size on classification and atmospheric correction
NASA Technical Reports Server (NTRS)
Kaufman, Y. J.; Fraser, R. S.
1981-01-01
The atmospheric effect on the upward radiance of sunlight scattered from the Earth-atmosphere system is strongly influenced by the contrasts between fields and their sizes. For a given atmospheric turbidity, the atmospheric effect on classification of surface features is much stronger for nonuniform surfaces than for uniform surfaces. Therefore, the classification accuracy of agricultural fields and urban areas is dependent not only on the optical characteristics of the atmosphere, but also on the size of the surface do not account for the nonuniformity of the surface have only a slight effect on the classification accuracy; in other cases the classification accuracy descreases. The radiances above finite fields were computed to simulate radiances measured by a satellite. A simulation case including 11 agricultural fields and four natural fields (water, soil, savanah, and forest) was used to test the effect of the size of the background reflectance and the optical thickness of the atmosphere on classification accuracy. It is concluded that new atmospheric correction methods, which take into account the finite size of the fields, have to be developed to improve significantly the classification accuracy.
Yang, Xiaoyan; Chen, Longgao; Li, Yingkui; Xi, Wenjia; Chen, Longqian
2015-07-01
Land use/land cover (LULC) inventory provides an important dataset in regional planning and environmental assessment. To efficiently obtain the LULC inventory, we compared the LULC classifications based on single satellite imagery with a rule-based classification based on multi-seasonal imagery in Lianyungang City, a coastal city in China, using CBERS-02 (the 2nd China-Brazil Environmental Resource Satellites) images. The overall accuracies of the classification based on single imagery are 78.9, 82.8, and 82.0% in winter, early summer, and autumn, respectively. The rule-based classification improves the accuracy to 87.9% (kappa 0.85), suggesting that combining multi-seasonal images can considerably improve the classification accuracy over any single image-based classification. This method could also be used to analyze seasonal changes of LULC types, especially for those associated with tidal changes in coastal areas. The distribution and inventory of LULC types with an overall accuracy of 87.9% and a spatial resolution of 19.5 m can assist regional planning and environmental assessment efficiently in Lianyungang City. This rule-based classification provides a guidance to improve accuracy for coastal areas with distinct LULC temporal spectral features.
Feature selection for elderly faller classification based on wearable sensors.
Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D
2017-05-30
Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean and standard deviation posterior acceleration; maximum, mean and standard deviation anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. Feature selection provided models with smaller feature sets and improved faller classification compared to faller classification without feature selection. CFS and FCBF provided the best feature subset (one posterior pelvis accelerometer feature) for faller classification. However, better sensitivity was achieved by the second best model based on a Relief-F feature subset with three pressure-sensing insole features and seven head accelerometer features. Feature selection should be considered as an important step in faller classification using wearable sensors.
Data Analytics for Smart Parking Applications.
Piovesan, Nicola; Turi, Leo; Toigo, Enrico; Martinez, Borja; Rossi, Michele
2016-09-23
We consider real-life smart parking systems where parking lot occupancy data are collected from field sensor devices and sent to backend servers for further processing and usage for applications. Our objective is to make these data useful to end users, such as parking managers, and, ultimately, to citizens. To this end, we concoct and validate an automated classification algorithm having two objectives: (1) outlier detection: to detect sensors with anomalous behavioral patterns, i.e., outliers; and (2) clustering: to group the parking sensors exhibiting similar patterns into distinct clusters. We first analyze the statistics of real parking data, obtaining suitable simulation models for parking traces. We then consider a simple classification algorithm based on the empirical complementary distribution function of occupancy times and show its limitations. Hence, we design a more sophisticated algorithm exploiting unsupervised learning techniques (self-organizing maps). These are tuned following a supervised approach using our trace generator and are compared against other clustering schemes, namely expectation maximization, k-means clustering and DBSCAN, considering six months of data from a real sensor deployment. Our approach is found to be superior in terms of classification accuracy, while also being capable of identifying all of the outliers in the dataset.
Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.
Li, Qiang; Gu, Yu; Jia, Jing
2017-01-30
Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.
Data Analytics for Smart Parking Applications
Piovesan, Nicola; Turi, Leo; Toigo, Enrico; Martinez, Borja; Rossi, Michele
2016-01-01
We consider real-life smart parking systems where parking lot occupancy data are collected from field sensor devices and sent to backend servers for further processing and usage for applications. Our objective is to make these data useful to end users, such as parking managers, and, ultimately, to citizens. To this end, we concoct and validate an automated classification algorithm having two objectives: (1) outlier detection: to detect sensors with anomalous behavioral patterns, i.e., outliers; and (2) clustering: to group the parking sensors exhibiting similar patterns into distinct clusters. We first analyze the statistics of real parking data, obtaining suitable simulation models for parking traces. We then consider a simple classification algorithm based on the empirical complementary distribution function of occupancy times and show its limitations. Hence, we design a more sophisticated algorithm exploiting unsupervised learning techniques (self-organizing maps). These are tuned following a supervised approach using our trace generator and are compared against other clustering schemes, namely expectation maximization, k-means clustering and DBSCAN, considering six months of data from a real sensor deployment. Our approach is found to be superior in terms of classification accuracy, while also being capable of identifying all of the outliers in the dataset. PMID:27669259
Real-time classification of auditory sentences using evoked cortical activity in humans
NASA Astrophysics Data System (ADS)
Moses, David A.; Leonard, Matthew K.; Chang, Edward F.
2018-06-01
Objective. Recent research has characterized the anatomical and functional basis of speech perception in the human auditory cortex. These advances have made it possible to decode speech information from activity in brain regions like the superior temporal gyrus, but no published work has demonstrated this ability in real-time, which is necessary for neuroprosthetic brain-computer interfaces. Approach. Here, we introduce a real-time neural speech recognition (rtNSR) software package, which was used to classify spoken input from high-resolution electrocorticography signals in real-time. We tested the system with two human subjects implanted with electrode arrays over the lateral brain surface. Subjects listened to multiple repetitions of ten sentences, and rtNSR classified what was heard in real-time from neural activity patterns using direct sentence-level and HMM-based phoneme-level classification schemes. Main results. We observed single-trial sentence classification accuracies of 90% or higher for each subject with less than 7 minutes of training data, demonstrating the ability of rtNSR to use cortical recordings to perform accurate real-time speech decoding in a limited vocabulary setting. Significance. Further development and testing of the package with different speech paradigms could influence the design of future speech neuroprosthetic applications.
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classifying patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
Di-codon Usage for Gene Classification
NASA Astrophysics Data System (ADS)
Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
NASA Technical Reports Server (NTRS)
Cibula, William G.; Nyquist, Maurice O.
1987-01-01
An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classifications useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. In an attempt to refine the level of classification, a geographic information system (GIS) approach was employed. Topographic data and watershed boundaries (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, and these met the project objective. This classification could now become inputs into a GIS system to help provide answers to park management coupled with other ancillary data programs such as fire management.
NASA Astrophysics Data System (ADS)
Gao, Yan; Marpu, Prashanth; Morales Manila, Luis M.
2014-11-01
This paper assesses the suitability of 8-band Worldview-2 (WV2) satellite data and object-based random forest algorithm for the classification of avocado growth stages in Mexico. We tested both pixel-based with minimum distance (MD) and maximum likelihood (MLC) and object-based with Random Forest (RF) algorithm for this task. Training samples and verification data were selected by visual interpreting the WV2 images for seven thematic classes: fully grown, middle stage, and early stage of avocado crops, bare land, two types of natural forests, and water body. To examine the contribution of the four new spectral bands of WV2 sensor, all the tested classifications were carried out with and without the four new spectral bands. Classification accuracy assessment results show that object-based classification with RF algorithm obtained higher overall higher accuracy (93.06%) than pixel-based MD (69.37%) and MLC (64.03%) method. For both pixel-based and object-based methods, the classifications with the four new spectral bands (overall accuracy obtained higher accuracy than those without: overall accuracy of object-based RF classification with vs without: 93.06% vs 83.59%, pixel-based MD: 69.37% vs 67.2%, pixel-based MLC: 64.03% vs 36.05%, suggesting that the four new spectral bands in WV2 sensor contributed to the increase of the classification accuracy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessedmore » using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with a SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large number of images, as is the case in a multicenter trial with different scanners.« less
Corn and soybean Landsat MSS classification performance as a function of scene characteristics
NASA Technical Reports Server (NTRS)
Batista, G. T.; Hixson, M. M.; Bauer, M. E.
1982-01-01
In order to fully utilize remote sensing to inventory crop production, it is important to identify the factors that affect the accuracy of Landsat classifications. The objective of this study was to investigate the effect of scene characteristics involving crop, soil, and weather variables on the accuracy of Landsat classifications of corn and soybeans. Segments sampling the U.S. Corn Belt were classified using a Gaussian maximum likelihood classifier on multitemporally registered data from two key acquisition periods. Field size had a strong effect on classification accuracy with small fields tending to have low accuracies even when the effect of mixed pixels was eliminated. Other scene characteristics accounting for variability in classification accuracy included proportions of corn and soybeans, crop diversity index, proportion of all field crops, soil drainage, slope, soil order, long-term average soybean yield, maximum yield, relative position of the segment in the Corn Belt, weather, and crop development stage.
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
The accuracy of RS classification based on SVM which is developed from statistical learning theory is high under small number of train samples, which results in satisfaction of classification on RS using SVM methods. The traditional RS classification method combines visual interpretation with computer classification. The accuracy of the RS classification, however, is improved a lot based on SVM method, because it saves much labor and time which is used to interpret images and collect training samples. Kernel functions play an important part in the SVM algorithm. It uses improved compound kernel function and therefore has a higher accuracy of classification on RS images. Moreover, compound kernel improves the generalization and learning ability of the kernel.
Nationwide forestry applications program. Analysis of forest classification accuracy
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Mead, R. A.; Oderwald, R. G.; Heinen, J. (Principal Investigator)
1981-01-01
The development of LANDSAT classification accuracy assessment techniques, and of a computerized system for assessing wildlife habitat from land cover maps are considered. A literature review on accuracy assessment techniques and an explanation for the techniques development under both projects are included along with listings of the computer programs. The presentations and discussions at the National Working Conference on LANDSAT Classification Accuracy are summarized. Two symposium papers which were published on the results of this project are appended.
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much efforts have been devoted towards building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule-base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306
NASA Astrophysics Data System (ADS)
Hong, Liang
2013-10-01
The availability of high spatial resolution remote sensing data provides new opportunities for urban land-cover classification. More geometric details can be observed in the high resolution remote sensing image, Also Ground objects in the high resolution remote sensing image have displayed rich texture, structure, shape and hierarchical semantic characters. More landscape elements are represented by a small group of pixels. Recently years, the an object-based remote sensing analysis methodology is widely accepted and applied in high resolution remote sensing image processing. The classification method based on Geo-ontology and conditional random fields is presented in this paper. The proposed method is made up of four blocks: (1) the hierarchical ground objects semantic framework is constructed based on geoontology; (2) segmentation by mean-shift algorithm, which image objects are generated. And the mean-shift method is to get boundary preserved and spectrally homogeneous over-segmentation regions ;(3) the relations between the hierarchical ground objects semantic and over-segmentation regions are defined based on conditional random fields framework ;(4) the hierarchical classification results are obtained based on geo-ontology and conditional random fields. Finally, high-resolution remote sensed image data -GeoEye, is used to testify the performance of the presented method. And the experimental results have shown the superiority of this method to the eCognition method both on the effectively and accuracy, which implies it is suitable for the classification of high resolution remote sensing image.
Al-Shaikhli, Saif Dawood Salman; Yang, Michael Ying; Rosenhahn, Bodo
2016-12-01
This paper presents a novel method for Alzheimer's disease classification via an automatic 3D caudate nucleus segmentation. The proposed method consists of segmentation and classification steps. In the segmentation step, we propose a novel level set cost function. The proposed cost function is constrained by a sparse representation of local image features using a dictionary learning method. We present coupled dictionaries: a feature dictionary of a grayscale brain image and a label dictionary of a caudate nucleus label image. Using online dictionary learning, the coupled dictionaries are learned from the training data. The learned coupled dictionaries are embedded into a level set function. In the classification step, a region-based feature dictionary is built. The region-based feature dictionary is learned from shape features of the caudate nucleus in the training data. The classification is based on the measure of the similarity between the sparse representation of region-based shape features of the segmented caudate in the test image and the region-based feature dictionary. The experimental results demonstrate the superiority of our method over the state-of-the-art methods by achieving a high segmentation (91.5%) and classification (92.5%) accuracy. In this paper, we find that the study of the caudate nucleus atrophy gives an advantage over the study of whole brain structure atrophy to detect Alzheimer's disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of x2 feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. X2 mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors are offer accuracy improvements only on colorful datasets.
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics was used as feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers as well as useŕs and produceŕs accuracy.
Wang, Shuihua; Zhang, Yudong; Liu, Ge; Phillips, Preetha; Yuan, Ti-Fei
2016-01-01
Within the past decade, computer scientists have developed many methods using computer vision and machine learning techniques to detect Alzheimer's disease (AD) in its early stages. However, some of these methods are unable to achieve excellent detection accuracy, and several other methods are unable to locate AD-related regions. Hence, our goal was to develop a novel AD brain detection method. In this study, our method was based on the three-dimensional (3D) displacement-field (DF) estimation between subjects in the healthy elder control group and AD group. The 3D-DF was treated with AD-related features. The three feature selection measures were used in the Bhattacharyya distance, Student's t-test, and Welch's t-test (WTT). Two non-parallel support vector machines, i.e., generalized eigenvalue proximal support vector machine and twin support vector machine (TSVM), were then used for classification. A 50 × 10-fold cross validation was implemented for statistical analysis. The results showed that "3D-DF+WTT+TSVM" achieved the best performance, with an accuracy of 93.05 ± 2.18, a sensitivity of 92.57 ± 3.80, a specificity of 93.18 ± 3.35, and a precision of 79.51 ± 2.86. This method also exceled in 13 state-of-the-art approaches. Additionally, we were able to detect 17 regions related to AD by using the pure computer-vision technique. These regions include sub-gyral, inferior parietal lobule, precuneus, angular gyrus, lingual gyrus, supramarginal gyrus, postcentral gyrus, third ventricle, superior parietal lobule, thalamus, middle temporal gyrus, precentral gyrus, superior temporal gyrus, superior occipital gyrus, cingulate gyrus, culmen, and insula. These regions were reported in recent publications. The 3D-DF is effective in AD subject and related region detection.
Classifying epileptic EEG signals with delay permutation entropy and Multi-Scale K-means.
Zhu, Guohun; Li, Yan; Wen, Peng Paul; Wang, Shuaifang
2015-01-01
Most epileptic EEG classification algorithms are supervised and require large training datasets, that hinder their use in real time applications. This chapter proposes an unsupervised Multi-Scale K-means (MSK-means) MSK-means algorithm to distinguish epileptic EEG signals and identify epileptic zones. The random initialization of the K-means algorithm can lead to wrong clusters. Based on the characteristics of EEGs, the MSK-means MSK-means algorithm initializes the coarse-scale centroid of a cluster with a suitable scale factor. In this chapter, the MSK-means algorithm is proved theoretically superior to the K-means algorithm on efficiency. In addition, three classifiers: the K-means, MSK-means MSK-means and support vector machine (SVM), are used to identify seizure and localize epileptogenic zone using delay permutation entropy features. The experimental results demonstrate that identifying seizure with the MSK-means algorithm and delay permutation entropy achieves 4. 7 % higher accuracy than that of K-means, and 0. 7 % higher accuracy than that of the SVM.
Bolin, Jocelyn Holden; Finch, W Holmes
2014-01-01
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions.
NASA Astrophysics Data System (ADS)
Marais Sicre, Claire; Baup, Frederic; Fieuzal, Remy
2015-04-01
In the context of climate change (with consequences on temperature and precipitation patterns), persons involved in agricultural management have the imperative to combine: sufficient productivity (as a response of the increment of the necessary foods) and durability of the resources (in order to restrain waste of water, fertilizer or environmental damages). To this end, a detailed knowledge of land use will improve the management of food and water, while preserving the ecosystems. Among the wide range of available monitoring tools, numerous studies demonstrated the interest of satellite images for agricultural mapping. Recently, the launch of several radar and optical sensors offer new perspectives for the multi-wavelength crop monitoring (Terrasar-X, Radarsat-2, Sentinel-1, Landsat-8…) allowing surface survey whatever the cloud conditions. Previous studies have demonstrated the interest of using multi-temporal approaches for crop classification, requiring several images for suitable classification results. Unfortunately, these approaches are limited (due to the satellite orbit cycle) and require waiting several days, week or month before offering an accurate land use map. The objective of this study is to compare the accuracy of object-oriented classification (random forest algorithm combined with vector layer coming from segmentation) to map winter crop (barley, rapeseed, grasslands and wheat) and soil states (bare soils with different surface roughness) using quasi-synchronous images. Satellite data are composed of multi-frequency and multi-polarization (HH, VV, HV and VH) images acquired near the 14th of April, 2010, over a studied area (90km²) located close to Toulouse in France. This is a region of alluvial plains and hills, which are mostly mixed farming and governed by a temperate climate. Remote sensing images are provided by Formosat-2 (04/18), Radarsat-2 (C-band, 04/15), Terrasar-X (X-band, 04/14) and ALOS (L-band, 04/14). Ground data are collected over 214 plots during the MCM'10 experiment conducted by the CESBIO laboratory in 2010. Classifications performances have been evaluated considering two cases: using only one frequency in optical or microwave domain, or using a combination of several frequencies (mixed between optical and microwave). For the first case, best results were obtained using optical wavelength with mean overall accuracy (OA) of 84%, followed by Terrasar-X (HH) and Radarsat-2 (HV or HV) which respectively offer overall accuracies of 77% and 73%. Concerning the vegetation, wheat was well classified whatever the wavelength used (OA > 93%). Barley was more complicated to classified and could be mingled with wheat or grassland. Best results were obtained using of green, red, blue, X-band or L-band wavelength offering an OA superior to 45%. Radar images were clearly well adapted to identify rapeseed (OA > 83%), especially at C (VV, HH and HV) and X-band (HH). The accuracy of grassland classification never exceeded 79% and results were stable between frequencies (excepted at L-band: 51%). The three soil roughness states were quite well classified whatever the wavelength and performances decreased with the increase of soil roughness. The combine use of multi-frequencies increased performances of the classification. Overall accuracy reached respectively 83% and 96% for C-band full polarization and for Formosat-2 multispectral approaches.
Classification of parotidectomies: a proposal of the European Salivary Gland Society.
Quer, M; Guntinas-Lichius, O; Marchal, F; Vander Poorten, V; Chevalier, D; León, X; Eisele, D; Dulguerov, P
2016-10-01
The objective of this study is to provide a comprehensive classification system for parotidectomy operations. Data sources include Medline publications, author's experience, and consensus round table at the Third European Salivary Gland Society (ESGS) Meeting. The Medline database was searched with the term "parotidectomy" and "definition". The various definitions of parotidectomy procedures and parotid gland subdivisions extracted. Previous classification systems re-examined and a new classification proposed by a consensus. The ESGS proposes to subdivide the parotid parenchyma in five levels: I (lateral superior), II (lateral inferior), III (deep inferior), IV (deep superior), V (accessory). A new classification is proposed where the type of resection is divided into formal parotidectomy with facial nerve dissection and extracapsular dissection. Parotidectomies are further classified according to the levels removed, as well as the extra-parotid structures ablated. A new classification of parotidectomy procedures is proposed.
Use of in Vitro HTS-Derived Concentration-Response Data as ...
Background: Quantitative high-throughput screening (qHTS) assays are increasingly being employed to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the NIH Chemical Genomics Center. Objectives: To test a hypothesis that dose-response data points of the qHTS assays can serve as biological descriptors of assayed chemicals and, when combined with conventional chemical descriptors, may improve the accuracy of Quantitative Structure-Activity Relationship (QSAR) models applied to prediction of in vivo toxicity endpoints. Methods and Results: The cell viability qHTS concentration-response data for 1,408 substances assayed in 13 cell lines were obtained from PubChem; for a subset of these compounds rodent acute toxicity LD50 data were also available. The classification k Nearest Neighbor and Random Forest QSAR methods were employed for modeling LD50 data using either chemical descriptors alone (conventional models) or in combination with biological descriptors derived from the concentration-response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat qHTS data. We show that both the external classification accuracy and coverage (i.e., fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models was superior to convent
Groom, Madeleine J; Young, Zoe; Hall, Charlotte L; Gillott, Alinda; Hollis, Chris
2016-09-30
There is a clinical need for objective evidence-based measures that are sensitive and specific to ADHD when compared with other neurodevelopmental disorders. This study evaluated the incremental validity of adding an objective measure of activity and computerised cognitive assessment to clinical rating scales to differentiate adult ADHD from Autism spectrum disorders (ASD). Adults with ADHD (n=33) or ASD (n=25) performed the QbTest, comprising a Continuous Performance Test with motion-tracker to record physical activity. QbTest parameters measuring inattention, impulsivity and hyperactivity were combined to provide a summary score ('QbTotal'). Binary stepwise logistic regression measured the probability of assignment to the ADHD or ASD group based on scores on the Conners Adult ADHD Rating Scale-subscale E (CAARS-E) and Autism Quotient (AQ10) in the first step and then QbTotal added in the second step. The model fit was significant at step 1 (CAARS-E, AQ10) with good group classification accuracy. These predictors were retained and QbTotal was added, resulting in a significant improvement in model fit and group classification accuracy. All predictors were significant. ROC curves indicated superior specificity of QbTotal. The findings present preliminary evidence that adding QbTest to clinical rating scales may improve the differentiation of ADHD and ASD in adults. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
Decision rules for unbiased inventory estimates
NASA Technical Reports Server (NTRS)
Argentiero, P. D.; Koch, D.
1979-01-01
An efficient and accurate procedure for estimating inventories from remote sensing scenes is presented. In place of the conventional and expensive full dimensional Bayes decision rule, a one-dimensional feature extraction and classification technique was employed. It is shown that this efficient decision rule can be used to develop unbiased inventory estimates and that for large sample sizes typical of satellite derived remote sensing scenes, resulting accuracies are comparable or superior to more expensive alternative procedures. Mathematical details of the procedure are provided in the body of the report and in the appendix. Results of a numerical simulation of the technique using statistics obtained from an observed LANDSAT scene are included. The simulation demonstrates the effectiveness of the technique in computing accurate inventory estimates.
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, were compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was in the average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not outperform significantly the conventional feature ranking methods. Among conventional methods, some of them slightly performed better than others, although the choice of a suitable technique is dependent on the computational complexity and accuracy requirements of the user.
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
Abnormal brain activation during directed forgetting of negative memory in depressed patients.
Yang, Wenjing; Chen, Qunlin; Liu, Peiduo; Cheng, Hongsheng; Cui, Qian; Wei, Dongtao; Zhang, Qinglin; Qiu, Jiang
2016-01-15
The frequent occurrence of uncontrollable negative thoughts and memories is a troubling aspect of depression. Thus, knowledge on the mechanism underlying intentional forgetting of these thoughts and memories is crucial to develop an effective emotion regulation strategy for depressed individuals. Behavioral studies have demonstrated that depressed participants cannot intentionally forget negative memories. However, the neural mechanism underlying this process remains unclear. In this study, participants completed the directed forgetting task in which they were instructed to remember or forget neutral or negative words. Standard univariate analysis based on the General Linear Model showed that the depressed participants have higher activation in the inferior frontal gyrus (IFG), superior frontal gyrus (SFG), superior parietal gyrus (SPG), and inferior temporal gyrus (ITG) than the healthy individuals. The results indicated that depressed participants recruited more frontal and parietal inhibitory control resources to inhibit the TBF items, but the attempt still failed because of negative bias. We also used the Support Vector Machine to perform multivariate pattern classification based on the brain activation during directed forgetting. The pattern of brain activity in directed forgetting of negative words allowed correct group classification with an overall accuracy of 75% (P=0.012). The brain regions which are critical for this discrimination showed abnormal activation when depressed participants were attempting to forget negative words. These results indicated that the abnormal neural circuitry when depressed individuals tried to forget the negative words might provide neurobiological markers for depression. Copyright © 2015 Elsevier B.V. All rights reserved.
MKID digital readout tuning with deep learning
NASA Astrophysics Data System (ADS)
Dodkins, R.; Mahashabde, S.; O'Brien, K.; Thatte, N.; Fruitwala, N.; Walter, A. B.; Meeker, S. R.; Szypryt, P.; Mazin, B. A.
2018-04-01
Microwave Kinetic Inductance Detector (MKID) devices offer inherent spectral resolution, simultaneous read out of thousands of pixels, and photon-limited sensitivity at optical wavelengths. Before taking observations the readout power and frequency of each pixel must be individually tuned, and if the equilibrium state of the pixels change, then the readout must be retuned. This process has previously been performed through manual inspection, and typically takes one hour per 500 resonators (20 h for a ten-kilo-pixel array). We present an algorithm based on a deep convolution neural network (CNN) architecture to determine the optimal bias power for each resonator. The bias point classifications from this CNN model, and those from alternative automated methods, are compared to those from human decisions, and the accuracy of each method is assessed. On a test feed-line dataset, the CNN achieves an accuracy of 90% within 1 dB of the designated optimal value, which is equivalent accuracy to a randomly selected human operator, and superior to the highest scoring alternative automated method by 10%. On a full ten-kilopixel array, the CNN performs the characterization in a matter of minutes - paving the way for future mega-pixel MKID arrays.
NASA Astrophysics Data System (ADS)
Kim, Namkug; Seo, Joon Beom; Park, Sang Ok; Lee, Youngjoo; Lee, Jeongjin
2009-02-01
To evaluate the accuracy of computer aided differential diagnosis (CADD) between usual interstitial pneumonia (UIP) and nonspecific interstitial pneumonia (NSIP) at HRCT in comparison with that of a radiologist's decision. A computerized classification for six local disease patterns (normal, NL; ground-glass opacity, GGO; reticular opacity, RO; honeycombing, HC; emphysema, EM; and consolidation, CON) using texture/shape analyses and a SVM classifier at HRCT was used for pixel-by-pixel labeling on the whole lung area. The mode filter was applied on the results to reduce noise. Area fraction (AF) of each pattern, directional probabilistic density function (pdf) (dPDF: mean, SD, skewness of pdf /3 directions: superior-inferior, anterior-posterior, central-peripheral), regional cluster distribution pattern (RCDP: number, mean, SD of clusters, mean, SD of centroid of clusters) were automatically evaluated. Spatially normalized left and right lungs were evaluated separately. Disease division index (DDI) on every combination of AFs and asymmetric index (AI) between left and right lung ((left-right)/left) were also evaluated. To assess the accuracy of the system, fifty-four HRCT data sets in patients with pathologically diagnosed UIP (n=26) and NSIP (n=28) were used. For a classification procedure, a CADD-SVM classifier with internal parameter optimization, and sequential forward floating feature selection (SFFS) were employed. The accuracy was assessed by a 5-folding cross validation with 20- times repetition. For comparison, two thoracic radiologists reviewed the whole HRCT images without clinical information and diagnose each case either as UIP or NSIP. The accuracies of radiologists' decision were 0.75 and 0.87, respectively. The accuracies of the CADD system using the features of AF, dPDF, AI of dPDF, RDP, AI of RDP, DDI were 0.70, 0.79, 0.77, 0.80, 0.78, 0.81, respectively. The accuracy of optimized CADD using all features after SFFS was 0.91. We developed the CADD system to differentiate between UIP and NSIP using automated assessment of the extent and distribution of regional disease patterns at HRCT.
NASA Astrophysics Data System (ADS)
Bangs, Corey F.; Kruse, Fred A.; Olsen, Chris R.
2013-05-01
Hyperspectral data were assessed to determine the effect of integrating spectral data and extracted texture feature data on classification accuracy. Four separate spectral ranges (hundreds of spectral bands total) were used from the Visible and Near Infrared (VNIR) and Shortwave Infrared (SWIR) portions of the electromagnetic spectrum. Haralick texture features (contrast, entropy, and correlation) were extracted from the average gray-level image for each of the four spectral ranges studied. A maximum likelihood classifier was trained using a set of ground truth regions of interest (ROIs) and applied separately to the spectral data, texture data, and a fused dataset containing both. Classification accuracy was measured by comparison of results to a separate verification set of test ROIs. Analysis indicates that the spectral range (source of the gray-level image) used to extract the texture feature data has a significant effect on the classification accuracy. This result applies to texture-only classifications as well as the classification of integrated spectral data and texture feature data sets. Overall classification improvement for the integrated data sets was near 1%. Individual improvement for integrated spectral and texture classification of the "Urban" class showed approximately 9% accuracy increase over spectral-only classification. Texture-only classification accuracy was highest for the "Dirt Path" class at approximately 92% for the spectral range from 947 to 1343nm. This research demonstrates the effectiveness of texture feature data for more accurate analysis of hyperspectral data and the importance of selecting the correct spectral range to be used for the gray-level image source to extract these features.
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction is often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from the terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification method. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how different scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10 % for both trees. The average speed-up ratio of single scale classifiers over multi-scale classifier for each tree is higher than 30.
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
PCA based feature reduction to improve the accuracy of decision tree c4.5 classification
NASA Astrophysics Data System (ADS)
Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.
2018-03-01
Splitting attribute is a major process in Decision Tree C4.5 classification. However, this process does not give a significant impact on the establishment of the decision tree in terms of removing irrelevant features. It is a major problem in decision tree classification process called over-fitting resulting from noisy data and irrelevant features. In turns, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and overfitting on classifications Decision Tree C4.5. Feature reduction is one of important issues in classification model which is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to simplify high dimensional data to low dimensional data with non-correlated attributes. In this research, we proposed a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and Decision Tree C4.5 algorithm for the classification. From the experiments conducted using available data sets from UCI Cervical cancer data set repository with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework is robust to enhance classification accuracy with 90.70% accuracy rates.
THE ROLE OF WATERSHED CLASSIFICATION IN DIAGNOSING CAUSES OF BIOLOGICAL IMPAIRMENT
We compared classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmention with a gewographically-based classification scheme for two case studies involving 1) Lake Superior tributaries and 2) watersheds of riverine coastal wetlands ...
Improving crop classification through attention to the timing of airborne radar acquisitions
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Protz, R.
1984-01-01
Radar remote sensors may provide valuable input to crop classification procedures because of (1) their independence of weather conditions and solar illumination, and (2) their ability to respond to differences in crop type. Manual classification of multidate synthetic aperture radar (SAR) imagery resulted in an overall accuracy of 83 percent for corn, forest, grain, and 'other' cover types. Forests and corn fields were identified with accuracies approaching or exceeding 90 percent. Grain fields and 'other' fields were often confused with each other, resulting in classification accuracies of 51 and 66 percent, respectively. The 83 percent correct classification represents a 10 percent improvement when compared to similar SAR data for the same area collected at alternate time periods in 1978. These results demonstrate that improvements in crop classification accuracy can be achieved with SAR data by synchronizing data collection times with crop growth stages in order to maximize differences in the geometric and dielectric properties of the cover types of interest.
IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY
Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...
A practical approach to Sasang constitutional diagnosis using vocal features
2013-01-01
Background Sasang constitutional medicine (SCM) is a type of tailored medicine that divides human beings into four Sasang constitutional (SC) types. Diagnosis of SC types is crucial to proper treatment in SCM. Voice characteristics have been used as an essential clue for diagnosing SC types. In the past, many studies tried to extract quantitative vocal features to make diagnosis models; however, these studies were flawed by limited data collected from one or a few sites, long recording time, and low accuracy. We propose a practical diagnosis model having only a few variables, which decreases model complexity. This in turn, makes our model appropriate for clinical applications. Methods A total of 2,341 participants’ voice recordings were used in making a SC classification model and to test the generalization ability of the model. Although the voice data consisted of five vowels and two repeated sentences per participant, we used only the sentence part for our study. A total of 21 features were extracted, and an advanced feature selection method—the least absolute shrinkage and selection operator (LASSO)—was applied to reduce the number of variables for classifier learning. A SC classification model was developed using multinomial logistic regression via LASSO. Results We compared the proposed classification model to the previous study, which used both sentences and five vowels from the same patient’s group. The classification accuracies for the test set were 47.9% and 40.4% for male and female, respectively. Our result showed that the proposed method was superior to the previous study in that it required shorter voice recordings, is more applicable to practical use, and had better generalization performance. Conclusions We proposed a practical SC classification method and showed that our model having fewer variables outperformed the model having many variables in the generalization test. We attempted to reduce the number of variables in two ways: 1) the initial number of candidate features was decreased by considering shorter voice recording, and 2) LASSO was introduced for reducing model complexity. The proposed method is suitable for an actual clinical environment. Moreover, we expect it to yield more stable results because of the model’s simplicity. PMID:24200041
40 CFR 52.2571 - Classification of regions.
Code of Federal Regulations, 2012 CFR
2012-07-01
... (CONTINUED) APPROVAL AND PROMULGATION OF IMPLEMENTATION PLANS (CONTINUED) Wisconsin § 52.2571 Classification of regions. The Wisconsin plan was evaluated on the basis of the following classifications: Air... Duluth (Minnesota)-Superior (Wisconsin) Interstate I II III III III North Central Wisconsin Intrastate II...
40 CFR 52.2571 - Classification of regions.
Code of Federal Regulations, 2013 CFR
2013-07-01
... (CONTINUED) APPROVAL AND PROMULGATION OF IMPLEMENTATION PLANS (CONTINUED) Wisconsin § 52.2571 Classification of regions. The Wisconsin plan was evaluated on the basis of the following classifications: Air... Duluth (Minnesota)-Superior (Wisconsin) Interstate I II III III III North Central Wisconsin Intrastate II...
40 CFR 52.2571 - Classification of regions.
Code of Federal Regulations, 2014 CFR
2014-07-01
... (CONTINUED) APPROVAL AND PROMULGATION OF IMPLEMENTATION PLANS (CONTINUED) Wisconsin § 52.2571 Classification of regions. The Wisconsin plan was evaluated on the basis of the following classifications: Air... Duluth (Minnesota)-Superior (Wisconsin) Interstate I II III III III North Central Wisconsin Intrastate II...
Global Optimization Ensemble Model for Classification Methods
Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab
2014-01-01
Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382
Zhao, Weixiang; Sankaran, Shankar; Ibáñez, Ana M; Dandekar, Abhaya M; Davis, Cristina E
2009-08-04
This study introduces two-dimensional (2-D) wavelet analysis to the classification of gas chromatogram differential mobility spectrometry (GC/DMS) data which are composed of retention time, compensation voltage, and corresponding intensities. One reported method to process such large data sets is to convert 2-D signals to 1-D signals by summing intensities either across retention time or compensation voltage, but it can lose important signal information in one data dimension. A 2-D wavelet analysis approach keeps the 2-D structure of original signals, while significantly reducing data size. We applied this feature extraction method to 2-D GC/DMS signals measured from control and disordered fruit and then employed two typical classification algorithms to testify the effects of the resultant features on chemical pattern recognition. Yielding a 93.3% accuracy of separating data from control and disordered fruit samples, 2-D wavelet analysis not only proves its feasibility to extract feature from original 2-D signals but also shows its superiority over the conventional feature extraction methods including converting 2-D to 1-D and selecting distinguishable pixels from training set. Furthermore, this process does not require coupling with specific pattern recognition methods, which may help ensure wide applications of this method to 2-D spectrometry data.
Ham, Nam Seok; Jang, Jae Young; Ryu, Sung Woo; Kim, Ji Hye; Park, Eui Ju; Lee, Woong Cheul; Shim, Kwang Yeun; Jeong, Soung Won; Kim, Hyun Gun; Lee, Tae Hee; Jeon, Sung Ran; Cho, Jun Hyung; Cho, Joo Young; Jin, So Young; Lee, Ji Sung
2013-11-07
To determine whether magnified observation of short-segment Barrett's esophagus (BE) is useful for the detection of specialized intestinal metaplasia (SIM). Thirty patients with suspected short-segment BE underwent magnifying endoscopy up to × 80. The magnified images were analyzed with respect to their pit-patterns, which were simultaneously classified into five epithelial types [I (small round), II (straight), III (long oval), IV (tubular), V (villous)] by Endo's classification. Then, a 0.5% solution of methylene blue (MB) was sprayed over columnar mucosa. The patterns of the magnified image and MB staining were analyzed. Biopsies were obtained from the regions previously observed by magnifying endoscopy and MB chromoendoscopy. Three of five patients with a type V (villous) epithelial pattern had SIM, whereas 21 patients with a non-type V epithelial patterns did not have SIM. The sensitivity, specificity, accuracy, positive predictive value, and negative predictive value of pit-patterns in detecting SIM were 100%, 91.3%, 92.3%, 60% and 100%, respectively (P = 0.004). Three of the 12 patients with positive MB staining had SIM, whereas 14 patients with negative MB staining did not have SIM. The sensitivity, specificity, accuracy, positive predictive value, and negative predictive value of MB staining in detecting SIM were 100%, 60.9%, 65.4%, 25% and 100%, respectively (P = 0.085). The specificity and accuracy of pit-pattern evaluation were significantly superior compared with MB staining for detecting SIM by comparison with the exact McNemar's test (P = 0.0391). The magnified observation of a short-segment BE according to the mucosal pattern and its classification can be predictive of SIM.
Tada, Toshifumi; Kumada, Takashi; Toyoda, Hidenori; Ito, Takanori; Sone, Yasuhiro; Okuda, Seiji; Ogawa, Sadanobu; Igura, Takumi; Imai, Yasuharu
2015-01-01
The macroscopic type of hepatocellular carcinoma (HCC) is a predictor of prognosis. We clarified the diagnostic value of gadolinium ethoxybenzyl diethylenetriamine pentaacetic acid (Gd-EOB-DTPA)-enhanced magnetic resonance imaging (MRI) in the macroscopic classification of nodular hepatocellular carcinoma (HCC) as compared to angiography-assisted computed tomography (CT). A total of 71 surgically resected nodular HCCs with a maximum diameter of ≤5 cm were investigated. HCCs were evaluated preoperatively using Gd-EOB-DTPA-enhanced MRI and angiography-assisted CT. HCCs were pathologically classified as simple nodular (SN), SN with extranodular growth (SN-EG), or confluent multinodular (CMN). SN-EG and CMN were grouped as non-SN. Five readers independently reviewed the images using a five-point scale. We examined the accuracy of both imaging modalities in differentiating between SN and non-SN HCC. Overall, the area under the receiver operating characteristic curve (A z ) for the diagnosis of non-SN did not differ between Gd-EOB-DTPA-enhanced MRI and angiography-assisted CT [0.879 (95% confidence interval (CI), 0.779-0.937) and 0.845 (95% CI, 0.723-0.919), respectively]. For HCCs >2 cm, the A z for Gd-EOB-DTPA-enhanced MRI was greater than 0.9. The sensitivity, specificity, and accuracy of Gd-EOB-DTPA-enhanced MRI for identifying non-SN were equal to or higher than values with angiography-assisted CT in all three categories (all tumors, ≤2 cm, and >2 cm), but the differences were not statistically significant. Using Gd-EOB-DTPA-enhanced MRI to assess the macroscopic findings in nodular HCC was equal or superior to using angiography-assisted CT.
Porras-Alfaro, Andrea; Liu, Kuan-Liang; Kuske, Cheryl R; Xie, Gary
2014-02-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5' section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets.
Liu, Kuan-Liang; Kuske, Cheryl R.
2014-01-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5′ section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets. PMID:24242255
Van Cott, Andrew; Hastings, Charles E; Landsiedel, Robert; Kolle, Susanne; Stinchcombe, Stefan
2018-02-01
In vivo acute systemic testing is a regulatory requirement for agrochemical formulations. GHS specifies an alternative computational approach (GHS additivity formula) for calculating the acute toxicity of mixtures. We collected acute systemic toxicity data from formulations that contained one of several acutely-toxic active ingredients. The resulting acute data set includes 210 formulations tested for oral toxicity, 128 formulations tested for inhalation toxicity and 31 formulations tested for dermal toxicity. The GHS additivity formula was applied to each of these formulations and compared with the experimental in vivo result. In the acute oral assay, the GHS additivity formula misclassified 110 formulations using the GHS classification criteria (48% accuracy) and 119 formulations using the USEPA classification criteria (43% accuracy). With acute inhalation, the GHS additivity formula misclassified 50 formulations using the GHS classification criteria (61% accuracy) and 34 formulations using the USEPA classification criteria (73% accuracy). For acute dermal toxicity, the GHS additivity formula misclassified 16 formulations using the GHS classification criteria (48% accuracy) and 20 formulations using the USEPA classification criteria (36% accuracy). This data indicates the acute systemic toxicity of many formulations is not the sum of the ingredients' toxicity (additivity); but rather, ingredients in a formulation can interact to result in lower or higher toxicity than predicted by the GHS additivity formula. Copyright © 2018 Elsevier Inc. All rights reserved.
3D shape representation with spatial probabilistic distribution of intrinsic shape keypoints
NASA Astrophysics Data System (ADS)
Ghorpade, Vijaya K.; Checchin, Paul; Malaterre, Laurent; Trassoudaine, Laurent
2017-12-01
The accelerated advancement in modeling, digitizing, and visualizing techniques for 3D shapes has led to an increasing amount of 3D models creation and usage, thanks to the 3D sensors which are readily available and easy to utilize. As a result, determining the similarity between 3D shapes has become consequential and is a fundamental task in shape-based recognition, retrieval, clustering, and classification. Several decades of research in Content-Based Information Retrieval (CBIR) has resulted in diverse techniques for 2D and 3D shape or object classification/retrieval and many benchmark data sets. In this article, a novel technique for 3D shape representation and object classification has been proposed based on analyses of spatial, geometric distributions of 3D keypoints. These distributions capture the intrinsic geometric structure of 3D objects. The result of the approach is a probability distribution function (PDF) produced from spatial disposition of 3D keypoints, keypoints which are stable on object surface and invariant to pose changes. Each class/instance of an object can be uniquely represented by a PDF. This shape representation is robust yet with a simple idea, easy to implement but fast enough to compute. Both Euclidean and topological space on object's surface are considered to build the PDFs. Topology-based geodesic distances between keypoints exploit the non-planar surface properties of the object. The performance of the novel shape signature is tested with object classification accuracy. The classification efficacy of the new shape analysis method is evaluated on a new dataset acquired with a Time-of-Flight camera, and also, a comparative evaluation on a standard benchmark dataset with state-of-the-art methods is performed. Experimental results demonstrate superior classification performance of the new approach on RGB-D dataset and depth data.
NASA Astrophysics Data System (ADS)
Geelen, Christopher D.; Wijnhoven, Rob G. J.; Dubbelman, Gijs; de With, Peter H. N.
2015-03-01
This research considers gender classification in surveillance environments, typically involving low-resolution images and a large amount of viewpoint variations and occlusions. Gender classification is inherently difficult due to the large intra-class variation and interclass correlation. We have developed a gender classification system, which is successfully evaluated on two novel datasets, which realistically consider the above conditions, typical for surveillance. The system reaches a mean accuracy of up to 90% and approaches our human baseline of 92.6%, proving a high-quality gender classification system. We also present an in-depth discussion of the fundamental differences between SVM and RF classifiers. We conclude that balancing the degree of randomization in any classifier is required for the highest classification accuracy. For our problem, an RF-SVM hybrid classifier exploiting the combination of HSV and LBP features results in the highest classification accuracy of 89.9 0.2%, while classification computation time is negligible compared to the detection time of pedestrians.
Sengur, Abdulkadir; Akbulut, Yaman; Guo, Yanhui; Bajaj, Varun
2017-12-01
Electromyogram (EMG) signals contain useful information of the neuromuscular diseases like amyotrophic lateral sclerosis (ALS). ALS is a well-known brain disease, which can progressively degenerate the motor neurons. In this paper, we propose a deep learning based method for efficient classification of ALS and normal EMG signals. Spectrogram, continuous wavelet transform (CWT), and smoothed pseudo Wigner-Ville distribution (SPWVD) have been employed for time-frequency (T-F) representation of EMG signals. A convolutional neural network is employed to classify these features. In it, Two convolution layers, two pooling layer, a fully connected layer and a lost function layer is considered in CNN architecture. The CNN architecture is trained with the reinforcement sample learning strategy. The efficiency of the proposed implementation is tested on publicly available EMG dataset. The dataset contains 89 ALS and 133 normal EMG signals with 24 kHz sampling frequency. Experimental results show 96.80% accuracy. The obtained results are also compared with other methods, which show the superiority of the proposed method.
SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform.
Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue
2018-05-02
Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. A new alignment-free sequence similarity analysis method, called SSAW is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed are transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence is turned into a feature vector with numeric values, which can then be used for clustering and/or classification. Using two different types of applications, namely, clustering and classification, we compared SSAW against the the-state-of-the-art alignment free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These make SSAW a suitable method for sequence analysis, especially, given the rapidly increasing volumes of sequence data required by most modern applications.
NASA Astrophysics Data System (ADS)
Dubey, Kavita; Srivastava, Vishal; Singh Mehta, Dalip
2018-04-01
Early identification of fungal infection on the human scalp is crucial for avoiding hair loss. The diagnosis of fungal infection on the human scalp is based on a visual assessment by trained experts or doctors. Optical coherence tomography (OCT) has the ability to capture fungal infection information from the human scalp with a high resolution. In this study, we present a fully automated, non-contact, non-invasive optical method for rapid detection of fungal infections based on the extracted features from A-line and B-scan images of OCT. A multilevel ensemble machine model is designed to perform automated classification, which shows the superiority of our classifier to the best classifier based on the features extracted from OCT images. In this study, 60 samples (30 fungal, 30 normal) were imaged by OCT and eight features were extracted. The classification algorithm had an average sensitivity, specificity and accuracy of 92.30, 90.90 and 91.66%, respectively, for identifying fungal and normal human scalps. This remarkable classifying ability makes the proposed model readily applicable to classifying the human scalp.
Improved Sparse Multi-Class SVM and Its Application for Gene Selection in Cancer Classification
Huang, Lingkang; Zhang, Hao Helen; Zeng, Zhao-Bang; Bushel, Pierre R.
2013-01-01
Background Microarray techniques provide promising tools for cancer diagnosis using gene expression profiles. However, molecular diagnosis based on high-throughput platforms presents great challenges due to the overwhelming number of variables versus the small sample size and the complex nature of multi-type tumors. Support vector machines (SVMs) have shown superior performance in cancer classification due to their ability to handle high dimensional low sample size data. The multi-class SVM algorithm of Crammer and Singer provides a natural framework for multi-class learning. Despite its effective performance, the procedure utilizes all variables without selection. In this paper, we propose to improve the procedure by imposing shrinkage penalties in learning to enforce solution sparsity. Results The original multi-class SVM of Crammer and Singer is effective for multi-class classification but does not conduct variable selection. We improved the method by introducing soft-thresholding type penalties to incorporate variable selection into multi-class classification for high dimensional data. The new methods were applied to simulated data and two cancer gene expression data sets. The results demonstrate that the new methods can select a small number of genes for building accurate multi-class classification rules. Furthermore, the important genes selected by the methods overlap significantly, suggesting general agreement among different variable selection schemes. Conclusions High accuracy and sparsity make the new methods attractive for cancer diagnostics with gene expression data and defining targets of therapeutic intervention. Availability: The source MATLAB code are available from http://math.arizona.edu/~hzhang/software.html. PMID:23966761
Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok
2016-12-05
High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.
A review of supervised object-based land-cover image classification
NASA Astrophysics Data System (ADS)
Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue
2017-08-01
Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was to twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5-DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT
ERIC Educational Resources Information Center
Lathrop, Quinn N.
2015-01-01
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remains decisions a researcher faces when…
Variance estimates and confidence intervals for the Kappa measure of classification accuracy
M. A. Kalkhan; R. M. Reich; R. L. Czaplewski
1997-01-01
The Kappa statistic is frequently used to characterize the results of an accuracy assessment used to evaluate land use and land cover classifications obtained by remotely sensed data. This statistic allows comparisons of alternative sampling designs, classification algorithms, photo-interpreters, and so forth. In order to make these comparisons, it is...
We compared classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically-based classification scheme for two case studies involving 1) Lake Superior tributaries and 2) watersheds of riverine coastal wetlands...
Stratified random selection of watersheds allowed us to compare geographically-independent classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically-based classification scheme within the Northern Lakes a...
We compared classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically-based classification scheme for two case studies involving 1)Lake Superior tributaries and 2) watersheds of riverine coastal wetlands ...
On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
In view of the problem of the on-line measurement of algae classification, a method of algae classification and concentration determination based on the discrete three-dimensional fluorescence spectra was studied in this work. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, the discrete three-dimensional standard spectra of five categories were built, and the recognition, classification and concentration prediction of algae categories were realized by the discrete three-dimensional fluorescence spectra coupled with non-negative weighted least squares linear regression analysis. The results show that similarities between discrete three-dimensional standard spectra of different categories were reduced and the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. By comparing with that of the chlorophyll a fluorescence excitation spectra method, the recognition accuracy rate in pure samples by discrete three-dimensional fluorescence spectra is improved 1.38%, and the recovery rate and classification accuracy in pure diatom samples 34.1% and 46.8%, respectively; the recognition accuracy rate of mixed samples by discrete-three dimensional fluorescence spectra is enhanced by 26.1%, the recovery rate of mixed samples with Chlorophyta 37.8%, and the classification accuracy of mixed samples with diatoms 54.6%.
Belgiu, Mariana; Dr Guţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco
2016-10-01
The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.
Comparing Features for Classification of MEG Responses to Motor Imagery
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Background Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. Methods MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio—spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. Results The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. Conclusions We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system. PMID:27992574
Kainz, Philipp; Pfeiffer, Michael; Urschler, Martin
2017-01-01
Segmentation of histopathology sections is a necessary preprocessing step for digital pathology. Due to the large variability of biological tissue, machine learning techniques have shown superior performance over conventional image processing methods. Here we present our deep neural network-based approach for segmentation and classification of glands in tissue of benign and malignant colorectal cancer, which was developed to participate in the GlaS@MICCAI2015 colon gland segmentation challenge. We use two distinct deep convolutional neural networks (CNN) for pixel-wise classification of Hematoxylin-Eosin stained images. While the first classifier separates glands from background, the second classifier identifies gland-separating structures. In a subsequent step, a figure-ground segmentation based on weighted total variation produces the final segmentation result by regularizing the CNN predictions. We present both quantitative and qualitative segmentation results on the recently released and publicly available Warwick-QU colon adenocarcinoma dataset associated with the GlaS@MICCAI2015 challenge and compare our approach to the simultaneously developed other approaches that participated in the same challenge. On two test sets, we demonstrate our segmentation performance and show that we achieve a tissue classification accuracy of 98% and 95%, making use of the inherent capability of our system to distinguish between benign and malignant tissue. Our results show that deep learning approaches can yield highly accurate and reproducible results for biomedical image analysis, with the potential to significantly improve the quality and speed of medical diagnoses.
Kainz, Philipp; Pfeiffer, Michael
2017-01-01
Segmentation of histopathology sections is a necessary preprocessing step for digital pathology. Due to the large variability of biological tissue, machine learning techniques have shown superior performance over conventional image processing methods. Here we present our deep neural network-based approach for segmentation and classification of glands in tissue of benign and malignant colorectal cancer, which was developed to participate in the GlaS@MICCAI2015 colon gland segmentation challenge. We use two distinct deep convolutional neural networks (CNN) for pixel-wise classification of Hematoxylin-Eosin stained images. While the first classifier separates glands from background, the second classifier identifies gland-separating structures. In a subsequent step, a figure-ground segmentation based on weighted total variation produces the final segmentation result by regularizing the CNN predictions. We present both quantitative and qualitative segmentation results on the recently released and publicly available Warwick-QU colon adenocarcinoma dataset associated with the GlaS@MICCAI2015 challenge and compare our approach to the simultaneously developed other approaches that participated in the same challenge. On two test sets, we demonstrate our segmentation performance and show that we achieve a tissue classification accuracy of 98% and 95%, making use of the inherent capability of our system to distinguish between benign and malignant tissue. Our results show that deep learning approaches can yield highly accurate and reproducible results for biomedical image analysis, with the potential to significantly improve the quality and speed of medical diagnoses. PMID:29018612
DeepPap: Deep Convolutional Networks for Cervical Cell Classification.
Zhang, Ling; Le Lu; Nogues, Isabella; Summers, Ronald M; Liu, Shaoxiong; Yao, Jianhua
2017-11-01
Automation-assisted cervical screening via Pap smear or liquid-based cytology (LBC) is a highly effective cell imaging based cancer detection tool, where cells are partitioned into "abnormal" and "normal" categories. However, the success of most traditional classification methods relies on the presence of accurate cell segmentations. Despite sixty years of research in this field, accurate segmentation remains a challenge in the presence of cell clusters and pathologies. Moreover, previous classification methods are only built upon the extraction of hand-crafted features, such as morphology and texture. This paper addresses these limitations by proposing a method to directly classify cervical cells-without prior segmentation-based on deep features, using convolutional neural networks (ConvNets). First, the ConvNet is pretrained on a natural image dataset. It is subsequently fine-tuned on a cervical cell dataset consisting of adaptively resampled image patches coarsely centered on the nuclei. In the testing phase, aggregation is used to average the prediction scores of a similar set of image patches. The proposed method is evaluated on both Pap smear and LBC datasets. Results show that our method outperforms previous algorithms in classification accuracy (98.3%), area under the curve (0.99) values, and especially specificity (98.3%), when applied to the Herlev benchmark Pap smear dataset and evaluated using five-fold cross validation. Similar superior performances are also achieved on the HEMLBC (H&E stained manual LBC) dataset. Our method is promising for the development of automation-assisted reading systems in primary cervical screening.
Bag-of-features approach for improvement of lung tissue classification in diffuse lung disease
NASA Astrophysics Data System (ADS)
Kato, Noriji; Fukui, Motofumi; Isozaki, Takashi
2009-02-01
Many automated techniques have been proposed to classify diffuse lung disease patterns. Most of the techniques utilize texture analysis approaches with second and higher order statistics, and show successful classification result among various lung tissue patterns. However, the approaches do not work well for the patterns with inhomogeneous texture distribution within a region of interest (ROI), such as reticular and honeycombing patterns, because the statistics can only capture averaged feature over the ROI. In this work, we have introduced the bag-of-features approach to overcome this difficulty. In the approach, texture images are represented as histograms or distributions of a few basic primitives, which are obtained by clustering local image features. The intensity descriptor and the Scale Invariant Feature Transformation (SIFT) descriptor are utilized to extract the local features, which have significant discriminatory power due to their specificity to a particular image class. In contrast, the drawback of the local features is lack of invariance under translation and rotation. We improved the invariance by sampling many local regions so that the distribution of the local features is unchanged. We evaluated the performance of our system in the classification task with 5 image classes (ground glass, reticular, honeycombing, emphysema, and normal) using 1109 ROIs from 211 patients. Our system achieved high classification accuracy of 92.8%, which is superior to that of the conventional system with the gray level co-occurrence matrix (GLCM) feature especially for inhomogeneous texture patterns.
Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A
2013-08-01
In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively. PMID:28326009
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has been proved to be an effective way in recent years. However, current researches just care about the feature extraction and classifier design, and do not consider the instance selection. Former research by authors showed that the instance selection can lead to improvement on classification accuracy. However, no attention is paid on the relationship between speech sample and feature until now. Therefore, a new diagnosis algorithm of PD is proposed in this paper by simultaneously selecting speech sample and feature based on relevant feature weighting algorithm and multiple kernel method, so as to find their synergy effects, thereby improving classification accuracy. Experimental results showed that this proposed algorithm obtained apparent improvement on classification accuracy. It can obtain mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. Besides, the proposed algorithm detected the synergy effects of speech sample and feature, which is valuable for speech marker extraction.
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Getman, Daniel J
2008-01-01
Many attempts to observe changes in terrestrial systems over time would be significantly enhanced if it were possible to improve the accuracy of classifications of low-resolution historic satellite data. In an effort to examine improving the accuracy of historic satellite image classification by combining satellite and air photo data, two experiments were undertaken in which low-resolution multispectral data and high-resolution panchromatic data were combined and then classified using the ECHO spectral-spatial image classification algorithm and the Maximum Likelihood technique. The multispectral data consisted of 6 multispectral channels (30-meter pixel resolution) from Landsat 7. These data were augmented with panchromatic datamore » (15m pixel resolution) from Landsat 7 in the first experiment, and with a mosaic of digital aerial photography (1m pixel resolution) in the second. The addition of the Landsat 7 panchromatic data provided a significant improvement in the accuracy of classifications made using the ECHO algorithm. Although the inclusion of aerial photography provided an improvement in accuracy, this improvement was only statistically significant at a 40-60% level. These results suggest that once error levels associated with combining aerial photography and multispectral satellite data are reduced, this approach has the potential to significantly enhance the precision and accuracy of classifications made using historic remotely sensed data, as a way to extend the time range of efforts to track temporal changes in terrestrial systems.« less
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
ERIC Educational Resources Information Center
Wyse, Adam E.; Babcock, Ben
2016-01-01
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
Muñoz-Ruiz, Miguel Ángel; Hartikainen, Päivi; Koikkalainen, Juha; Wolz, Robin; Julkunen, Valtteri; Niskanen, Eini; Herukka, Sanna-Kaisa; Kivipelto, Miia; Vanninen, Ritva; Rueckert, Daniel; Liu, Yawu; Lötjönen, Jyrki; Soininen, Hilkka
2012-01-01
Background MRI is an important clinical tool for diagnosing dementia-like diseases such as Frontemporal Dementia (FTD). However there is a need to develop more accurate and standardized MRI analysis methods. Objective To compare FTD with Alzheimer’s Disease (AD) and Mild Cognitive Impairment (MCI) with three automatic MRI analysis methods - Hippocampal Volumetry (HV), Tensor-based Morphometry (TBM) and Voxel-based Morphometry (VBM), in specific regions of interest in order to determine the highest classification accuracy. Methods Thirty-seven patients with FTD, 46 patients with AD, 26 control subjects, 16 patients with progressive MCI (PMCI) and 48 patients with stable MCI (SMCI) were examined with HV, TBM for shape change, and VBM for gray matter density. We calculated the Correct Classification Rate (CCR), sensitivity (SS) and specificity (SP) between the study groups. Results We found unequivocal results differentiating controls from FTD with HV (hippocampus left side) (CCR = 0.83; SS = 0.84; SP = 0.80), with TBM (hippocampus and amygdala (CCR = 0.80/SS = 0.71/SP = 0.94), and with VBM (all the regions studied, especially in lateral ventricle frontal horn, central part and occipital horn) (CCR = 0.87/SS = 0.81/SP = 0.96). VBM achieved the highest accuracy in differentiating AD and FTD (CCR = 0.72/SS = 0.67/SP = 0.76), particularly in lateral ventricle (frontal horn, central part and occipital horn) (CCR = 0.73), whereas TBM in superior frontal gyrus also achieved a high accuracy (CCR = 0.71/SS = 0.68/SP = 0.73). TBM resulted in low accuracy (CCR = 0.62) in the differentiation of AD from FTD using all regions of interest, with similar results for HV (CCR = 0.55). Conclusion Hippocampal atrophy is present not only in AD but also in FTD. Of the methods used, VBM achieved the highest accuracy in its ability to differentiate between FTD and AD. PMID:23285078
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. ?? 2009 American Society for Photogrammetry and Remote Sensing.
Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B
2015-03-01
Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.
Compensatory neurofuzzy model for discrete data classification in biomedical
NASA Astrophysics Data System (ADS)
Ceylan, Rahime
2015-03-01
Biomedical data is separated to two main sections: signals and discrete data. So, studies in this area are about biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models which are relevant to classification of ECG, EMG or EEG signals. In same way, in literature, many models exist for classification of discrete data taken as value of samples which can be results of blood analysis or biopsy in medical process. Each algorithm could not achieve high accuracy rate on classification of signal and discrete data. In this study, compensatory neurofuzzy network model is presented for classification of discrete data in biomedical pattern recognition area. The compensatory neurofuzzy network has a hybrid and binary classifier. In this system, the parameters of fuzzy systems are updated by backpropagation algorithm. The realized classifier model is conducted to two benchmark datasets (Wisconsin Breast Cancer dataset and Pima Indian Diabetes dataset). Experimental studies show that compensatory neurofuzzy network model achieved 96.11% accuracy rate in classification of breast cancer dataset and 69.08% accuracy rate was obtained in experiments made on diabetes dataset with only 10 iterations.
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Sadowski, F. E.; Sarno, J. E.
1976-01-01
The author has identified the following significant results. A supervised classification within two separate ground areas of the Sam Houston National Forest was carried out for two sq meters spatial resolution MSS data. Data were progressively coarsened to simulate five additional cases of spatial resolution ranging up to 64 sq meters. Similar processing and analysis of all spatial resolutions enabled evaluations of the effect of spatial resolution on classification accuracy for various levels of detail and the effects on area proportion estimation for very general forest features. For very coarse resolutions, a subset of spectral channels which simulated the proposed thematic mapper channels was used to study classification accuracy.
The use of Landsat data to inventory cotton and soybean acreage in North Alabama
NASA Technical Reports Server (NTRS)
Downs, S. W., Jr.; Faust, N. L.
1980-01-01
This study was performed to determine if Landsat data could be used to improve the accuracy of the estimation of cotton acreage. A linear classification algorithm and a maximum likelihood algorithm were used for computer classification of the area, and the classification was compared with ground truth. The classification accuracy for some fields was greater than 90 percent; however, the overall accuracy was 71 percent for cotton and 56 percent for soybeans. The results of this research indicate that computer analysis of Landsat data has potential for improving upon the methods presently being used to determine cotton acreage; however, additional experiments and refinements are needed before the method can be used operationally.
NASA Technical Reports Server (NTRS)
Card, Don H.; Strong, Laurence L.
1989-01-01
An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
Bae, Hyoung Won; Lee, Sang Yeop; Kim, Sangah; Park, Chan Keum; Lee, Kwanghyun; Kim, Chan Yun; Seong, Gong Je
2018-01-01
To assess whether the asymmetry in the peripapillary retinal nerve fiber layer (pRNFL) thickness between superior and inferior hemispheres on optical coherence tomography (OCT) is useful for early detection of glaucoma. The patient population consisted of Training set (a total of 60 subjects with early glaucoma and 59 normal subjects) and Validation set (30 subjects with early glaucoma and 30 normal subjects). Two kinds of ratios were employed to measure the asymmetry between the superior and inferior pRNFL thickness using OCT. One was the ratio of the superior to inferior peak thicknesses (peak pRNFL thickness ratio; PTR), and the other was the ratio of the superior to inferior average thickness (average pRNFL thickness ratio; ATR). The diagnostic abilities of the PTR and ATR were compared to the color code classification in OCT. Using the optimal cut-off values of the PTR and ATR obtained from the Training set, the two ratios were independently validated for diagnostic capability. For the Training set, the sensitivities/specificities of the PTR, ATR, quadrants color code classification, and clock-hour color code classification were 81.7%/93.2%, 71.7%/74.6%, 75.0%/93.2%, and 75.0%/79.7%, respectively. The PTR showed a better diagnostic performance for early glaucoma detection than the ATR and the clock-hour color code classification in terms of areas under the receiver operating characteristic curves (AUCs) (0.898, 0.765, and 0.773, respectively). For the Validation set, the PTR also showed the best sensitivity and AUC. The PTR is a simple method with considerable diagnostic ability for early glaucoma detection. It can, therefore, be widely used as a new screening method for early glaucoma. © Copyright: Yonsei University College of Medicine 2018
Interactive lesion segmentation with shape priors from offline and online learning.
Shepherd, Tony; Prince, Simon J D; Alexander, Daniel C
2012-09-01
In medical image segmentation, tumors and other lesions demand the highest levels of accuracy but still call for the highest levels of manual delineation. One factor holding back automatic segmentation is the exemption of pathological regions from shape modelling techniques that rely on high-level shape information not offered by lesions. This paper introduces two new statistical shape models (SSMs) that combine radial shape parameterization with machine learning techniques from the field of nonlinear time series analysis. We then develop two dynamic contour models (DCMs) using the new SSMs as shape priors for tumor and lesion segmentation. From training data, the SSMs learn the lower level shape information of boundary fluctuations, which we prove to be nevertheless highly discriminant. One of the new DCMs also uses online learning to refine the shape prior for the lesion of interest based on user interactions. Classification experiments reveal superior sensitivity and specificity of the new shape priors over those previously used to constrain DCMs. User trials with the new interactive algorithms show that the shape priors are directly responsible for improvements in accuracy and reductions in user demand.
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-01-01
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification. PMID:28025525
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-12-22
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification.
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
Classification accuracy for stratification with remotely sensed data
Raymond L. Czaplewski; Paul L. Patterson
2003-01-01
Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an âerror matrix,â which is...
NASA Astrophysics Data System (ADS)
Schmalz, M.; Ritter, G.; Key, R.
Accurate and computationally efficient spectral signature classification is a crucial step in the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications using linear mixing models, signature classification accuracy depends on accurate spectral endmember discrimination [1]. If the endmembers cannot be classified correctly, then the signatures cannot be classified correctly, and object recognition from hyperspectral data will be inaccurate. In practice, the number of endmembers accurately classified often depends linearly on the number of inputs. This can lead to potentially severe classification errors in the presence of noise or densely interleaved signatures. In this paper, we present an comparison of emerging technologies for nonimaging spectral signature classfication based on a highly accurate, efficient search engine called Tabular Nearest Neighbor Encoding (TNE) [3,4] and a neural network technology called Morphological Neural Networks (MNNs) [5]. Based on prior results, TNE can optimize its classifier performance to track input nonergodicities, as well as yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., the neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement algorithms. The open architecture and programmability of TNE's agreement map processing allows a TNE programmer or user to determine classification accuracy, as well as characterize in detail the signatures for which TNE did not obtain classification matches, and why such mis-matches occurred. In this study, we will compare TNE and MNN based endmember classification, using performance metrics such as probability of correct classification (Pd) and rate of false detections (Rfa). As proof of principle, we analyze classification of multiple closely spaced signatures from a NASA database of space material signatures. Additional analysis pertains to computational complexity and noise sensitivity, which are superior to Bayesian techniques based on classical neural networks. [1] Winter, M.E. "Fast autonomous spectral end-member determination in hyperspectral data," in Proceedings of the 13th International Conference On Applied Geologic Remote Sensing, Vancouver, B.C., Canada, pp. 337-44 (1999). [2] N. Keshava, "A survey of spectral unmixing algorithms," Lincoln Laboratory Journal 14:55-78 (2003). [3] Key, G., M.S. SCHMALZ, F.M. Caimi, and G.X. Ritter. "Performance analysis of tabular nearest neighbor encoding algorithm for joint compression and ATR", in Proceedings SPIE 3814:115-126 (1999). [4] Schmalz, M.S. and G. Key. "Algorithms for hyperspectral signature classification in unresolved object detection using tabular nearest neighbor encoding" in Proceedings of the 2007 AMOS Conference, Maui HI (2007). [5] Ritter, G.X., G. Urcid, and M.S. Schmalz. "Autonomous single-pass endmember approximation using lattice auto-associative memories", Neurocomputing (Elsevier), accepted (June 2008).
ERIC Educational Resources Information Center
Bramley, Tom
2010-01-01
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but a time-consuming and tedious work was performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages performed by radiologists is depended on their experience only, so the computer-aided technology is very important to aid with the diagnosis accuracy. In this study, to improve the performance of tumor detection, we investigated comparative approach of different segmentation techniques and selected the best one by comparing their segmentation score. Further, to improve the classification accuracy, the genetic algorithm is employed for the automatic classification of tumor stage. The decision of classification stage is supported by extracting relevant features and area calculation. The experimental results of proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93 demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average of 93.79% dice similarity index coefficient, which indicates better overlap between the automated extracted tumor regions with manually extracted tumor region by radiologists.
Thematic accuracy of the National Land Cover Database (NLCD) 2001 land cover for Alaska
Selkowitz, D.J.; Stehman, S.V.
2011-01-01
The National Land Cover Database (NLCD) 2001 Alaska land cover classification is the first 30-m resolution land cover product available covering the entire state of Alaska. The accuracy assessment of the NLCD 2001 Alaska land cover classification employed a geographically stratified three-stage sampling design to select the reference sample of pixels. Reference land cover class labels were determined via fixed wing aircraft, as the high resolution imagery used for determining the reference land cover classification in the conterminous U.S. was not available for most of Alaska. Overall thematic accuracy for the Alaska NLCD was 76.2% (s.e. 2.8%) at Level II (12 classes evaluated) and 83.9% (s.e. 2.1%) at Level I (6 classes evaluated) when agreement was defined as a match between the map class and either the primary or alternate reference class label. When agreement was defined as a match between the map class and primary reference label only, overall accuracy was 59.4% at Level II and 69.3% at Level I. The majority of classification errors occurred at Level I of the classification hierarchy (i.e., misclassifications were generally to a different Level I class, not to a Level II class within the same Level I class). Classification accuracy was higher for more abundant land cover classes and for pixels located in the interior of homogeneous land cover patches. ?? 2011.
NASA Astrophysics Data System (ADS)
Kurniawan, Dian; Suparti; Sugito
2018-05-01
Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia has reached 237.6 million people. Therefore, to control the population growth rate, the government hold Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to manifest prosperous society by controlling births while ensuring control of population growth. The data used in this study is the updated family data of Semarang city in 2016 that conducted by National Family Planning Coordinating Board (BKKBN). From these data, classifiers with kernel discriminant analysis will be obtained, and also classification accuracy will be obtained from that method. The result of the analysis showed that normal kernel discriminant analysis gives 71.05 % classification accuracy with 28.95 % classification error. Whereas triweight kernel discriminant analysis gives 73.68 % classification accuracy with 26.32 % classification error. Using triweight kernel discriminant for data preprocessing of family planning participation of childbearing age couples in Semarang City of 2016 can be stated better than with normal kernel discriminant.
Madison, Matthew J; Bradshaw, Laine P
2015-06-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other multidimensional measurement models. A priori specifications of which latent characteristics or attributes are measured by each item are a core element of the diagnostic assessment design. This item-attribute alignment, expressed in a Q-matrix, precedes and supports any inference resulting from the application of the diagnostic classification model. This study investigates the effects of Q-matrix design on classification accuracy for the log-linear cognitive diagnosis model. Results indicate that classification accuracy, reliability, and convergence rates improve when the Q-matrix contains isolated information from each measured attribute.
The study of vehicle classification equipment with solutions to improve accuracy in Oklahoma.
DOT National Transportation Integrated Search
2014-12-01
The accuracy of vehicle counting and classification data is vital for appropriate future highway and road : design, including determining pavement characteristics, eliminating traffic jams, and improving safety. : Organizations relying on vehicle cla...
Comparison of Endoanal Ultrasound with Clinical Diagnosis in Anal Fistula Assessment.
Sirikurnpiboon, Siripong; Phadhana-anake, Oradee; Awapittaya, Burin
2016-02-01
Anal fistula anatomy and its relationship with anal sphincters are important factors influencing the results of surgical management. Pre-operative definitions of fistulous track(s) and the internal opening play a primary role in minimizing damage to the sphincters and recurrence of the fistula. To evaluate the relative accuracy of digital examination and endoanal ultrasound for pre-operative assessment of anal fistula by comparing operative findings. A retrospective review was conducted of all patients with anal fistula admitted to the surgical unit between May 2008 and May 2012. Physical examination and hydrogen peroxide-enhanced endoanal ultrasound (utilising a 10 MHz endoprobe, HITACHI: EUB-7500), were performed in 142 consecutive patients. Results were matched with surgical features to establish their accuracy in preoperative anal fistula assessment. A total of 142 patients (107 men, 35 women), 28 of whom had had previous surgery, were included in the study. Their mean age was 40 (range 18-71) years and their mean BMI was 26.37 (range 17.30-36.11) kg/m². The majority of the fistulas were transphincteric (90.4%) and the rest were intersphincteric (9.6%). The accuracy rates of clinical examination and endoanal ultrasound were 55.63 and 95.07 percent (p < 0.01), respectively. Endoanal ultrasound is superior to digital examination for pre-operative classification of anal fistula
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Prediction of individual season of birth using MRI.
Pantazatos, Spiro P
2014-03-01
Previous research suggests statistical associations between season of birth (SOB) with prevalence of neurobehavioral disorders such as schizophrenia and bipolar disorder, personality traits such as novelty and sensation seeking, and suicidal behavior. These effects are thought to be mediated by seasonal differences in perinatal photoperiod, which was recently shown to imprint circadian clock neurons and behavior in rodents. However, it is unknown whether SOB is associated with any measurable differences in the normal human adult brain, and whether individual SOB can be deduced based on phenotype. Here I show that SOB predicts morphological differences in brain structure, and that MRI scans carry spatially distributed information allowing significantly above chance prediction of an individual's SOB. Using an open source database of over 550 structural brain scans, Voxel-Based Morphometry (VBM) analysis showed a significant SOB effect in the left superior temporal gyrus (STG) in males (p=0.009, FWE whole-brain corrected), with greater gray matter volumes in fall and winter births. A cosinor analysis revealed a significant annual periodicity in the left STG gray matter volume (Zero Amplitude Test: p<5×10(-7)), with a peak towards the end of December and a nadir towards the end of June, suggesting that perinatal photoperiod accounts for this SOB effect. Whole-brain VBM maps were used as input features to multivariate machine-learning based analyses to classify SOB. Significantly greater than chance prediction was achieved in females (overall accuracy 35%, p<0.001), but not in males (overall accuracy 26%, p=0.45). Pairwise binary classification in females revealed that the highest discrimination was obtained for winter vs. summer classification (peak area under the ROC curve=0.71, p<0.0005). Discriminating regions included fusiform and middle temporal gyrus, inferior and superior parietal lobe, cerebellum, and dorsolateral and dorsomedial prefrontal cortex. Results indicate that SOB is detectable with MRI, imply that SOB exerts effects on the developing human brain that persist through adulthood, and suggest that neuroimaging may be a valuable intermediate phenotype in bridging the gap between SOB and personality and neurobehavioral disorders. Copyright © 2013 Elsevier Inc. All rights reserved.
Byun, Wonwoo; Lee, Jung-Min; Kim, Youngwon; Brusseau, Timothy A
2018-03-26
This study examined the accuracy of the Fitbit activity tracker (FF) for quantifying sedentary behavior (SB) and varying intensities of physical activity (PA) in 3-5-year-old children. Twenty-eight healthy preschool-aged children (Girls: 46%, Mean age: 4.8 ± 1.0 years) wore the FF and were directly observed while performing a set of various unstructured and structured free-living activities from sedentary to vigorous intensity. The classification accuracy of the FF for measuring SB, light PA (LPA), moderate-to-vigorous PA (MVPA), and total PA (TPA) was examined calculating Pearson correlation coefficients (r), mean absolute percent error (MAPE), Cohen's kappa ( k ), sensitivity (Se), specificity (Sp), and area under the receiver operating curve (ROC-AUC). The classification accuracies of the FF (ROC-AUC) were 0.92, 0.63, 0.77 and 0.92 for SB, LPA, MVPA and TPA, respectively. Similarly, values of kappa, Se, Sp and percentage of correct classification were consistently high for SB and TPA, but low for LPA and MVPA. The FF demonstrated excellent classification accuracy for assessing SB and TPA, but lower accuracy for classifying LPA and MVPA. Our findings suggest that the FF should be considered as a valid instrument for assessing time spent sedentary and overall physical activity in preschool-aged children.
Ground Truth Sampling and LANDSAT Accuracy Assessment
NASA Technical Reports Server (NTRS)
Robinson, J. W.; Gunther, F. J.; Campbell, W. J.
1982-01-01
It is noted that the key factor in any accuracy assessment of remote sensing data is the method used for determining the ground truth, independent of the remote sensing data itself. The sampling and accuracy procedures developed for nuclear power plant siting study are described. The purpose of the sampling procedure was to provide data for developing supervised classifications for two study sites and for assessing the accuracy of that and the other procedures used. The purpose of the accuracy assessment was to allow the comparison of the cost and accuracy of various classification procedures as applied to various data types.
Multi-Temporal Classification and Change Detection Using Uav Images
NASA Astrophysics Data System (ADS)
Makuti, S.; Nex, F.; Yang, M. Y.
2018-05-01
In this paper different methodologies for the classification and change detection of UAV image blocks are explored. UAV is not only the cheapest platform for image acquisition but it is also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state of the art features have been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was analysed using the overall accuracy, where by post classification have an accuracy of up to 62.6 % and the pre classification change detection have an accuracy of 46.5 %. These results represent a first useful indication for future works and developments.
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1984-01-01
An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.
NASA Technical Reports Server (NTRS)
Robinson, Julie A.; Webb, Edward L.; Evangelista, Arlene
2000-01-01
Studies that utilize astronaut-acquired orbital photographs for visual or digital classification require high-quality data to ensure accuracy. The majority of images available must be digitized from film and electronically transferred to scientific users. This study examined the effect of scanning spatial resolution (1200, 2400 pixels per inch [21.2 and 10.6 microns/pixel]), scanning density range option (Auto, Full) and compression ratio (non-lossy [TIFF], and lossy JPEG 10:1, 46:1, 83:1) on digital classification results of an orbital photograph from the NASA - Johnson Space Center archive. Qualitative results suggested that 1200 ppi was acceptable for visual interpretive uses for major land cover types. Moreover, Auto scanning density range was superior to Full density range. Quantitative assessment of the processing steps indicated that, while 2400 ppi scanning spatial resolution resulted in more classified polygons as well as a substantially greater proportion of polygons < 0.2 ha, overall agreement between 1200 ppi and 2400 ppi was quite high. JPEG compression up to approximately 46:1 also did not appear to have a major impact on quantitative classification characteristics. We conclude that both 1200 and 2400 ppi scanning resolutions are acceptable options for this level of land cover classification, as well as a compression ratio at or below approximately 46:1. Auto range density should always be used during scanning because it acquires more of the information from the film. The particular combination of scanning spatial resolution and compression level will require a case-by-case decision and will depend upon memory capabilities, analytical objectives and the spatial properties of the objects in the image.
NASA Astrophysics Data System (ADS)
Luo, Yiping; Jiang, Ting; Gao, Shengli; Wang, Xin
2010-10-01
It presents a new approach for detecting building footprints in a combination of registered aerial image with multispectral bands and airborne laser scanning data synchronously obtained by Leica-Geosystems ALS40 and Applanix DACS-301 on the same platform. A two-step method for building detection was presented consisting of selecting 'building' candidate points and then classifying candidate points. A digital surface model(DSM) derived from last pulse laser scanning data was first filtered and the laser points were classified into classes 'ground' and 'building or tree' based on mathematic morphological filter. Then, 'ground' points were resample into digital elevation model(DEM), and a Normalized DSM(nDSM) was generated from DEM and DSM. The candidate points were selected from 'building or tree' points by height value and area threshold in nDSM. The candidate points were further classified into building points and tree points by using the support vector machines(SVM) classification method. Two classification tests were carried out using features only from laser scanning data and associated features from two input data sources. The features included height, height finite difference, RGB bands value, and so on. The RGB value of points was acquired by matching laser scanning data and image using collinear equation. The features of training points were presented as input data for SVM classification method, and cross validation was used to select best classification parameters. The determinant function could be constructed by the classification parameters and the class of candidate points was determined by determinant function. The result showed that associated features from two input data sources were superior to features only from laser scanning data. The accuracy of more than 90% was achieved for buildings in first kind of features.
Jiao, Y; Chen, R; Ke, X; Cheng, L; Chu, K; Lu, Z; Herskovits, E H
2011-01-01
Autism spectrum disorder (ASD) is a neurodevelopmental disorder, of which Asperger syndrome and high-functioning autism are subtypes. Our goal is: 1) to determine whether a diagnostic model based on single-nucleotide polymorphisms (SNPs), brain regional thickness measurements, or brain regional volume measurements can distinguish Asperger syndrome from high-functioning autism; and 2) to compare the SNP, thickness, and volume-based diagnostic models. Our study included 18 children with ASD: 13 subjects with high-functioning autism and 5 subjects with Asperger syndrome. For each child, we obtained 25 SNPs for 8 ASD-related genes; we also computed regional cortical thicknesses and volumes for 66 brain structures, based on structural magnetic resonance (MR) examination. To generate diagnostic models, we employed five machine-learning techniques: decision stump, alternating decision trees, multi-class alternating decision trees, logistic model trees, and support vector machines. For SNP-based classification, three decision-tree-based models performed better than the other two machine-learning models. The performance metrics for three decision-tree-based models were similar: decision stump was modestly better than the other two methods, with accuracy = 90%, sensitivity = 0.95 and specificity = 0.75. All thickness and volume-based diagnostic models performed poorly. The SNP-based diagnostic models were superior to those based on thickness and volume. For SNP-based classification, rs878960 in GABRB3 (gamma-aminobutyric acid A receptor, beta 3) was selected by all tree-based models. Our analysis demonstrated that SNP-based classification was more accurate than morphometry-based classification in ASD subtype classification. Also, we found that one SNP--rs878960 in GABRB3--distinguishes Asperger syndrome from high-functioning autism.
NASA Technical Reports Server (NTRS)
Rignot, Eric; Williams, Cynthia; Way, Jobea; Viereck, Leslie
1993-01-01
A maximum a posteriori Bayesian classifier for multifrequency polarimetric SAR data is used to perform a supervised classification of forest types in the floodplains of Alaska. The image classes include white spruce, balsam poplar, black spruce, alder, non-forests, and open water. The authors investigate the effect on classification accuracy of changing environmental conditions, and of frequency and polarization of the signal. The highest classification accuracy (86 percent correctly classified forest pixels, and 91 percent overall) is obtained combining L- and C-band frequencies fully polarimetric on a date where the forest is just recovering from flooding. The forest map compares favorably with a vegetation map assembled from digitized aerial photos which took five years for completion, and address the state of the forest in 1978, ignoring subsequent fires, changes in the course of the river, clear-cutting of trees, and tree growth. HV-polarization is the most useful polarization at L- and C-band for classification. C-band VV (ERS-1 mode) and L-band HH (J-ERS-1 mode) alone or combined yield unsatisfactory classification accuracies. Additional data acquired in the winter season during thawed and frozen days yield classification accuracies respectively 20 percent and 30 percent lower due to a greater confusion between conifers and deciduous trees. Data acquired at the peak of flooding in May 1991 also yield classification accuracies 10 percent lower because of dominant trunk-ground interactions which mask out finer differences in radar backscatter between tree species. Combination of several of these dates does not improve classification accuracy. For comparison, panchromatic optical data acquired by SPOT in the summer season of 1991 are used to classify the same area. The classification accuracy (78 percent for the forest types and 90 percent if open water is included) is lower than that obtained with AIRSAR although conifers and deciduous trees are better separated due to the presence of leaves on the deciduous trees. Optical data do not separate black spruce and white spruce as well as SAR data, cannot separate alder from balsam poplar, and are of course limited by the frequent cloud cover in the polar regions. Yet, combining SPOT and AIRSAR offers better chances to identify vegetation types independent of ground truth information using a combination of NDVI indexes from SPOT, biomass numbers from AIRSAR, and a segmentation map from either one.
NASA Astrophysics Data System (ADS)
Karakacan Kuzucu, A.; Bektas Balcik, F.
2017-11-01
Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. But this information still remains a challenge especially in heterogeneous landscapes covering urban and rural areas due to spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of spectral vegetation indices combination with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO), the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and support vector machine (SVM) algorithm were applied to classify SPOT 7 image. Catalca is selected region located in the north west of the Istanbul in Turkey, which has complex landscape covering artificial surface, forest and natural area, agricultural field, quarry/mining area, pasture/scrubland and water body. Accuracy assessment of all classified images was performed through overall accuracy and kappa coefficient. The results indicated that the incorporation of these three different vegetation indices decrease the classification accuracy for the MLC and SVM classification. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification approach in both overall accuracy and kappa statistics.
The Effect of Normalization in Violence Video Classification Performance
NASA Astrophysics Data System (ADS)
Ali, Ashikin; Senan, Norhalina
2017-08-01
Basically, data pre-processing is an important part of data mining. Normalization is a pre-processing stage for any type of problem statement, especially in video classification. Challenging problems that arises in video classification is because of the heterogeneous content, large variations in video quality and complex semantic meanings of the concepts involved. Therefore, to regularize this problem, it is thoughtful to ensure normalization or basically involvement of thorough pre-processing stage aids the robustness of classification performance. This process is to scale all the numeric variables into certain range to make it more meaningful for further phases in available data mining techniques. Thus, this paper attempts to examine the effect of 2 normalization techniques namely Min-max normalization and Z-score in violence video classifications towards the performance of classification rate using Multi-layer perceptron (MLP) classifier. Using Min-Max Normalization range of [0,1] the result shows almost 98% of accuracy, meanwhile Min-Max Normalization range of [-1,1] accuracy is 59% and for Z-score the accuracy is 50%.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features
Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang
2016-08-01
Motion classification system based on surface Electromyography (sEMG) pattern recognition has achieved good results in experimental condition. But it is still a challenge for clinical implement and practical application. Many factors contribute to the difficulty of clinical use of the EMG based dexterous control. The most obvious and important is the noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of signal and biological signals such as Electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction of classification accuracy caused by electrode shift. The average classification accuracy of our method were above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and discovered the strong correlation with correlation coefficient of >0.9 between shift position data and normal position data.
Rifai Chai; Naik, Ganesh R; Tran, Yvonne; Sai Ho Ling; Craig, Ashley; Nguyen, Hung T
2015-08-01
An electroencephalography (EEG)-based counter measure device could be used for fatigue detection during driving. This paper explores the classification of fatigue and alert states using power spectral density (PSD) as a feature extractor and fuzzy swarm based-artificial neural network (ANN) as a classifier. An independent component analysis of entropy rate bound minimization (ICA-ERBM) is investigated as a novel source separation technique for fatigue classification using EEG analysis. A comparison of the classification accuracy of source separator versus no source separator is presented. Classification performance based on 43 participants without the inclusion of the source separator resulted in an overall sensitivity of 71.67%, a specificity of 75.63% and an accuracy of 73.65%. However, these results were improved after the inclusion of a source separator module, resulting in an overall sensitivity of 78.16%, a specificity of 79.60% and an accuracy of 78.88% (p <; 0.05).
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted into 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilk's Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All commonly occurring six gregarious tree species, viz., white oak, brown oak, chir pine, blue pine, cedar and fir in western Himalaya could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). It was noticed that classification accuracy achieved with Hyperion bands was significantly higher than Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). Study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
Boursier, Jérôme; Bertrais, Sandrine; Oberti, Frédéric; Gallois, Yves; Fouchard-Hubert, Isabelle; Rousselet, Marie-Christine; Zarski, Jean-Pierre; Calès, Paul
2011-11-30
Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10-3) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10-3). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10-3) or Fibrotest (0.84 ± 0.80, p < 10-3). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10-3 (p < 10-3). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10-3 (p < 10-3). The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test.
2011-01-01
Background Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10-3) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10-3). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10-3) or Fibrotest (0.84 ± 0.80, p < 10-3). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10-3 (p < 10-3). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10-3 (p < 10-3). Conclusions The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test. PMID:22129438
NASA Astrophysics Data System (ADS)
Hall-Brown, Mary
The heterogeneity of Arctic vegetation can make land cover classification vey difficult when using medium to small resolution imagery (Schneider et al., 2009; Muller et al., 1999). Using high radiometric and spatial resolution imagery, such as the SPOT 5 and IKONOS satellites, have helped arctic land cover classification accuracies rise into the 80 and 90 percentiles (Allard, 2003; Stine et al., 2010; Muller et al., 1999). However, those increases usually come at a high price. High resolution imagery is very expensive and can often add tens of thousands of dollars onto the cost of the research. The EO-1 satellite launched in 2002 carries two sensors that have high specral and/or high spatial resolutions and can be an acceptable compromise between the resolution versus cost issues. The Hyperion is a hyperspectral sensor with the capability of collecting 242 spectral bands of information. The Advanced Land Imager (ALI) is an advanced multispectral sensor whose spatial resolution can be sharpened to 10 meters. This dissertation compares the accuracies of arctic land cover classifications produced by the Hyperion and ALI sensors to the classification accuracies produced by the Systeme Pour l' Observation de le Terre (SPOT), the Landsat Thematic Mapper (TM) and the Landsat Enhanced Thematic Mapper Plus (ETM+) sensors. Hyperion and ALI images from August 2004 were collected over the Upper Kuparuk River Basin, Alaska. Image processing included the stepwise discriminant analysis of pixels that were positively classified from coinciding ground control points, geometric and radiometric correction, and principle component analysis. Finally, stratified random sampling was used to perform accuracy assessments on satellite derived land cover classifications. Accuracy was estimated from an error matrix (confusion matrix) that provided the overall, producer's and user's accuracies. This research found that while the Hyperion sensor produced classfication accuracies that were equivalent to the TM and ETM+ sensor (approximately 78%), the Hyperion could not obtain the accuracy of the SPOT 5 HRV sensor. However, the land cover classifications derived from the ALI sensor exceeded most classification accuracies derived from the TM and ETM+ senors and were even comparable to most SPOT 5 HRV classifications (87%). With the deactivation of the Landsat series satellites, the monitoring of remote locations such as in the Arctic on an uninterupted basis thoughout the world is in jeopardy. The utilization of the Hyperion and ALI sensors are a way to keep that endeavor operational. By keeping the ALI sensor active at all times, uninterupted observation of the entire Earth can be accomplished. Keeping the Hyperion sensor as a "tasked" sensor can provide scientists with additional imagery and options for their studies without overburdening storage issues.
Evaluation of space SAR as a land-cover classification
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Williams, T. H. L.
1985-01-01
The multidimensional approach to the mapping of land cover, crops, and forests is reported. Dimensionality is achieved by using data from sensors such as LANDSAT to augment Seasat and Shuttle Image Radar (SIR) data, using different image features such as tone and texture, and acquiring multidate data. Seasat, Shuttle Imaging Radar (SIR-A), and LANDSAT data are used both individually and in combination to map land cover in Oklahoma. The results indicates that radar is the best single sensor (72% accuracy) and produces the best sensor combination (97.5% accuracy) for discriminating among five land cover categories. Multidate Seasat data and a single data of LANDSAT coverage are then used in a crop classification study of western Kansas. The highest accuracy for a single channel is achieved using a Seasat scene, which produces a classification accuracy of 67%. Classification accuracy increases to approximately 75% when either a multidate Seasat combination or LANDSAT data in a multisensor combination is used. The tonal and textural elements of SIR-A data are then used both alone and in combination to classify forests into five categories.
Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens Hyper spectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such Hyper spectral data provides a challenge to the current technique for analyzing data. Conventional classification methods may not be useful without dimension reduction pre-processing. So dimension reduction has become a significant part of Hyper spectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using Wavelet Decomposition could be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture the polynomial trends while Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets are compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction has more separate classes and yields better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, it is found that the performance of classification of Daubechies wavelets is better as compared to Haar wavelet while Daubechies takes more time compare to Haar wavelet. The experimental results demonstrate the classification system consistently provides over 84% classification accuracy.
ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.
Rosenfield, George H.; Fitzpatrick-Lins, Katherine
1984-01-01
Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts corrected, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0. 51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
Research on Remote Sensing Image Classification Based on Feature Level Fusion
NASA Astrophysics Data System (ADS)
Yuan, L.; Zhu, G.
2018-04-01
Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, in the process of existing classification algorithms, there still exists the phenomenon of misclassification and missing points, which leads to the final classification accuracy is not high. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources, and propose a classification method based on feature level fusion. Compare three kind of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform), and then select the best fused image for the classification experimental. In the classification process, we choose four kinds of image classification algorithms (i.e. Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) to do contrast experiment. We use overall classification precision and Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than other methods. In four kinds of classification algorithms, the fused image has the best applicability to Support Vector Machine classification, the overall classification precision is 94.01 % and the Kappa coefficients is 0.91. The fused image with Sentinel-1A and Landsat8 OLI is not only have more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial to improve the accuracy and stability of remote sensing image classification.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
AVHRR composite period selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Multitemporal satellite image datasets provide valuable information on the phenological characteristics of vegetation, thereby significantly increasing the accuracy of cover type classifications compared to single date classifications. However, the processing of these datasets can become very complex when dealing with multitemporal data combined with multispectral data. Advanced Very High Resolution Radiometer (AVHRR) biweekly composite data are commonly used to classify land cover over large regions. Selecting a subset of these biweekly composite periods may be required to reduce the complexity and cost of land cover mapping. The objective of our research was to evaluate the effect of reducing the number of composite periods and altering the spacing of those composite periods on classification accuracy. Because inter-annual variability can have a major impact on classification results, 5 years of AVHRR data were evaluated. AVHRR biweekly composite images for spectral channels 1-4 (visible, near-infrared and two thermal bands) covering the entire growing season were used to classify 14 cover types over the entire state of Colorado for each of five different years. A supervised classification method was applied to maintain consistent procedures for each case tested. Results indicate that the number of composite periods can be halved-reduced from 14 composite dates to seven composite dates-without significantly reducing overall classification accuracy (80.4% Kappa accuracy for the 14-composite data-set as compared to 80.0% for a seven-composite dataset). At least seven composite periods were required to ensure the classification accuracy was not affected by inter-annual variability due to climate fluctuations. Concentrating more composites near the beginning and end of the growing season, as compared to using evenly spaced time periods, consistently produced slightly higher classification values over the 5 years tested (average Kappa) of 80.3% for the heavy early/late case as compared to 79.0% for the alternate dataset case).
Ben Chaabane, Salim; Fnaiech, Farhat
2014-01-23
Color image segmentation has been so far applied in many areas; hence, recently many different techniques have been developed and proposed. In the medical imaging area, the image segmentation may be helpful to provide assistance to doctor in order to follow-up the disease of a certain patient from the breast cancer processed images. The main objective of this work is to rebuild and also to enhance each cell from the three component images provided by an input image. Indeed, from an initial segmentation obtained using the statistical features and histogram threshold techniques, the resulting segmentation may represent accurately the non complete and pasted cells and enhance them. This allows real help to doctors, and consequently, these cells become clear and easy to be counted. A novel method for color edges extraction based on statistical features and automatic threshold is presented. The traditional edge detector, based on the first and the second order neighborhood, describing the relationship between the current pixel and its neighbors, is extended to the statistical domain. Hence, color edges in an image are obtained by combining the statistical features and the automatic threshold techniques. Finally, on the obtained color edges with specific primitive color, a combination rule is used to integrate the edge results over the three color components. Breast cancer cell images were used to evaluate the performance of the proposed method both quantitatively and qualitatively. Hence, a visual and a numerical assessment based on the probability of correct classification (PC), the false classification (Pf), and the classification accuracy (Sens(%)) are presented and compared with existing techniques. The proposed method shows its superiority in the detection of points which really belong to the cells, and also the facility of counting the number of the processed cells. Computer simulations highlight that the proposed method substantially enhances the segmented image with smaller error rates better than other existing algorithms under the same settings (patterns and parameters). Moreover, it provides high classification accuracy, reaching the rate of 97.94%. Additionally, the segmentation method may be extended to other medical imaging types having similar properties.
NASA Astrophysics Data System (ADS)
Titschack, J.; Baum, D.; Matsuyama, K.; Boos, K.; Färber, C.; Kahl, W.-A.; Ehrig, K.; Meinel, D.; Soriano, C.; Stock, S. R.
2018-06-01
During the last decades, X-ray (micro-)computed tomography has gained increasing attention for the description of porous skeletal and shell structures of various organism groups. However, their quantitative analysis is often hampered by the difficulty to discriminate cavities and pores within the object from the surrounding region. Herein, we test the ambient occlusion (AO) algorithm and newly implemented optimisations for the segmentation of cavities (implemented in the software Amira). The segmentation accuracy is evaluated as a function of (i) changes in the ray length input variable, and (ii) the usage of AO (scalar) field and other AO-derived (scalar) fields. The results clearly indicate that the AO field itself outperforms all other AO-derived fields in terms of segmentation accuracy and robustness against variations in the ray length input variable. The newly implemented optimisations improved the AO field-based segmentation only slightly, while the segmentations based on the AO-derived fields improved considerably. Additionally, we evaluated the potential of the AO field and AO-derived fields for the separation and classification of cavities as well as skeletal structures by comparing them with commonly used distance-map-based segmentations. For this, we tested the zooid separation within a bryozoan colony, the stereom classification of an ophiuroid tooth, the separation of bioerosion traces within a marble block and the calice (central cavity)-pore separation within a dendrophyllid coral. The obtained results clearly indicate that the ideal input field depends on the three-dimensional morphology of the object of interest. The segmentations based on the AO-derived fields often provided cavity separations and skeleton classifications that were superior to or impossible to obtain with commonly used distance-map-based segmentations. The combined usage of various AO-derived fields by supervised or unsupervised segmentation algorithms might provide a promising target for future research to further improve the results for this kind of high-end data segmentation and classification. Furthermore, the application of the developed segmentation algorithm is not restricted to X-ray (micro-)computed tomographic data but may potentially be useful for the segmentation of 3D volume data from other sources.
Hao, Pengyu; Wang, Li; Niu, Zheng
2015-01-01
A range of single classifiers have been proposed to classify crop types using time series vegetation indices, and hybrid classifiers are used to improve discriminatory power. Traditional fusion rules use the product of multi-single classifiers, but that strategy cannot integrate the classification output of machine learning classifiers. In this research, the performance of two hybrid strategies, multiple voting (M-voting) and probabilistic fusion (P-fusion), for crop classification using NDVI time series were tested with different training sample sizes at both pixel and object levels, and two representative counties in north Xinjiang were selected as study area. The single classifiers employed in this research included Random Forest (RF), Support Vector Machine (SVM), and See 5 (C 5.0). The results indicated that classification performance improved (increased the mean overall accuracy by 5%~10%, and reduced standard deviation of overall accuracy by around 1%) substantially with the training sample number, and when the training sample size was small (50 or 100 training samples), hybrid classifiers substantially outperformed single classifiers with higher mean overall accuracy (1%~2%). However, when abundant training samples (4,000) were employed, single classifiers could achieve good classification accuracy, and all classifiers obtained similar performances. Additionally, although object-based classification did not improve accuracy, it resulted in greater visual appeal, especially in study areas with a heterogeneous cropping pattern. PMID:26360597
Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp
2015-01-01
Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
2013-01-01
Background and purpose Guidelines for fracture treatment and evaluation require a valid classification. Classifications especially designed for children are available, but they might lead to reduced accuracy, considering the relative infrequency of childhood fractures in a general orthopedic department. We tested the reliability and accuracy of the Müller classification when used for long bone fractures in children. Methods We included all long bone fractures in children aged < 16 years who were treated in 2008 at the surgical ward of Stavanger University Hospital. 20 surgeons recorded 232 fractures. Datasets were generated for intra- and inter-rater analysis, as well as a reference dataset for accuracy calculations. We present proportion of agreement (PA) and kappa (K) statistics. Results For intra-rater analysis, overall agreement (κ) was 0.75 (95% CI: 0.68–0.81) and PA was 79%. For inter-rater assessment, K was 0.71 (95% CI: 0.61–0.80) and PA was 77%. Accuracy was estimated: κ = 0.72 (95% CI: 0.64–0.79) and PA = 76%. Interpretation The Müller classification (slightly adjusted for pediatric fractures) showed substantial to excellent accuracy among general orthopedic surgeons when applied to long bone fractures in children. However, separate knowledge about the child-specific fracture pattern, the maturity of the bone, and the degree of displacement must be considered when the treatment and the prognosis of the fractures are evaluated. PMID:23245225
Li, Jiangeng; Su, Lei; Pang, Zenan
2015-12-01
Feature selection techniques have been widely applied to tumor gene expression data analysis in recent years. A filter feature selection method named marginal Fisher analysis score (MFA score) which is based on graph embedding has been proposed, and it has been widely used mainly because it is superior to Fisher score. Considering the heavy redundancy in gene expression data, we proposed a new filter feature selection technique in this paper. It is named MFA score+ and is based on MFA score and redundancy excluding. We applied it to an artificial dataset and eight tumor gene expression datasets to select important features and then used support vector machine as the classifier to classify the samples. Compared with MFA score, t test and Fisher score, it achieved higher classification accuracy.
NASA Astrophysics Data System (ADS)
Pfennigbauer, Martin; Ullrich, Andreas
2010-04-01
Newest developments in laser scanner technologies put surveyors in the position to comply with the ever increasing demand of high-speed, high-accuracy, and highly reliable data acquisition from terrestrial, mobile, and airborne platforms. Echo digitization in pulsed time-of-flight laser ranging has demonstrated its superior performance in the field of bathymetry and airborne laser scanning for more than a decade, however at the cost of somewhat time consuming off line post processing. State-of-the-art online waveform processing as implemented in RIEGL's V-Line not only saves users post-processing time to obtain true 3D point clouds, it also adds the assets of calibrated amplitude and reflectance measurement for data classification and pulse deviation determination for effective and reliable data validation. We present results from data acquisitions in different complex target situations.
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.
Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
Word pair classification during imagined speech using direct brain recordings
NASA Astrophysics Data System (ADS)
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-05-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70-150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58% p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
Word pair classification during imagined speech using direct brain recordings
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-01-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications. PMID:27165452
Comparing ecoregional classifications for natural areas management in the Klamath Region, USA
Sarr, Daniel A.; Duff, Andrew; Dinger, Eric C.; Shafer, Sarah L.; Wing, Michael; Seavy, Nathaniel E.; Alexander, John D.
2015-01-01
We compared three existing ecoregional classification schemes (Bailey, Omernik, and World Wildlife Fund) with two derived schemes (Omernik Revised and Climate Zones) to explore their effectiveness in explaining species distributions and to better understand natural resource geography in the Klamath Region, USA. We analyzed presence/absence data derived from digital distribution maps for trees, amphibians, large mammals, small mammals, migrant birds, and resident birds using three statistical analyses of classification accuracy (Analysis of Similarity, Canonical Analysis of Principal Coordinates, and Classification Strength). The classifications were roughly comparable in classification accuracy, with Omernik Revised showing the best overall performance. Trees showed the strongest fidelity to the classifications, and large mammals showed the weakest fidelity. We discuss the implications for regional biogeography and describe how intermediate resolution ecoregional classifications may be appropriate for use as natural areas management domains.
a Gsa-Svm Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paperhybridizesgravitational search algorithm (GSA) with support vector machine (SVM) and made a novel GSA-SVM hybrid system to improve the classification accuracy in binary problems. GSA is an optimization heuristic toolused to optimize the value of SVM kernel parameter (in this paper, radial basis function (RBF) is chosen as the kernel function). The experimental results show that this newapproach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Typicality effects in artificial categories: is there a hemisphere difference?
Richards, L G; Chiarello, C
1990-07-01
In category classification tasks, typicality effects are usually found: accuracy and reaction time depend upon distance from a prototype. In this study, subjects learned either verbal or nonverbal dot pattern categories, followed by a lateralized classification task. Comparable typicality effects were found in both reaction time and accuracy across visual fields for both verbal and nonverbal categories. Both hemispheres appeared to use a similarity-to-prototype matching strategy in classification. This indicates that merely having a verbal label does not differentiate classification in the two hemispheres.
Multi-site evaluation of IKONOS data for classification of tropical coral reef environments
Andrefouet, S.; Kramer, Philip; Torres-Pulliza, D.; Joyce, K.E.; Hochberg, E.J.; Garza-Perez, R.; Mumby, P.J.; Riegl, Bernhard; Yamano, H.; White, W.H.; Zubia, M.; Brock, J.C.; Phinn, S.R.; Naseer, A.; Hatcher, B.G.; Muller-Karger, F. E.
2003-01-01
Ten IKONOS images of different coral reef sites distributed around the world were processed to assess the potential of 4-m resolution multispectral data for coral reef habitat mapping. Complexity of reef environments, established by field observation, ranged from 3 to 15 classes of benthic habitats containing various combinations of sediments, carbonate pavement, seagrass, algae, and corals in different geomorphologic zones (forereef, lagoon, patch reef, reef flats). Processing included corrections for sea surface roughness and bathymetry, unsupervised or supervised classification, and accuracy assessment based on ground-truth data. IKONOS classification results were compared with classified Landsat 7 imagery for simple to moderate complexity of reef habitats (5-11 classes). For both sensors, overall accuracies of the classifications show a general linear trend of decreasing accuracy with increasing habitat complexity. The IKONOS sensor performed better, with a 15-20% improvement in accuracy compared to Landsat. For IKONOS, overall accuracy was 77% for 4-5 classes, 71% for 7-8 classes, 65% in 9-11 classes, and 53% for more than 13 classes. The Landsat classification accuracy was systematically lower, with an average of 56% for 5-10 classes. Within this general trend, inter-site comparisons and specificities demonstrate the benefits of different approaches. Pre-segmentation of the different geomorphologic zones and depth correction provided different advantages in different environments. Our results help guide scientists and managers in applying IKONOS-class data for coral reef mapping applications. ?? 2003 Elsevier Inc. All rights reserved.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182
NASA Astrophysics Data System (ADS)
Dondurur, Mehmet
The primary objective of this study was to determine the degree to which modern SAR systems can be used to obtain information about the Earth's vegetative resources. Information obtainable from microwave synthetic aperture radar (SAR) data was compared with that obtainable from LANDSAT-TM and SPOT data. Three hypotheses were tested: (a) Classification of land cover/use from SAR data can be accomplished on a pixel-by-pixel basis with the same overall accuracy as from LANDSAT-TM and SPOT data. (b) Classification accuracy for individual land cover/use classes will differ between sensors. (c) Combining information derived from optical and SAR data into an integrated monitoring system will improve overall and individual land cover/use class accuracies. The study was conducted with three data sets for the Sleeping Bear Dunes test site in the northwestern part of Michigan's lower peninsula, including an October 1982 LANDSAT-TM scene, a June 1989 SPOT scene and C-, L- and P-Band radar data from the Jet Propulsion Laboratory AIRSAR. Reference data were derived from the Michigan Resource Information System (MIRIS) and available color infrared aerial photos. Classification and rectification of data sets were done using ERDAS Image Processing Programs. Classification algorithms included Maximum Likelihood, Mahalanobis Distance, Minimum Spectral Distance, ISODATA, Parallelepiped, and Sequential Cluster Analysis. Classified images were rectified as necessary so that all were at the same scale and oriented north-up. Results were analyzed with contingency tables and percent correctly classified (PCC) and Cohen's Kappa (CK) as accuracy indices using CSLANT and ImagePro programs developed for this study. Accuracy analyses were based upon a 1.4 by 6.5 km area with its long axis east-west. Reference data for this subscene total 55,770 15 by 15 m pixels with sixteen cover types, including seven level III forest classes, three level III urban classes, two level II range classes, two water classes, one wetland class and one agriculture class. An initial analysis was made without correcting the 1978 MIRIS reference data to the different dates of the TM, SPOT and SAR data sets. In this analysis, highest overall classification accuracy (PCC) was 87% with the TM data set, with both SPOT and C-Band SAR at 85%, a difference statistically significant at the 0.05 level. When the reference data were corrected for land cover change between 1978 and 1991, classification accuracy with the C-Band SAR data increased to 87%. Classification accuracy differed from sensor to sensor for individual land cover classes, Combining sensors into hypothetical multi-sensor systems resulted in higher accuracies than for any single sensor. Combining LANDSAT -TM and C-Band SAR yielded an overall classification accuracy (PCC) of 92%. The results of this study indicate that C-Band SAR data provide an acceptable substitute for LANDSAT-TM or SPOT data when land cover information is desired of areas where cloud cover obscures the terrain. Even better results can be obtained by integrating TM and C-Band SAR data into a multi-sensor system.
Multiplex profiling of tumor-associated proteolytic activity in serum of colorectal cancer patients.
Yepes, Diego; Costina, Victor; Pilz, Lothar R; Hofheinz, Ralf; Neumaier, Michael; Findeisen, Peter
2014-06-01
The monitoring of tumor-associated protease activity in blood specimens has recently been proposed as new diagnostic tool in cancer research. In this paper, we describe the screening of a peptide library for identification of reporter peptides (RPs) that are selectively cleaved in serum specimens from colorectal cancer patients and investigate the benefits of RP multiplexing. A library of 144 RPs was constructed that contained amino acid sequences of abundant plasma proteins. Proteolytic cleavage of RPs was monitored with MS. Five RPs that were selectively cleaved in serum specimens from tumor patients were selected for further validation in serum specimens of colorectal tumor patients (n = 30) and nonmalignant controls (n = 60). RP spiking and subsequent quantification of proteolytic fragments with LC-MS showed good reproducibility with CVs always below 26%. The linear discriminant analysis and PCA revealed that a combination of RPs for diagnostic classification is superior to single markers. Classification accuracy reached 88% (79/90) when all five markers were combined. Functional protease profiling with RPs might improve the laboratory-based diagnosis, monitoring and prognosis of malignant disease, and has to be evaluated thoroughly in future studies. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
An SVM-based solution for fault detection in wind turbines.
Santos, Pedro; Villa, Luisa F; Reñones, Aníbal; Bustillo, Andres; Maudes, Jesús
2015-03-09
Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.
Computer Based Melanocytic and Nevus Image Enhancement and Segmentation.
Jamil, Uzma; Akram, M Usman; Khalid, Shehzad; Abbas, Sarmad; Saleem, Kashif
2016-01-01
Digital dermoscopy aids dermatologists in monitoring potentially cancerous skin lesions. Melanoma is the 5th common form of skin cancer that is rare but the most dangerous. Melanoma is curable if it is detected at an early stage. Automated segmentation of cancerous lesion from normal skin is the most critical yet tricky part in computerized lesion detection and classification. The effectiveness and accuracy of lesion classification are critically dependent on the quality of lesion segmentation. In this paper, we have proposed a novel approach that can automatically preprocess the image and then segment the lesion. The system filters unwanted artifacts including hairs, gel, bubbles, and specular reflection. A novel approach is presented using the concept of wavelets for detection and inpainting the hairs present in the cancer images. The contrast of lesion with the skin is enhanced using adaptive sigmoidal function that takes care of the localized intensity distribution within a given lesion's images. We then present a segmentation approach to precisely segment the lesion from the background. The proposed approach is tested on the European database of dermoscopic images. Results are compared with the competitors to demonstrate the superiority of the suggested approach.
Prediction of brain-computer interface aptitude from individual brain structure.
Halder, S; Varkuti, B; Bogdan, M; Kübler, A; Rosenstiel, W; Sitaram, R; Birbaumer, N
2013-01-01
Brain-computer interface (BCI) provide a non-muscular communication channel for patients with impairments of the motor system. A significant number of BCI users is unable to obtain voluntary control of a BCI-system in proper time. This makes methods that can be used to determine the aptitude of a user necessary. We hypothesized that integrity and connectivity of involved white matter connections may serve as a predictor of individual BCI-performance. Therefore, we analyzed structural data from anatomical scans and DTI of motor imagery BCI-users differentiated into high and low BCI-aptitude groups based on their overall performance. Using a machine learning classification method we identified discriminating structural brain trait features and correlated the best features with a continuous measure of individual BCI-performance. Prediction of the aptitude group of each participant was possible with near perfect accuracy (one error). Tissue volumetric analysis yielded only poor classification results. In contrast, the structural integrity and myelination quality of deep white matter structures such as the Corpus Callosum, Cingulum, and Superior Fronto-Occipital Fascicle were positively correlated with individual BCI-performance. This confirms that structural brain traits contribute to individual performance in BCI use.
Prediction of brain-computer interface aptitude from individual brain structure
Halder, S.; Varkuti, B.; Bogdan, M.; Kübler, A.; Rosenstiel, W.; Sitaram, R.; Birbaumer, N.
2013-01-01
Objective: Brain-computer interface (BCI) provide a non-muscular communication channel for patients with impairments of the motor system. A significant number of BCI users is unable to obtain voluntary control of a BCI-system in proper time. This makes methods that can be used to determine the aptitude of a user necessary. Methods: We hypothesized that integrity and connectivity of involved white matter connections may serve as a predictor of individual BCI-performance. Therefore, we analyzed structural data from anatomical scans and DTI of motor imagery BCI-users differentiated into high and low BCI-aptitude groups based on their overall performance. Results: Using a machine learning classification method we identified discriminating structural brain trait features and correlated the best features with a continuous measure of individual BCI-performance. Prediction of the aptitude group of each participant was possible with near perfect accuracy (one error). Conclusions: Tissue volumetric analysis yielded only poor classification results. In contrast, the structural integrity and myelination quality of deep white matter structures such as the Corpus Callosum, Cingulum, and Superior Fronto-Occipital Fascicle were positively correlated with individual BCI-performance. Significance: This confirms that structural brain traits contribute to individual performance in BCI use. PMID:23565083
Brain-Computer Interface Based on Generation of Visual Images
Bobrov, Pavel; Frolov, Alexander; Cantor, Charles; Fedulova, Irina; Bakhnyan, Mikhail; Zhavoronkov, Alexander
2011-01-01
This paper examines the task of recognizing EEG patterns that correspond to performing three mental tasks: relaxation and imagining of two types of pictures: faces and houses. The experiments were performed using two EEG headsets: BrainProducts ActiCap and Emotiv EPOC. The Emotiv headset becomes widely used in consumer BCI application allowing for conducting large-scale EEG experiments in the future. Since classification accuracy significantly exceeded the level of random classification during the first three days of the experiment with EPOC headset, a control experiment was performed on the fourth day using ActiCap. The control experiment has shown that utilization of high-quality research equipment can enhance classification accuracy (up to 68% in some subjects) and that the accuracy is independent of the presence of EEG artifacts related to blinking and eye movement. This study also shows that computationally-inexpensive Bayesian classifier based on covariance matrix analysis yields similar classification accuracy in this problem as a more sophisticated Multi-class Common Spatial Patterns (MCSP) classifier. PMID:21695206
NASA Technical Reports Server (NTRS)
Mulligan, P. J.; Gervin, J. C.; Lu, Y. C.
1985-01-01
An area bordering the Eastern Shore of the Chesapeake Bay was selected for study and classified using unsupervised techniques applied to LANDSAT-2 MSS data and several band combinations of LANDSAT-4 TM data. The accuracies of these Level I land cover classifications were verified using the Taylor's Island USGS 7.5 minute topographic map which was photointerpreted, digitized and rasterized. The the Taylor's Island map, comparing the MSS and TM three band (2 3 4) classifications, the increased resolution of TM produced a small improvement in overall accuracy of 1% correct due primarily to a small improvement, and 1% and 3%, in areas such as water and woodland. This was expected as the MSS data typically produce high accuracies for categories which cover large contiguous areas. However, in the categories covering smaller areas within the map there was generally an improvement of at least 10%. Classification of the important residential category improved 12%, and wetlands were mapped with 11% greater accuracy.
NASA Astrophysics Data System (ADS)
Roychowdhury, K.
2016-06-01
Landcover is the easiest detectable indicator of human interventions on land. Urban and peri-urban areas present a complex combination of landcover, which makes classification challenging. This paper assesses the different methods of classifying landcover using dual polarimetric Sentinel-1 data collected during monsoon (July) and winter (December) months of 2015. Four broad landcover classes such as built up areas, water bodies and wetlands, vegetation and open spaces of Kolkata and its surrounding regions were identified. Polarimetric analyses were conducted on Single Look Complex (SLC) data of the region while ground range detected (GRD) data were used for spectral and spatial classification. Unsupervised classification by means of K-Means clustering used backscatter values and was able to identify homogenous landcovers over the study area. The results produced an overall accuracy of less than 50% for both the seasons. Higher classification accuracy (around 70%) was achieved by adding texture variables as inputs along with the backscatter values. However, the accuracy of classification increased significantly with polarimetric analyses. The overall accuracy was around 80% in Wishart H-A-Alpha unsupervised classification. The method was useful in identifying urban areas due to their double-bounce scattering and vegetated areas, which have more random scattering. Normalized Difference Built-up index (NDBI) and Normalized Difference Vegetation Index (NDVI) obtained from Landsat 8 data over the study area were used to verify vegetation and urban classes. The study compares the accuracies of different methods of classifying landcover using medium resolution SAR data in a complex urban area and suggests that polarimetric analyses present the most accurate results for urban and suburban areas.
Concussion classification via deep learning using whole-brain white matter fiber strains
Cai, Yunliang; Wu, Shaoju; Zhao, Wei; Li, Zhigang; Wu, Zheyang
2018-01-01
Developing an accurate and reliable injury predictor is central to the biomechanical studies of traumatic brain injury. State-of-the-art efforts continue to rely on empirical, scalar metrics based on kinematics or model-estimated tissue responses explicitly pre-defined in a specific brain region of interest. They could suffer from loss of information. A single training dataset has also been used to evaluate performance but without cross-validation. In this study, we developed a deep learning approach for concussion classification using implicit features of the entire voxel-wise white matter fiber strains. Using reconstructed American National Football League (NFL) injury cases, leave-one-out cross-validation was employed to objectively compare injury prediction performances against two baseline machine learning classifiers (support vector machine (SVM) and random forest (RF)) and four scalar metrics via univariate logistic regression (Brain Injury Criterion (BrIC), cumulative strain damage measure of the whole brain (CSDM-WB) and the corpus callosum (CSDM-CC), and peak fiber strain in the CC). Feature-based machine learning classifiers including deep learning, SVM, and RF consistently outperformed all scalar injury metrics across all performance categories (e.g., leave-one-out accuracy of 0.828–0.862 vs. 0.690–0.776, and .632+ error of 0.148–0.176 vs. 0.207–0.292). Further, deep learning achieved the best cross-validation accuracy, sensitivity, AUC, and .632+ error. These findings demonstrate the superior performances of deep learning in concussion prediction and suggest its promise for future applications in biomechanical investigations of traumatic brain injury. PMID:29795640
Concussion classification via deep learning using whole-brain white matter fiber strains.
Cai, Yunliang; Wu, Shaoju; Zhao, Wei; Li, Zhigang; Wu, Zheyang; Ji, Songbai
2018-01-01
Developing an accurate and reliable injury predictor is central to the biomechanical studies of traumatic brain injury. State-of-the-art efforts continue to rely on empirical, scalar metrics based on kinematics or model-estimated tissue responses explicitly pre-defined in a specific brain region of interest. They could suffer from loss of information. A single training dataset has also been used to evaluate performance but without cross-validation. In this study, we developed a deep learning approach for concussion classification using implicit features of the entire voxel-wise white matter fiber strains. Using reconstructed American National Football League (NFL) injury cases, leave-one-out cross-validation was employed to objectively compare injury prediction performances against two baseline machine learning classifiers (support vector machine (SVM) and random forest (RF)) and four scalar metrics via univariate logistic regression (Brain Injury Criterion (BrIC), cumulative strain damage measure of the whole brain (CSDM-WB) and the corpus callosum (CSDM-CC), and peak fiber strain in the CC). Feature-based machine learning classifiers including deep learning, SVM, and RF consistently outperformed all scalar injury metrics across all performance categories (e.g., leave-one-out accuracy of 0.828-0.862 vs. 0.690-0.776, and .632+ error of 0.148-0.176 vs. 0.207-0.292). Further, deep learning achieved the best cross-validation accuracy, sensitivity, AUC, and .632+ error. These findings demonstrate the superior performances of deep learning in concussion prediction and suggest its promise for future applications in biomechanical investigations of traumatic brain injury.
Liu, Chih-Hao; Du, Yong; Singh, Manmohan; Wu, Chen; Han, Zhaolong; Li, Jiasong; Chang, Anthony; Mohan, Chandra; Larin, Kirill V
2016-08-01
Acute glomerulonephritis caused by antiglomerular basement membrane marked by high mortality. The primary reason for this is delayed diagnosis via blood examination, urine analysis, tissue biopsy, or ultrasound and X-ray computed tomography imaging. Blood, urine, and tissue-based diagnoses can be time consuming, while ultrasound and CT imaging have relatively low spatial resolution, with reduced sensitivity. Optical coherence tomography is a noninvasive and high-resolution imaging technique that provides superior spatial resolution (micrometer scale) as compared to ultrasound and CT. Changes in tissue properties can be detected based on the optical metrics analyzed from the OCT signals, such as optical attenuation and speckle variance. Furthermore, OCT does not rely on ionizing radiation as with CT imaging. In addition to structural changes, the elasticity of the kidney can significantly change due to nephritis. In this work, OCT has been utilized to quantify the difference in tissue properties between healthy and nephritic murine kidneys. Although OCT imaging could identify the diseased tissue, its classification accuracy is clinically inadequate. By combining optical metrics with elasticity, the classification accuracy improves from 76% to 95%. These results show that OCT combined with OCE can be a powerful tool for identifying and classifying nephritis. Therefore, the OCT/OCE method could potentially be used as a minimally invasive tool for longitudinal studies during the progression and therapy of glomerulonephritis as well as complement and, perhaps, substitute highly invasive tissue biopsies. Elastic-wave propagation in mouse healthy and nephritic kidneys. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A survey of the dummy face and human face stimuli used in BCI paradigm.
Chen, Long; Jin, Jing; Zhang, Yu; Wang, Xingyu; Cichocki, Andrzej
2015-01-15
It was proved that the human face stimulus were superior to the flash only stimulus in BCI system. However, human face stimulus may lead to copyright infringement problems and was hard to be edited according to the requirement of the BCI study. Recently, it was reported that facial expression changes could be done by changing a curve in a dummy face which could obtain good performance when it was applied to visual-based P300 BCI systems. In this paper, four different paradigms were presented, which were called dummy face pattern, human face pattern, inverted dummy face pattern and inverted human face pattern, to evaluate the performance of the dummy faces stimuli compared with the human faces stimuli. The key point that determined the value of dummy faces in BCI systems were whether dummy faces stimuli could obtain as good performance as human faces stimuli. Online and offline results of four different paradigms would have been obtained and comparatively analyzed. Online and offline results showed that there was no significant difference among dummy faces and human faces in ERPs, classification accuracy and information transfer rate when they were applied in BCI systems. Dummy faces stimuli could evoke large ERPs and obtain as high classification accuracy and information transfer rate as the human faces stimuli. Since dummy faces were easy to be edited and had no copyright infringement problems, it would be a good choice for optimizing the stimuli of BCI systems. Copyright © 2014 Elsevier B.V. All rights reserved.
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis. PMID:25076868
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F; Joules, Richard; Catani, Marco; Williams, Steve C R; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no "magic bullet" for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis.
NASA Technical Reports Server (NTRS)
Fagan, Matthew E.; Defries, Ruth S.; Sesnie, Steven E.; Arroyo-Mora, J. Pablo; Soto, Carlomagno; Singh, Aditya; Townsend, Philip A.; Chazdon, Robin L.
2015-01-01
An efficient means to map tree plantations is needed to detect tropical land use change and evaluate reforestation projects. To analyze recent tree plantation expansion in northeastern Costa Rica, we examined the potential of combining moderate-resolution hyperspectral imagery (2005 HyMap mosaic) with multitemporal, multispectral data (Landsat) to accurately classify (1) general forest types and (2) tree plantations by species composition. Following a linear discriminant analysis to reduce data dimensionality, we compared four Random Forest classification models: hyperspectral data (HD) alone; HD plus interannual spectral metrics; HD plus a multitemporal forest regrowth classification; and all three models combined. The fourth, combined model achieved overall accuracy of 88.5%. Adding multitemporal data significantly improved classification accuracy (p less than 0.0001) of all forest types, although the effect on tree plantation accuracy was modest. The hyperspectral data alone classified six species of tree plantations with 75% to 93% producer's accuracy; adding multitemporal spectral data increased accuracy only for two species with dense canopies. Non-native tree species had higher classification accuracy overall and made up the majority of tree plantations in this landscape. Our results indicate that combining occasionally acquired hyperspectral data with widely available multitemporal satellite imagery enhances mapping and monitoring of reforestation in tropical landscapes.
NASA Technical Reports Server (NTRS)
Spann, G. W.; Faust, N. L.
1974-01-01
It is known from several previous investigations that many categories of land-use can be mapped via computer processing of Earth Resources Technology Satellite data. The results are presented of one such experiment using the USGS/NASA land-use classification system. Douglas County, Georgia, was chosen as the test site for this project. It was chosen primarily because of its recent rapid growth and future growth potential. Results of the investigation indicate an overall land-use mapping accuracy of 67% with higher accuracies in rural areas and lower accuracies in urban areas. It is estimated, however, that 95% of the State of Georgia could be mapped by these techniques with an accuracy of 80% to 90%.
NASA Astrophysics Data System (ADS)
Park, M.; Stenstrom, M. K.
2004-12-01
Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did not much improve the overall accuracy (72%) and k coefficient (65%). Including Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not useful as locational ancillary data.
NASA Astrophysics Data System (ADS)
Müller-Putz, Gernot R.; Scherer, Reinhold; Brauneis, Christian; Pfurtscheller, Gert
2005-12-01
Brain-computer interfaces (BCIs) can be realized on the basis of steady-state evoked potentials (SSEPs). These types of brain signals resulting from repetitive stimulation have the same fundamental frequency as the stimulation but also include higher harmonics. This study investigated how the classification accuracy of a 4-class BCI system can be improved by incorporating visually evoked harmonic oscillations. The current study revealed that the use of three SSVEP harmonics yielded a significantly higher classification accuracy than was the case for one or two harmonics. During feedback experiments, the five subjects investigated reached a classification accuracy between 42.5% and 94.4%.
Müller-Putz, Gernot R; Scherer, Reinhold; Brauneis, Christian; Pfurtscheller, Gert
2005-12-01
Brain-computer interfaces (BCIs) can be realized on the basis of steady-state evoked potentials (SSEPs). These types of brain signals resulting from repetitive stimulation have the same fundamental frequency as the stimulation but also include higher harmonics. This study investigated how the classification accuracy of a 4-class BCI system can be improved by incorporating visually evoked harmonic oscillations. The current study revealed that the use of three SSVEP harmonics yielded a significantly higher classification accuracy than was the case for one or two harmonics. During feedback experiments, the five subjects investigated reached a classification accuracy between 42.5% and 94.4%.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections.
Zhu, Xiangbin; Qiu, Huiling
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved.
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections
2016-01-01
Human activity recognition(HAR) from the temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existed methods is not enough in some applications, especially for healthcare services. In order to improving accuracy, it is necessary to develop a novel method which will take full account of the intrinsic sequential characteristics for time-series sensory data. Moreover, each human activity may has correlated feature relationship at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. The feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activities features from the sensor data compared with locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly less number of features, and the over-all accuracy has been obviously improved. PMID:27893761
NASA Technical Reports Server (NTRS)
Sadowski, F. E.; Sarno, J. E.
1976-01-01
First, an analysis of forest feature signatures was used to help explain the large variation in classification accuracy that can occur among individual forest features for any one case of spatial resolution and the inconsistent changes in classification accuracy that were demonstrated among features as spatial resolution was degraded. Second, the classification rejection threshold was varied in an effort to reduce the large proportion of unclassified resolution elements that previously appeared in the processing of coarse resolution data when a constant rejection threshold was used for all cases of spatial resolution. For the signature analysis, two-channel ellipse plots showing the feature signature distributions for several cases of spatial resolution indicated that the capability of signatures to correctly identify their respective features is dependent on the amount of statistical overlap among signatures. Reductions in signature variance that occur in data of degraded spatial resolution may not necessarily decrease the amount of statistical overlap among signatures having large variance and small mean separations. Features classified by such signatures may thus continue to have similar amounts of misclassified elements in coarser resolution data, and thus, not necessarily improve in classification accuracy.
NASA Astrophysics Data System (ADS)
Fujita, Yusuke; Mitani, Yoshihiro; Hamamoto, Yoshihiko; Segawa, Makoto; Terai, Shuji; Sakaida, Isao
2017-03-01
Ultrasound imaging is a popular and non-invasive tool used in the diagnoses of liver disease. Cirrhosis is a chronic liver disease and it can advance to liver cancer. Early detection and appropriate treatment are crucial to prevent liver cancer. However, ultrasound image analysis is very challenging, because of the low signal-to-noise ratio of ultrasound images. To achieve the higher classification performance, selection of training regions of interest (ROIs) is very important that effect to classification accuracy. The purpose of our study is cirrhosis detection with high accuracy using liver ultrasound images. In our previous works, training ROI selection by MILBoost and multiple-ROI classification based on the product rule had been proposed, to achieve high classification performance. In this article, we propose self-training method to select training ROIs effectively. Evaluation experiments were performed to evaluate effect of self-training, using manually selected ROIs and also automatically selected ROIs. Experimental results show that self-training for manually selected ROIs achieved higher classification performance than other approaches, including our conventional methods. The manually ROI definition and sample selection are important to improve classification accuracy in cirrhosis detection using ultrasound images.
The impact of OCR accuracy on automated cancer classification of pathology reports.
Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle
2012-01-01
To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
NASA Astrophysics Data System (ADS)
Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen
2017-03-01
In this paper, we investigate the effect of the error criteria used during a training phase of the artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected with Interstitial Lung Diseases (ILD). Mean square error (MSE) and the cross-entropy (CE) criteria are chosen being most popular choice in state-of-the-art implementations. The classification experiment performed on the six interstitial lung disease (ILD) patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy from MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filter. Two different neural networks are trained with the scaled conjugate gradient back propagation algorithm with MSE and CE error criteria function respectively for weight updation. Performance is evaluated in terms of average accuracy of these classifiers using 4 fold cross-validation. Each network is trained for five times for each fold with randomly initialized weight vectors and accuracies are computed. Significant improvement in classification accuracy is observed when ANN is trained by using CE (67.27%) as error function compared to MSE (63.60%). Moreover, standard deviation of the classification accuracy for the network trained with CE (6.69) error criteria is found less as compared to network trained with MSE (10.32) criteria.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. ?? 2006 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land Use and Land Cover (LULC) information are significant to observe and evaluate environmental change. LULC classification applying remotely sensed data is a technique popularly employed on a global and local dimension particularly, in urban areas which have diverse land cover types. These are essential components of the urban terrain and ecosystem. In the present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using the high-resolution image. COSMO-SkyMed SAR data was fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for land cover classification using object-based. This paper indicates a comparison between object-based and pixel-based approaches in image fusion. The per-pixel method, support vector machines (SVM) was implemented to the fused image based on Principal Component Analysis (PCA). For the objectbased classification was applied to the fused images to separate land cover classes by using nearest neighbor (NN) classifier. Finally, the accuracy assessment was employed by comparing with the classification of land cover mapping generated from fused image dataset and THAICHOTE image. The object-based data fused COSMO-SkyMed with THAICHOTE images demonstrated the best classification accuracies, well over 85%. As the results, an object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. Conclusions/Significance The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice.
NASA Astrophysics Data System (ADS)
Melville, Bethany; Lucieer, Arko; Aryal, Jagannath
2018-04-01
This paper presents a random forest classification approach for identifying and mapping three types of lowland native grassland communities found in the Tasmanian Midlands region. Due to the high conservation priority assigned to these communities, there has been an increasing need to identify appropriate datasets that can be used to derive accurate and frequently updateable maps of community extent. Therefore, this paper proposes a method employing repeat classification and statistical significance testing as a means of identifying the most appropriate dataset for mapping these communities. Two datasets were acquired and analysed; a Landsat ETM+ scene, and a WorldView-2 scene, both from 2010. Training and validation data were randomly subset using a k-fold (k = 50) approach from a pre-existing field dataset. Poa labillardierei, Themeda triandra and lowland native grassland complex communities were identified in addition to dry woodland and agriculture. For each subset of randomly allocated points, a random forest model was trained based on each dataset, and then used to classify the corresponding imagery. Validation was performed using the reciprocal points from the independent subset that had not been used to train the model. Final training and classification accuracies were reported as per class means for each satellite dataset. Analysis of Variance (ANOVA) was undertaken to determine whether classification accuracy differed between the two datasets, as well as between classifications. Results showed mean class accuracies between 54% and 87%. Class accuracy only differed significantly between datasets for the dry woodland and Themeda grassland classes, with the WorldView-2 dataset showing higher mean classification accuracies. The results of this study indicate that remote sensing is a viable method for the identification of lowland native grassland communities in the Tasmanian Midlands, and that repeat classification and statistical significant testing can be used to identify optimal datasets for vegetation community mapping.
Estimation of different data compositions for early-season crop type classification.
Hao, Pengyu; Wu, Mingquan; Niu, Zheng; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are an important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two studies regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer's accuracies (PAs) and user's accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study.
Estimation of different data compositions for early-season crop type classification
Wu, Mingquan; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are an important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two studies regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer’s accuracies (PAs) and user’s accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study. PMID:29868265
Semi-supervised classification tool for DubaiSat-2 multispectral imagery
NASA Astrophysics Data System (ADS)
Al-Mansoori, Saeed
2015-10-01
This paper addresses a semi-supervised classification tool based on a pixel-based approach of the multi-spectral satellite imagery. There are not many studies demonstrating such algorithm for the multispectral images, especially when the image consists of 4 bands (Red, Green, Blue and Near Infrared) as in DubaiSat-2 satellite images. The proposed approach utilizes both unsupervised and supervised classification schemes sequentially to identify four classes in the image, namely, water bodies, vegetation, land (developed and undeveloped areas) and paved areas (i.e. roads). The unsupervised classification concept is applied to identify two classes; water bodies and vegetation, based on a well-known index that uses the distinct wavelengths of visible and near-infrared sunlight that is absorbed and reflected by the plants to identify the classes; this index parameter is called "Normalized Difference Vegetation Index (NDVI)". Afterward, the supervised classification is performed by selecting training homogenous samples for roads and land areas. Here, a precise selection of training samples plays a vital role in the classification accuracy. Post classification is finally performed to enhance the classification accuracy, where the classified image is sieved, clumped and filtered before producing final output. Overall, the supervised classification approach produced higher accuracy than the unsupervised method. This paper shows some current preliminary research results which point out the effectiveness of the proposed technique in a virtual perspective.
Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory
ERIC Educational Resources Information Center
Lee, Won-Chan
2010-01-01
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children
ERIC Educational Resources Information Center
Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.
2018-01-01
Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
Classification with spatio-temporal interpixel class dependency contexts
NASA Technical Reports Server (NTRS)
Jeon, Byeungwoo; Landgrebe, David A.
1992-01-01
A contextual classifier which can utilize both spatial and temporal interpixel dependency contexts is investigated. After spatial and temporal neighbors are defined, a general form of maximum a posterior spatiotemporal contextual classifier is derived. This contextual classifier is simplified under several assumptions. Joint prior probabilities of the classes of each pixel and its spatial neighbors are modeled by the Gibbs random field. The classification is performed in a recursive manner to allow a computationally efficient contextual classification. Experimental results with bitemporal TM data show significant improvement of classification accuracy over noncontextual pixelwise classifiers. This spatiotemporal contextual classifier should find use in many applications of remote sensing, especially when the classification accuracy is important.
Gastric precancerous diseases classification using CNN with a concise model.
Zhang, Xu; Hu, Weiling; Chen, Fei; Liu, Jiquan; Yang, Yuanhang; Wang, Liangjing; Duan, Huilong; Si, Jianmin
2017-01-01
Gastric precancerous diseases (GPD) may deteriorate into early gastric cancer if misdiagnosed, so it is important to help doctors recognize GPD accurately and quickly. In this paper, we realize the classification of 3-class GPD, namely, polyp, erosion, and ulcer using convolutional neural networks (CNN) with a concise model called the Gastric Precancerous Disease Network (GPDNet). GPDNet introduces fire modules from SqueezeNet to reduce the model size and parameters about 10 times while improving speed for quick classification. To maintain classification accuracy with fewer parameters, we propose an innovative method called iterative reinforced learning (IRL). After training GPDNet from scratch, we apply IRL to fine-tune the parameters whose values are close to 0, and then we take the modified model as a pretrained model for the next training. The result shows that IRL can improve the accuracy about 9% after 6 iterations. The final classification accuracy of our GPDNet was 88.90%, which is promising for clinical GPD recognition.
Convolutional neural network with transfer learning for rice type classification
NASA Astrophysics Data System (ADS)
Patel, Vaibhav Amit; Joshi, Manjunath V.
2018-04-01
Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine which makes it faster with greater accuracy. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, while using transfer learning in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2- class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data boosts classification accuracy significantly.
Frenzel, Jochen; Gessner, Christian; Sandvoss, Torsten; Hammerschmidt, Stefan; Schellenberger, Wolfgang; Sack, Ulrich; Eschrich, Klaus; Wirtz, Hubert
2011-01-01
Background Peptide patterns of bronchoalveolar lavage fluid (BALF) were assumed to reflect the complex pathology of acute lung injury (ALI)/acute respiratory distress syndrome (ARDS) better than clinical and inflammatory parameters and may be superior for outcome prediction. Methodology/Principal Findings A training group of patients suffering from ALI/ARDS was compiled from equal numbers of survivors and nonsurvivors. Clinical history, ventilation parameters, Murray's lung injury severity score (Murray's LISS) and interleukins in BALF were gathered. In addition, samples of bronchoalveolar lavage fluid were analyzed by means of hydrophobic chromatography and MALDI-ToF mass spectrometry (MALDI-ToF MS). Receiver operating characteristic (ROC) analysis for each clinical and cytokine parameter revealed interleukin-6>interleukin-8>diabetes mellitus>Murray's LISS as the best outcome predictors. Outcome predicted on the basis of BALF levels of interleukin-6 resulted in 79.4% accuracy, 82.7% sensitivity and 76.1% specificity (area under the ROC curve, AUC, 0.853). Both clinical parameters and cytokines as well as peptide patterns determined by MALDI-ToF MS were analyzed by classification and regression tree (CART) analysis and support vector machine (SVM) algorithms. CART analysis including Murray's LISS, interleukin-6 and interleukin-8 in combination was correct in 78.0%. MALDI-ToF MS of BALF peptides did not reveal a single identifiable biomarker for ARDS. However, classification of patients was successfully achieved based on the entire peptide pattern analyzed using SVM. This method resulted in 90% accuracy, 93.3% sensitivity and 86.7% specificity following a 10-fold cross validation (AUC = 0.953). Subsequent validation of the optimized SVM algorithm with a test group of patients with unknown prognosis yielded 87.5% accuracy, 83.3% sensitivity and 90.0% specificity. Conclusions/Significance MALDI-ToF MS peptide patterns of BALF, evaluated by appropriate mathematical methods can be of value in predicting outcome in pneumonia induced ALI/ARDS. PMID:21991318
Frenzel, Jochen; Gessner, Christian; Sandvoss, Torsten; Hammerschmidt, Stefan; Schellenberger, Wolfgang; Sack, Ulrich; Eschrich, Klaus; Wirtz, Hubert
2011-01-01
Peptide patterns of bronchoalveolar lavage fluid (BALF) were assumed to reflect the complex pathology of acute lung injury (ALI)/acute respiratory distress syndrome (ARDS) better than clinical and inflammatory parameters and may be superior for outcome prediction. A training group of patients suffering from ALI/ARDS was compiled from equal numbers of survivors and nonsurvivors. Clinical history, ventilation parameters, Murray's lung injury severity score (Murray's LISS) and interleukins in BALF were gathered. In addition, samples of bronchoalveolar lavage fluid were analyzed by means of hydrophobic chromatography and MALDI-ToF mass spectrometry (MALDI-ToF MS). Receiver operating characteristic (ROC) analysis for each clinical and cytokine parameter revealed interleukin-6>interleukin-8>diabetes mellitus>Murray's LISS as the best outcome predictors. Outcome predicted on the basis of BALF levels of interleukin-6 resulted in 79.4% accuracy, 82.7% sensitivity and 76.1% specificity (area under the ROC curve, AUC, 0.853). Both clinical parameters and cytokines as well as peptide patterns determined by MALDI-ToF MS were analyzed by classification and regression tree (CART) analysis and support vector machine (SVM) algorithms. CART analysis including Murray's LISS, interleukin-6 and interleukin-8 in combination was correct in 78.0%. MALDI-ToF MS of BALF peptides did not reveal a single identifiable biomarker for ARDS. However, classification of patients was successfully achieved based on the entire peptide pattern analyzed using SVM. This method resulted in 90% accuracy, 93.3% sensitivity and 86.7% specificity following a 10-fold cross validation (AUC = 0.953). Subsequent validation of the optimized SVM algorithm with a test group of patients with unknown prognosis yielded 87.5% accuracy, 83.3% sensitivity and 90.0% specificity. MALDI-ToF MS peptide patterns of BALF, evaluated by appropriate mathematical methods can be of value in predicting outcome in pneumonia induced ALI/ARDS.
Kim, SungHwan; Lin, Chien-Wei; Tseng, George C
2016-07-01
Supervised machine learning is widely applied to transcriptomic data to predict disease diagnosis, prognosis or survival. Robust and interpretable classifiers with high accuracy are usually favored for their clinical and translational potential. The top scoring pair (TSP) algorithm is an example that applies a simple rank-based algorithm to identify rank-altered gene pairs for classifier construction. Although many classification methods perform well in cross-validation of single expression profile, the performance usually greatly reduces in cross-study validation (i.e. the prediction model is established in the training study and applied to an independent test study) for all machine learning methods, including TSP. The failure of cross-study validation has largely diminished the potential translational and clinical values of the models. The purpose of this article is to develop a meta-analytic top scoring pair (MetaKTSP) framework that combines multiple transcriptomic studies and generates a robust prediction model applicable to independent test studies. We proposed two frameworks, by averaging TSP scores or by combining P-values from individual studies, to select the top gene pairs for model construction. We applied the proposed methods in simulated data sets and three large-scale real applications in breast cancer, idiopathic pulmonary fibrosis and pan-cancer methylation. The result showed superior performance of cross-study validation accuracy and biomarker selection for the new meta-analytic framework. In conclusion, combining multiple omics data sets in the public domain increases robustness and accuracy of the classification model that will ultimately improve disease understanding and clinical treatment decisions to benefit patients. An R package MetaKTSP is available online. (http://tsenglab.biostat.pitt.edu/software.htm). ctseng@pitt.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Predicting geomorphic stability in low-order streams of the western Lake Superior basin
Width:depth ratios, entrenchment ratios, gradients, and median substrate particle sizes (D50s) were measured in 32 second and third order stream reaches in the western Lake Superior basin, and stream reaches were assigned a Rosgen geomorphic classification. Over 700 measurements ...
Monteiro-Soares, M; Martins-Mendes, D; Vaz-Carneiro, A; Sampaio, S; Dinis-Ribeiro, M
2014-10-01
We systematically review the available systems used to classify diabetic foot ulcers in order to synthesize their methodological qualitative issues and accuracy to predict lower extremity amputation, as this may represent a critical point in these patients' care. Two investigators searched, in EBSCO, ISI, PubMed and SCOPUS databases, and independently selected studies published until May 2013 and reporting prognostic accuracy and/or reliability of specific systems for patients with diabetic foot ulcer in order to predict lower extremity amputation. We included 25 studies reporting a prevalence of lower extremity amputation between 6% and 78%. Eight different diabetic foot ulcer descriptions and seven prognostic stratification classification systems were addressed with a variable (1-9) number of factors included, specially peripheral arterial disease (n = 12) or infection at the ulcer site (n = 10) or ulcer depth (n = 10). The Meggitt-Wagner, S(AD)SAD and Texas University Classification systems were the most extensively validated, whereas ten classifications were derived or validated only once. Reliability was reported in a single study, and accuracy measures were reported in five studies with another eight allowing their calculation. Pooled accuracy ranged from 0.65 (for gangrene) to 0.74 (for infection). There are numerous classification systems for diabetic foot ulcer outcome prediction, but only few studies evaluated their reliability or external validity. Studies rarely validated several systems simultaneously and only a few reported accuracy measures. Further studies assessing reliability and accuracy of the available systems and their composing variables are needed. Copyright © 2014 John Wiley & Sons, Ltd.
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K- Nearest Neighbor (KNN) is a good classifier, but from several studies, the result performance accuracy of KNN still lower than other methods. One of the causes of the low accuracy produced, because each attribute has the same effect on the classification process, while some less relevant characteristics lead to miss-classification of the class assignment for new data. In this research, we proposed Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio as a parameter to see the correlation between each attribute in the data and the Gain Ratio also will be used as the basis for weighting each attribute of the dataset. The accuracy of results is compared to the accuracy acquired from the original KNN method using 10-fold Cross-Validation with several datasets from the UCI Machine Learning repository and KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the result of the test, the proposed method was able to increase the classification accuracy of KNN, where the highest difference of accuracy obtained hayes-roth dataset is worth 12.73%, and the lowest difference of accuracy obtained in the abalone dataset of 0.07%. The average result of the accuracy of all dataset increases the accuracy by 5.33%.
NASA Technical Reports Server (NTRS)
Myint, Soe W.; Mesev, Victor; Quattrochi, Dale; Wentz, Elizabeth A.
2013-01-01
Remote sensing methods used to generate base maps to analyze the urban environment rely predominantly on digital sensor data from space-borne platforms. This is due in part from new sources of high spatial resolution data covering the globe, a variety of multispectral and multitemporal sources, sophisticated statistical and geospatial methods, and compatibility with GIS data sources and methods. The goal of this chapter is to review the four groups of classification methods for digital sensor data from space-borne platforms; per-pixel, sub-pixel, object-based (spatial-based), and geospatial methods. Per-pixel methods are widely used methods that classify pixels into distinct categories based solely on the spectral and ancillary information within that pixel. They are used for simple calculations of environmental indices (e.g., NDVI) to sophisticated expert systems to assign urban land covers. Researchers recognize however, that even with the smallest pixel size the spectral information within a pixel is really a combination of multiple urban surfaces. Sub-pixel classification methods therefore aim to statistically quantify the mixture of surfaces to improve overall classification accuracy. While within pixel variations exist, there is also significant evidence that groups of nearby pixels have similar spectral information and therefore belong to the same classification category. Object-oriented methods have emerged that group pixels prior to classification based on spectral similarity and spatial proximity. Classification accuracy using object-based methods show significant success and promise for numerous urban 3 applications. Like the object-oriented methods that recognize the importance of spatial proximity, geospatial methods for urban mapping also utilize neighboring pixels in the classification process. The primary difference though is that geostatistical methods (e.g., spatial autocorrelation methods) are utilized during both the pre- and post-classification steps. Within this chapter, each of the four approaches is described in terms of scale and accuracy classifying urban land use and urban land cover; and for its range of urban applications. We demonstrate the overview of four main classification groups in Figure 1 while Table 1 details the approaches with respect to classification requirements and procedures (e.g., reflectance conversion, steps before training sample selection, training samples, spatial approaches commonly used, classifiers, primary inputs for classification, output structures, number of output layers, and accuracy assessment). The chapter concludes with a brief summary of the methods reviewed and the challenges that remain in developing new classification methods for improving the efficiency and accuracy of mapping urban areas.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we achieve a few results about the prediction accuracies of ensemble methods for binary classification that are missed or misinterpreted in previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper and lower bounds accuracies by random individual classifiers and better algorithms need to be developed. PMID:21853162
Landenburger, L.; Lawrence, R.L.; Podruzny, S.; Schwartz, C.C.
2008-01-01
Moderate resolution satellite imagery traditionally has been thought to be inadequate for mapping vegetation at the species level. This has made comprehensive mapping of regional distributions of sensitive species, such as whitebark pine, either impractical or extremely time consuming. We sought to determine whether using a combination of moderate resolution satellite imagery (Landsat Enhanced Thematic Mapper Plus), extensive stand data collected by land management agencies for other purposes, and modern statistical classification techniques (boosted classification trees) could result in successful mapping of whitebark pine. Overall classification accuracies exceeded 90%, with similar individual class accuracies. Accuracies on a localized basis varied based on elevation. Accuracies also varied among administrative units, although we were not able to determine whether these differences related to inherent spatial variations or differences in the quality of available reference data.
Integrative genetic risk prediction using non-parametric empirical Bayes classification.
Zhao, Sihai Dave
2017-06-01
Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This article proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free non-parametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa. © 2016, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Li, S. X.; Zhang, Y. J.; Zeng, Q. Y.; Li, L. F.; Guo, Z. Y.; Liu, Z. M.; Xiong, H. L.; Liu, S. H.
2014-06-01
Cancer is the most common disease to threaten human health. The ability to screen individuals with malignant tumours with only a blood sample would be greatly advantageous to early diagnosis and intervention. This study explores the possibility of discriminating between cancer patients and normal subjects with serum surface-enhanced Raman spectroscopy (SERS) and a support vector machine (SVM) through a peripheral blood sample. A total of 130 blood samples were obtained from patients with liver cancer, colonic cancer, esophageal cancer, nasopharyngeal cancer, gastric cancer, as well as 113 blood samples from normal volunteers. Several diagnostic models were built with the serum SERS spectra using SVM and principal component analysis (PCA) techniques. The results show that a diagnostic accuracy of 85.5% is acquired with a PCA algorithm, while a diagnostic accuracy of 95.8% is obtained using radial basis function (RBF), PCA-SVM methods. The results prove that a RBF kernel PCA-SVM technique is superior to PCA and conventional SVM (C-SVM) algorithms in classification serum SERS spectra. The study demonstrates that serum SERS, in combination with SVM techniques, has great potential for screening cancerous patients with any solid malignant tumour through a peripheral blood sample.
NASA Technical Reports Server (NTRS)
Wrigley, R. C.; Acevedo, W.; Alexander, D.; Buis, J.; Card, D.
1984-01-01
An experiment of a factorial design was conducted to test the effects on classification accuracy of land cover types due to the improved spatial, spectral and radiometric characteristics of the Thematic Mapper (TM) in comparison to the Multispectral Scanner (MSS). High altitude aircraft scanner data from the Airborne Thematic Mapper instrument was acquired over central California in August, 1983 and used to simulate Thematic Mapper data as well as all combinations of the three characteristics for eight data sets in all. Results for the training sites (field center pixels) showed better classification accuracies for MSS spatial resolution, TM spectral bands and TM radiometry in order of importance.
Cannon, Edward O; Amini, Ata; Bender, Andreas; Sternberg, Michael J E; Muggleton, Stephen H; Glen, Robert C; Mitchell, John B O
2007-05-01
We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures. The performance of the methods is evaluated via a number of statistical measures, namely recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is for seven out of the 11 classes the superior method. The results show that the Bayes Classifier gives the best recall performance for eight of the 11 targets, but has a much lower precision, specificity and F-measure. The SVILP model on the other hand has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of the SVILP superiority, we employ McNemar's test which shows that SVILP performs significantly (p < 5%) better than both other methods for six out of 11 activity classes, while being superior with less significance for three of the remaining classes. While previously the Bayes Classifier was shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.
High Ability and Learner Characteristics
ERIC Educational Resources Information Center
Hindal, Huda; Reid, Norman; Whitehead, Rex
2013-01-01
The outstandingly able learner has been conceptualised, in terms of test and examination performance, as the learner showing superior academic performance which is markedly better than that of peers and in ways regarded as of value by wider society. In Kuwait, such superior examination performance leads to a classification regarded as being…
Main and interactive effects of watershed storage and forest fragmentation on watershed exports, habitat quality, community composition and food-web relationships were compared within and acoss two hydrogeomorphic regions (HGM, North Shore Highlands and Lake Superior clay plains/...
Gutschalk, Alexander; Uppenkamp, Stefan; Riedel, Bernhard; Bartsch, Andreas; Brandt, Tobias; Vogt-Schaden, Marlies
2015-12-01
Based on results from functional imaging, cortex along the superior temporal sulcus (STS) has been suggested to subserve phoneme and pre-lexical speech perception. For vowel classification, both superior temporal plane (STP) and STS areas have been suggested relevant. Lesion of bilateral STS may conversely be expected to cause pure word deafness and possibly also impaired vowel classification. Here we studied a patient with bilateral STS lesions caused by ischemic strokes and relatively intact medial STPs to characterize the behavioral consequences of STS loss. The patient showed severe deficits in auditory speech perception, whereas his speech production was fluent and communication by written speech was grossly intact. Auditory-evoked fields in the STP were within normal limits on both sides, suggesting that major parts of the auditory cortex were functionally intact. Further studies showed that the patient had normal hearing thresholds and only mild disability in tests for telencephalic hearing disorder. Prominent deficits were discovered in an auditory-object classification task, where the patient performed four standard deviations below the control group. In marked contrast, performance in a vowel-classification task was intact. Auditory evoked fields showed enhanced responses for vowels compared to matched non-vowels within normal limits. Our results are consistent with the notion that cortex along STS is important for auditory speech perception, although it does not appear to be entirely speech specific. Formant analysis and single vowel classification, however, appear to be already implemented in auditory cortex on the STP. Copyright © 2015 Elsevier Ltd. All rights reserved.
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
NASA Technical Reports Server (NTRS)
Stoner, E. R.; May, G. A.; Kalcic, M. T. (Principal Investigator)
1981-01-01
Sample segments of ground-verified land cover data collected in conjunction with the USDA/ESS June Enumerative Survey were merged with LANDSAT data and served as a focus for unsupervised spectral class development and accuracy assessment. Multitemporal data sets were created from single-date LANDSAT MSS acquisitions from a nominal scene covering an eleven-county area in north central Missouri. Classification accuracies for the four land cover types predominant in the test site showed significant improvement in going from unitemporal to multitemporal data sets. Transformed LANDSAT data sets did not significantly improve classification accuracies. Regression estimators yielded mixed results for different land covers. Misregistration of two LANDSAT data sets by as much and one half pixels did not significantly alter overall classification accuracies. Existing algorithms for scene-to scene overlay proved adequate for multitemporal data analysis as long as statistical class development and accuracy assessment were restricted to field interior pixels.
Examining the Classification Accuracy of a Vocabulary Screening Measure with Preschool Children
ERIC Educational Resources Information Center
Marcotte, Amanda M.; Clemens, Nathan H.; Parker, Christopher; Whitcomb, Sara A.
2016-01-01
This study investigated the classification accuracy of the "Dynamic Indicators of Vocabulary Skills" (DIVS) as a preschool vocabulary screening measure. With a sample of 240 preschoolers, fall and winter DIVS scores were used to predict year-end vocabulary risk using the 25th percentile on the "Peabody Picture Vocabulary Test--Third…
ERIC Educational Resources Information Center
Daniels, Brian; Volpe, Robert J.; Fabiano, Gregory A.; Briesch, Amy M.
2017-01-01
This study examines the classification accuracy and teacher acceptability of a problem-focused screener for academic and disruptive behavior problems, which is directly linked to evidence-based intervention. Participants included 39 classroom teachers from 2 public school districts in the Northeastern United States. Teacher ratings were obtained…
ERIC Educational Resources Information Center
Zhang, Bo
2010-01-01
This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…
ERIC Educational Resources Information Center
Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.
2016-01-01
The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…
ERIC Educational Resources Information Center
Pena, Elizabeth D.; Gillam, Ronald B.; Malek, Melynn; Ruiz-Felter, Roxanna; Resendiz, Maria; Fiestas, Christine; Sabel, Tracy
2006-01-01
Two experiments examined reliability and classification accuracy of a narration-based dynamic assessment task. Purpose: The first experiment evaluated whether parallel results were obtained from stories created in response to 2 different wordless picture books. If so, the tasks and measures would be appropriate for assessing pretest and posttest…
The Potential Impact of Not Being Able to Create Parallel Tests on Expected Classification Accuracy
ERIC Educational Resources Information Center
Wyse, Adam E.
2011-01-01
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Emotion recognition from multichannel EEG signals using K-nearest neighbor classification.
Li, Mi; Xu, Hongpei; Liu, Xingwang; Lu, Shengfu
2018-04-27
Many studies have been done on the emotion recognition based on multi-channel electroencephalogram (EEG) signals. This paper explores the influence of the emotion recognition accuracy of EEG signals in different frequency bands and different number of channels. We classified the emotional states in the valence and arousal dimensions using different combinations of EEG channels. Firstly, DEAP default preprocessed data were normalized. Next, EEG signals were divided into four frequency bands using discrete wavelet transform, and entropy and energy were calculated as features of K-nearest neighbor Classifier. The classification accuracies of the 10, 14, 18 and 32 EEG channels based on the Gamma frequency band were 89.54%, 92.28%, 93.72% and 95.70% in the valence dimension and 89.81%, 92.24%, 93.69% and 95.69% in the arousal dimension. As the number of channels increases, the classification accuracy of emotional states also increases, the classification accuracy of the gamma frequency band is greater than that of the beta frequency band followed by the alpha and theta frequency bands. This paper provided better frequency bands and channels reference for emotion recognition based on EEG.
Zourmand, Alireza; Ting, Hua-Nong; Mirhassani, Seyed Mostafa
2013-03-01
Speech is one of the prevalent communication mediums for humans. Identifying the gender of a child speaker based on his/her speech is crucial in telecommunication and speech therapy. This article investigates the use of fundamental and formant frequencies from sustained vowel phonation to distinguish the gender of Malay children aged between 7 and 12 years. The Euclidean minimum distance and multilayer perceptron were used to classify the gender of 360 Malay children based on different combinations of fundamental and formant frequencies (F0, F1, F2, and F3). The Euclidean minimum distance with normalized frequency data achieved a classification accuracy of 79.44%, which was higher than that of the nonnormalized frequency data. Age-dependent modeling was used to improve the accuracy of gender classification. The Euclidean distance method obtained 84.17% based on the optimal classification accuracy for all age groups. The accuracy was further increased to 99.81% using multilayer perceptron based on mel-frequency cepstral coefficients. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Du, Peijun; Tan, Kun; Xing, Xiaoshi
2010-12-01
Combining Support Vector Machine (SVM) with wavelet analysis, we constructed wavelet SVM (WSVM) classifier based on wavelet kernel functions in Reproducing Kernel Hilbert Space (RKHS). In conventional kernel theory, SVM is faced with the bottleneck of kernel parameter selection which further results in time-consuming and low classification accuracy. The wavelet kernel in RKHS is a kind of multidimensional wavelet function that can approximate arbitrary nonlinear functions. Implications on semiparametric estimation are proposed in this paper. Airborne Operational Modular Imaging Spectrometer II (OMIS II) hyperspectral remote sensing image with 64 bands and Reflective Optics System Imaging Spectrometer (ROSIS) data with 115 bands were used to experiment the performance and accuracy of the proposed WSVM classifier. The experimental results indicate that the WSVM classifier can obtain the highest accuracy when using the Coiflet Kernel function in wavelet transform. In contrast with some traditional classifiers, including Spectral Angle Mapping (SAM) and Minimum Distance Classification (MDC), and SVM classifier using Radial Basis Function kernel, the proposed wavelet SVM classifier using the wavelet kernel function in Reproducing Kernel Hilbert Space is capable of improving classification accuracy obviously.
NASA Astrophysics Data System (ADS)
Hwang, Han-Jeong; Lim, Jeong-Hwan; Kim, Do-Won; Im, Chang-Hwan
2014-07-01
A number of recent studies have demonstrated that near-infrared spectroscopy (NIRS) is a promising neuroimaging modality for brain-computer interfaces (BCIs). So far, most NIRS-based BCI studies have focused on enhancing the accuracy of the classification of different mental tasks. In the present study, we evaluated the performances of a variety of mental task combinations in order to determine the mental task pairs that are best suited for customized NIRS-based BCIs. To this end, we recorded event-related hemodynamic responses while seven participants performed eight different mental tasks. Classification accuracies were then estimated for all possible pairs of the eight mental tasks (C=28). Based on this analysis, mental task combinations with relatively high classification accuracies frequently included the following three mental tasks: "mental multiplication," "mental rotation," and "right-hand motor imagery." Specifically, mental task combinations consisting of two of these three mental tasks showed the highest mean classification accuracies. It is expected that our results will be a useful reference to reduce the time needed for preliminary tests when discovering individual-specific mental task combinations.
Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian; Huliraj, N; Revadi, S S
2017-06-08
Auscultation is a medical procedure used for the initial diagnosis and assessment of lung and heart diseases. From this perspective, we propose assessing the performance of the extreme learning machine (ELM) classifiers for the diagnosis of pulmonary pathology using breath sounds. Energy and entropy features were extracted from the breath sound using the wavelet packet transform. The statistical significance of the extracted features was evaluated by one-way analysis of variance (ANOVA). The extracted features were inputted into the ELM classifier. The maximum classification accuracies obtained for the conventional validation (CV) of the energy and entropy features were 97.36% and 98.37%, respectively, whereas the accuracies obtained for the cross validation (CRV) of the energy and entropy features were 96.80% and 97.91%, respectively. In addition, maximum classification accuracies of 98.25% and 99.25% were obtained for the CV and CRV of the ensemble features, respectively. The results indicate that the classification accuracy obtained with the ensemble features was higher than those obtained with the energy and entropy features.
Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Chunyuan; Stevens, Andrew J.; Chen, Changyou
2016-08-10
Learning the representation of shape cues in 2D & 3D objects for recognition is a fundamental task in computer vision. Deep neural networks (DNNs) have shown promising performance on this task. Due to the large variability of shapes, accurate recognition relies on good estimates of model uncertainty, ignored in traditional training of DNNs, typically learned via stochastic optimization. This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (SG-MCMC) to learn weight uncertainty in DNNs. It yields principled Bayesian interpretations for the commonly used Dropout/DropConnect techniques and incorporates them into the SG-MCMC framework. Extensive experiments on 2D &more » 3D shape datasets and various DNN models demonstrate the superiority of the proposed approach over stochastic optimization. Our approach yields higher recognition accuracy when used in conjunction with Dropout and Batch-Normalization.« less
Taniguchi, Hidetaka; Sato, Hiroshi; Shirakawa, Tomohiro
2018-05-09
Human learners can generalize a new concept from a small number of samples. In contrast, conventional machine learning methods require large amounts of data to address the same types of problems. Humans have cognitive biases that promote fast learning. Here, we developed a method to reduce the gap between human beings and machines in this type of inference by utilizing cognitive biases. We implemented a human cognitive model into machine learning algorithms and compared their performance with the currently most popular methods, naïve Bayes, support vector machine, neural networks, logistic regression and random forests. We focused on the task of spam classification, which has been studied for a long time in the field of machine learning and often requires a large amount of data to obtain high accuracy. Our models achieved superior performance with small and biased samples in comparison with other representative machine learning methods.
Prediction of Patient-Controlled Analgesic Consumption: A Multimodel Regression Tree Approach.
Hu, Yuh-Jyh; Ku, Tien-Hsiung; Yang, Yu-Hung; Shen, Jia-Ying
2018-01-01
Several factors contribute to individual variability in postoperative pain, therefore, individuals consume postoperative analgesics at different rates. Although many statistical studies have analyzed postoperative pain and analgesic consumption, most have identified only the correlation and have not subjected the statistical model to further tests in order to evaluate its predictive accuracy. In this study involving 3052 patients, a multistrategy computational approach was developed for analgesic consumption prediction. This approach uses data on patient-controlled analgesia demand behavior over time and combines clustering, classification, and regression to mitigate the limitations of current statistical models. Cross-validation results indicated that the proposed approach significantly outperforms various existing regression methods. Moreover, a comparison between the predictions by anesthesiologists and medical specialists and those of the computational approach for an independent test data set of 60 patients further evidenced the superiority of the computational approach in predicting analgesic consumption because it produced markedly lower root mean squared errors.
Impacts of land use/cover classification accuracy on regional climate simulations
NASA Astrophysics Data System (ADS)
Ge, Jianjun; Qi, Jiaguo; Lofgren, Brent M.; Moore, Nathan; Torbick, Nathan; Olson, Jennifer M.
2007-03-01
Land use/cover change has been recognized as a key component in global change. Various land cover data sets, including historically reconstructed, recently observed, and future projected, have been used in numerous climate modeling studies at regional to global scales. However, little attention has been paid to the effect of land cover classification accuracy on climate simulations, though accuracy assessment has become a routine procedure in land cover production community. In this study, we analyzed the behavior of simulated precipitation in the Regional Atmospheric Modeling System (RAMS) over a range of simulated classification accuracies over a 3 month period. This study found that land cover accuracy under 80% had a strong effect on precipitation especially when the land surface had a greater control of the atmosphere. This effect became stronger as the accuracy decreased. As shown in three follow-on experiments, the effect was further influenced by model parameterizations such as convection schemes and interior nudging, which can mitigate the strength of surface boundary forcings. In reality, land cover accuracy rarely obtains the commonly recommended 85% target. Its effect on climate simulations should therefore be considered, especially when historically reconstructed and future projected land covers are employed.
Multi-parameter machine learning approach to the neuroanatomical basis of developmental dyslexia.
Płoński, Piotr; Gradkowski, Wojciech; Altarelli, Irene; Monzalvo, Karla; van Ermingen-Marbach, Muna; Grande, Marion; Heim, Stefan; Marchewka, Artur; Bogorodzki, Piotr; Ramus, Franck; Jednoróg, Katarzyna
2017-02-01
Despite decades of research, the anatomical abnormalities associated with developmental dyslexia are still not fully described. Studies have focused on between-group comparisons in which different neuroanatomical measures were generally explored in isolation, disregarding potential interactions between regions and measures. Here, for the first time a multivariate classification approach was used to investigate grey matter disruptions in children with dyslexia in a large (N = 236) multisite sample. A variety of cortical morphological features, including volumetric (volume, thickness and area) and geometric (folding index and mean curvature) measures were taken into account and generalizability of classification was assessed with both 10-fold and leave-one-out cross validation (LOOCV) techniques. Classification into control vs. dyslexic subjects achieved above chance accuracy (AUC = 0.66 and ACC = 0.65 in the case of 10-fold CV, and AUC = 0.65 and ACC = 0.64 using LOOCV) after principled feature selection. Features that discriminated between dyslexic and control children were exclusively situated in the left hemisphere including superior and middle temporal gyri, subparietal sulcus and prefrontal areas. They were related to geometric properties of the cortex, with generally higher mean curvature and a greater folding index characterizing the dyslexic group. Our results support the hypothesis that an atypical curvature pattern with extra folds in left hemispheric perisylvian regions characterizes dyslexia. Hum Brain Mapp 38:900-908, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Liu, Wanjun; Liang, Xuejian; Qu, Haicheng
2017-11-01
Hyperspectral image (HSI) classification is one of the most popular topics in remote sensing community. Traditional and deep learning-based classification methods were proposed constantly in recent years. In order to improve the classification accuracy and robustness, a dimensionality-varied convolutional neural network (DVCNN) was proposed in this paper. DVCNN was a novel deep architecture based on convolutional neural network (CNN). The input of DVCNN was a set of 3D patches selected from HSI which contained spectral-spatial joint information. In the following feature extraction process, each patch was transformed into some different 1D vectors by 3D convolution kernels, which were able to extract features from spectral-spatial data. The rest of DVCNN was about the same as general CNN and processed 2D matrix which was constituted by by all 1D data. So that the DVCNN could not only extract more accurate and rich features than CNN, but also fused spectral-spatial information to improve classification accuracy. Moreover, the robustness of network on water-absorption bands was enhanced in the process of spectral-spatial fusion by 3D convolution, and the calculation was simplified by dimensionality varied convolution. Experiments were performed on both Indian Pines and Pavia University scene datasets, and the results showed that the classification accuracy of DVCNN improved by 32.87% on Indian Pines and 19.63% on Pavia University scene than spectral-only CNN. The maximum accuracy improvement of DVCNN achievement was 13.72% compared with other state-of-the-art HSI classification methods, and the robustness of DVCNN on water-absorption bands noise was demonstrated.
Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification
Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander
2013-01-01
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays (”biological” similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
Classification of ECG beats using deep belief network and active learning.
G, Sayantan; T, Kien P; V, Kadambari K
2018-04-12
A new semi-supervised approach based on deep learning and active learning for classification of electrocardiogram signals (ECG) is proposed. The objective of the proposed work is to model a scientific method for classification of cardiac irregularities using electrocardiogram beats. The model follows the Association for the Advancement of medical instrumentation (AAMI) standards and consists of three phases. In phase I, feature representation of ECG is learnt using Gaussian-Bernoulli deep belief network followed by a linear support vector machine (SVM) training in the consecutive phase. It yields three deep models which are based on AAMI-defined classes, namely N, V, S, and F. In the last phase, a query generator is introduced to interact with the expert to label few beats to improve accuracy and sensitivity. The proposed approach depicts significant improvement in accuracy with minimal queries posed to the expert and fast online training as tested on the MIT-BIH Arrhythmia Database and the MIT-BIH Supra-ventricular Arrhythmia Database (SVDB). With 100 queries labeled by the expert in phase III, the method achieves an accuracy of 99.5% in "S" versus all classifications (SVEB) and 99.4% accuracy in "V " versus all classifications (VEB) on MIT-BIH Arrhythmia Database. In a similar manner, it is attributed that an accuracy of 97.5% for SVEB and 98.6% for VEB on SVDB database is achieved respectively. Graphical Abstract Reply- Deep belief network augmented by active learning for efficient prediction of arrhythmia.
NASA Astrophysics Data System (ADS)
Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.
2018-04-01
The purpose of carrying out national geographic conditions monitoring is to obtain information of surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprise and public. Land cover data contains detailed geographic conditions information, thus has been listed as one of the important achievements in the national geographic conditions monitoring project. At present, the main issue of the production of the land cover data is about how to improve the classification accuracy. For the land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection is mainly based on human-computer interaction or manual inspection in the project, which are time consuming and laborious. By harnessing the automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out the classification accuracy inspection test of land cover data in the project, and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update error, and effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of the land cover data.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network’s initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data. PMID:27304987
Hierarchical vs non-hierarchical audio indexation and classification for video genres
NASA Astrophysics Data System (ADS)
Dammak, Nouha; BenAyed, Yassine
2018-04-01
In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based on only audio features extracted at block level, which has a prominent asset by capturing local temporal information. The main contribution of our study is to show the wide effect on the classification accuracies while using an hierarchical categorization structure based on Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. In fact, the classification consists in three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video span yielding a classification accuracy of over 99%.
Acosta-Mesa, Héctor-Gabriel; Rechy-Ramírez, Fernando; Mezura-Montes, Efrén; Cruz-Ramírez, Nicandro; Hernández Jiménez, Rodolfo
2014-06-01
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals in which the length and amplitude of the time series should be compressed, preserving the important information for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function which considers three criteria: the entropy regarding the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated using a time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known times series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
Cognitive-motivational deficits in ADHD: development of a classification system.
Gupta, Rashmi; Kar, Bhoomika R; Srinivasan, Narayanan
2011-01-01
The classification systems developed so far to detect attention deficit/hyperactivity disorder (ADHD) do not have high sensitivity and specificity. We have developed a classification system based on several neuropsychological tests that measure cognitive-motivational functions that are specifically impaired in ADHD children. A total of 240 (120 ADHD children and 120 healthy controls) children in the age range of 6-9 years and 32 Oppositional Defiant Disorder (ODD) children (aged 9 years) participated in the study. Stop-Signal, Task-Switching, Attentional Network, and Choice Delay tests were administered to all the participants. Receiver operating characteristic (ROC) analysis indicated that percentage choice of long-delay reward best classified the ADHD children from healthy controls. Single parameters were not helpful in making a differential classification of ADHD with ODD. Multinominal logistic regression (MLR) was performed with multiple parameters (data fusion) that produced improved overall classification accuracy. A combination of stop-signal reaction time, posterror-slowing, mean delay, switch cost, and percentage choice of long-delay reward produced an overall classification accuracy of 97.8%; with internal validation, the overall accuracy was 92.2%. Combining parameters from different tests of control functions not only enabled us to accurately classify ADHD children from healthy controls but also in making a differential classification with ODD. These results have implications for the theories of ADHD.
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is noticeably practiced in Mediterranean regions and imposes substantial complications in several satellite image classification methods. To some extent, high spatial resolution data were able to overcome such complications. For better classification performances in Land Use Land Cover (LULC) mapping, the current research adopts different classification methods comparison for LULC mapping using Sentinel-2 satellite as a source of high spatial resolution. Both of pixel-based and an object-based classification algorithms were assessed; the pixel-based approach employs Maximum Likelihood (ML), Artificial Neural Network (ANN) algorithms, Support Vector Machine (SVM), and, the object-based classification uses the Nearest Neighbour (NN) classifier. Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall by distinguishing most of the classes with the highest accuracy. NN succeeded to deal with artificial surface classes in general while agriculture area classes, and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.
AVNM: A Voting based Novel Mathematical Rule for Image Classification.
Vidyarthi, Ankit; Mittal, Namita
2016-12-01
In machine learning, the accuracy of the system depends upon classification result. Classification accuracy plays an imperative role in various domains. Non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Besides its easiness, simplicity and effectiveness characteristics, the main problem associated with KNN classifier is the selection of a number of nearest neighbors i.e. "k" for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by the prescribed problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for unidentified sample. AVNM is free from any initial selection of predefined variable and neighbor selection as found in KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from UCI database and one manually created dataset. The experimental result shows that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimentation results based on confusion matrix accuracy parameter proves higher accuracy value with AVNM rule. The proposed AVNM rule is based on sample space reduction mechanism for identification of an optimal number of nearest neighbor selections. AVNM results in better classification accuracy and minimum error rate as compared with the state-of-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbor selection and improves classification rate for UCI dataset and manually created dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Use of collateral information to improve LANDSAT classification accuracies
NASA Technical Reports Server (NTRS)
Strahler, A. H. (Principal Investigator)
1981-01-01
Methods to improve LANDSAT classification accuracies were investigated including: (1) the use of prior probabilities in maximum likelihood classification as a methodology to integrate discrete collateral data with continuously measured image density variables; (2) the use of the logit classifier as an alternative to multivariate normal classification that permits mixing both continuous and categorical variables in a single model and fits empirical distributions of observations more closely than the multivariate normal density function; and (3) the use of collateral data in a geographic information system as exercised to model a desired output information layer as a function of input layers of raster format collateral and image data base layers.
NASA Astrophysics Data System (ADS)
Szuflitowska, B.; Orlowski, P.
2017-08-01
Automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathology activity. The EEG sequences were analyzed using Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. The classification error below 10% has been considered a success. The higher accuracy are obtained for new data of unknown classes than testing data. The methodology can be helpful in differentiation epilepsy seizure and disturbances in the EEG signal in Alzheimer's Disease.
NASA Astrophysics Data System (ADS)
Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan
2015-12-01
In this paper, multi-kernel learning(MKL) is used for drug-related webpages classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, text based BOW model is used to generate text representation, and image-based BOW model is used to generate images representation. Last, text and images representation are fused with a few methods. Experimental results demonstrate that the classification accuracy of MKL is higher than those of all other fusion methods in decision level and feature level, and much higher than the accuracy of single-modal classification.
Landcover classification in MRF context using Dempster-Shafer fusion for multisensor imagery.
Sarkar, Anjan; Banerjee, Anjan; Banerjee, Nilanjan; Brahma, Siddhartha; Kartikeyan, B; Chakraborty, Manab; Majumder, K L
2005-05-01
This work deals with multisensor data fusion to obtain landcover classification. The role of feature-level fusion using the Dempster-Shafer rule and that of data-level fusion in the MRF context is studied in this paper to obtain an optimally segmented image. Subsequently, segments are validated and classification accuracy for the test data is evaluated. Two examples of data fusion of optical images and a synthetic aperture radar image are presented, each set having been acquired on different dates. Classification accuracies of the technique proposed are compared with those of some recent techniques in literature for the same image data.
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. ?? 2007 by the Ecological Society of America.
Application of Sensor Fusion to Improve Uav Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects including the ones that are based on using UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality which results in increasing the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on-board in the UAV missions and performing image fusion can help achieving higher quality images and accordingly higher accuracy classification results.
Convolutional Neural Network for Histopathological Analysis of Osteosarcoma.
Mishra, Rashika; Daescu, Ovidiu; Leavey, Patrick; Rakheja, Dinesh; Sengupta, Anita
2018-03-01
Pathologists often deal with high complexity and sometimes disagreement over osteosarcoma tumor classification due to cellular heterogeneity in the dataset. Segmentation and classification of histology tissue in H&E stained tumor image datasets is a challenging task because of intra-class variations, inter-class similarity, crowded context, and noisy data. In recent years, deep learning approaches have led to encouraging results in breast cancer and prostate cancer analysis. In this article, we propose convolutional neural network (CNN) as a tool to improve efficiency and accuracy of osteosarcoma tumor classification into tumor classes (viable tumor, necrosis) versus nontumor. The proposed CNN architecture contains eight learned layers: three sets of stacked two convolutional layers interspersed with max pooling layers for feature extraction and two fully connected layers with data augmentation strategies to boost performance. The use of a neural network results in higher accuracy of average 92% for the classification. We compare the proposed architecture with three existing and proven CNN architectures for image classification: AlexNet, LeNet, and VGGNet. We also provide a pipeline to calculate percentage necrosis in a given whole slide image. We conclude that the use of neural networks can assure both high accuracy and efficiency in osteosarcoma classification.
Retinal vasculature classification using novel multifractal features
NASA Astrophysics Data System (ADS)
Ding, Y.; Ward, W. O. C.; Duan, Jinming; Auer, D. P.; Gowland, Penny; Bai, L.
2015-11-01
Retinal blood vessels have been implicated in a large number of diseases including diabetic retinopathy and cardiovascular diseases, which cause damages to retinal blood vessels. The availability of retinal vessel imaging provides an excellent opportunity for monitoring and diagnosis of retinal diseases, and automatic analysis of retinal vessels will help with the processes. However, state of the art vascular analysis methods such as counting the number of branches or measuring the curvature and diameter of individual vessels are unsuitable for the microvasculature. There has been published research using fractal analysis to calculate fractal dimensions of retinal blood vessels, but so far there has been no systematic research extracting discriminant features from retinal vessels for classifications. This paper introduces new methods for feature extraction from multifractal spectra of retinal vessels for classification. Two publicly available retinal vascular image databases are used for the experiments, and the proposed methods have produced accuracies of 85.5% and 77% for classification of healthy and diabetic retinal vasculatures. Experiments show that classification with multiple fractal features produces better rates compared with methods using a single fractal dimension value. In addition to this, experiments also show that classification accuracy can be affected by the accuracy of vessel segmentation algorithms.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E; Moran, Emilio
2008-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E.; Moran, Emilio
2009-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin. PMID:19789716
Ralston, Barbara E.; Davis, Philip A.; Weber, Robert M.; Rundall, Jill M.
2008-01-01
A vegetation database of the riparian vegetation located within the Colorado River ecosystem (CRE), a subsection of the Colorado River between Glen Canyon Dam and the western boundary of Grand Canyon National Park, was constructed using four-band image mosaics acquired in May 2002. A digital line scanner was flown over the Colorado River corridor in Arizona by ISTAR Americas, using a Leica ADS-40 digital camera to acquire a digital surface model and four-band image mosaics (blue, green, red, and near-infrared) for vegetation mapping. The primary objective of this mapping project was to develop a digital inventory map of vegetation to enable patch- and landscape-scale change detection, and to establish randomized sampling points for ground surveys of terrestrial fauna (principally, but not exclusively, birds). The vegetation base map was constructed through a combination of ground surveys to identify vegetation classes, image processing, and automated supervised classification procedures. Analysis of the imagery and subsequent supervised classification involved multiple steps to evaluate band quality, band ratios, and vegetation texture and density. Identification of vegetation classes involved collection of cover data throughout the river corridor and subsequent analysis using two-way indicator species analysis (TWINSPAN). Vegetation was classified into six vegetation classes, following the National Vegetation Classification Standard, based on cover dominance. This analysis indicated that total area covered by all vegetation within the CRE was 3,346 ha. Considering the six vegetation classes, the sparse shrub (SS) class accounted for the greatest amount of vegetation (627 ha) followed by Pluchea (PLSE) and Tamarix (TARA) at 494 and 366 ha, respectively. The wetland (WTLD) and Prosopis-Acacia (PRGL) classes both had similar areal cover values (227 and 213 ha, respectively). Baccharis-Salix (BAXX) was the least represented at 94 ha. Accuracy assessment of the supervised classification determined that accuracies varied among vegetation classes from 90% to 49%. Causes for low accuracies were similar spectral signatures among vegetation classes. Fuzzy accuracy assessment improved classification accuracies such that Federal mapping standards of 80% accuracies for all classes were met. The scale used to quantify vegetation adequately meets the needs of the stakeholder group. Increasing the scale to meet the U.S. Geological Survey (USGS)-National Park Service (NPS)National Mapping Program's minimum mapping unit of 0.5 ha is unwarranted because this scale would reduce the resolution of some classes (e.g., seep willow/coyote willow would likely be combined with tamarisk). While this would undoubtedly improve classification accuracies, it would not provide the community-level information about vegetation change that would benefit stakeholders. The identification of vegetation classes should follow NPS mapping approaches to complement the national effort and should incorporate the alternative analysis for community identification that is being incorporated into newer NPS mapping efforts. National Vegetation Classification is followed in this report for association- to formation-level categories. Accuracies could be improved by including more environmental variables such as stage elevation in the classification process and incorporating object-based classification methods. Another approach that may address the heterogeneous species issue and classification is to use spectral mixing analysis to estimate the fractional cover of species within each pixel and better quantify the cover of individual species that compose a cover class. Varying flights to capture vegetation at different times of the year might also help separate some vegetation classes, though the cost may be prohibitive. Lastly, photointerpretation instead of automated mapping could be tried. Photointerpretation would likely not improve accuracies in this case, howev
ChariDingari, Narahara; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P.; Kumar, G. Manoj
2012-01-01
Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real world applications, e.g. quality assurance and process monitoring. Specifically, variability in sample, system and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a non-linear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), due to its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data – highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples as well as in related areas of forensic and biological sample analysis. PMID:22292496
Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam
2012-01-15
Feature selection is a key concept in machine learning for microarray datasets, where features represented by probesets are typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions in regards to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.
Dingari, Narahara Chari; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P; Kumar Gundawar, Manoj
2012-03-20
Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real-world applications, e.g., quality assurance and process monitoring. Specifically, variability in sample, system, and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a nonlinear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that the application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), because of its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data-highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples, as well as in related areas of forensic and biological sample analysis.
Bladder segmentation in MR images with watershed segmentation and graph cut algorithm
NASA Astrophysics Data System (ADS)
Blaffert, Thomas; Renisch, Steffen; Schadewaldt, Nicole; Schulz, Heinrich; Wiemker, Rafael
2014-03-01
Prostate and cervix cancer diagnosis and treatment planning that is based on MR images benefit from superior soft tissue contrast compared to CT images. For these images an automatic delineation of the prostate or cervix and the organs at risk such as the bladder is highly desirable. This paper describes a method for bladder segmentation that is based on a watershed transform on high image gradient values and gray value valleys together with the classification of watershed regions into bladder contents and tissue by a graph cut algorithm. The obtained results are superior if compared to a simple region-after-region classification.
Spatial modeling and classification of corneal shape.
Marsolo, Keith; Twa, Michael; Bullimore, Mark A; Parthasarathy, Srinivasan
2007-03-01
One of the most promising applications of data mining is in biomedical data used in patient diagnosis. Any method of data analysis intended to support the clinical decision-making process should meet several criteria: it should capture clinically relevant features, be computationally feasible, and provide easily interpretable results. In an initial study, we examined the feasibility of using Zernike polynomials to represent biomedical instrument data in conjunction with a decision tree classifier to distinguish between the diseased and non-diseased eyes. Here, we provide a comprehensive follow-up to that work, examining a second representation, pseudo-Zernike polynomials, to determine whether they provide any increase in classification accuracy. We compare the fidelity of both methods using residual root-mean-square (rms) error and evaluate accuracy using several classifiers: neural networks, C4.5 decision trees, Voting Feature Intervals, and Naïve Bayes. We also examine the effect of several meta-learning strategies: boosting, bagging, and Random Forests (RFs). We present results comparing accuracy as it relates to dataset and transformation resolution over a larger, more challenging, multi-class dataset. They show that classification accuracy is similar for both data transformations, but differs by classifier. We find that the Zernike polynomials provide better feature representation than the pseudo-Zernikes and that the decision trees yield the best balance of classification accuracy and interpretability.
AVHRR channel selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Mapping land cover of large regions often requires processing of satellite images collected from several time periods at many spectral wavelength channels. However, manipulating and processing large amounts of image data increases the complexity and time, and hence the cost, that it takes to produce a land cover map. Very few studies have evaluated the importance of individual Advanced Very High Resolution Radiometer (AVHRR) channels for discriminating cover types, especially the thermal channels (channels 3, 4 and 5). Studies rarely perform a multi-year analysis to determine the impact of inter-annual variability on the classification results. We evaluated 5 years of AVHRR data using combinations of the original AVHRR spectral channels (1-5) to determine which channels are most important for cover type discrimination, yet stabilize inter-annual variability. Particular attention was placed on the channels in the thermal portion of the spectrum. Fourteen cover types over the entire state of Colorado were evaluated using a supervised classification approach on all two-, three-, four- and five-channel combinations for seven AVHRR biweekly composite datasets covering the entire growing season for each of 5 years. Results show that all three of the major portions of the electromagnetic spectrum represented by the AVHRR sensor are required to discriminate cover types effectively and stabilize inter-annual variability. Of the two-channel combinations, channels 1 (red visible) and 2 (near-infrared) had, by far, the highest average overall accuracy (72.2%), yet the inter-annual classification accuracies were highly variable. Including a thermal channel (channel 4) significantly increased the average overall classification accuracy by 5.5% and stabilized interannual variability. Each of the thermal channels gave similar classification accuracies; however, because of the problems in consistently interpreting channel 3 data, either channel 4 or 5 was found to be a more appropriate choice. Substituting the thermal channel with a single elevation layer resulted in equivalent classification accuracies and inter-annual variability.
Activity classification using the GENEA: optimum sampling frequency and number of axes.
Zhang, Shaoyan; Murray, Peter; Zillmer, Ruediger; Eston, Roger G; Catt, Michael; Rowlands, Alex V
2012-11-01
The GENEA shows high accuracy for classification of sedentary, household, walking, and running activities when sampling at 80 Hz on three axes. It is not known whether it is possible to decrease this sampling frequency and/or the number of axes without detriment to classification accuracy. The purpose of this study was to compare the classification rate of activities on the basis of data from a single axis, two axes, and three axes, with sampling rates ranging from 5 to 80 Hz. Sixty participants (age, 49.4 yr (6.5 yr); BMI, 24.6 kg·m (3.4 kg·m)) completed 10-12 semistructured activities in the laboratory and outdoor environment while wearing a GENEA accelerometer on the right wrist. We analyzed data from single axis, dual axes, and three axes at sampling rates of 5, 10, 20, 40, and 80 Hz. Mathematical models based on features extracted from mean, SD, fast Fourier transform, and wavelet decomposition were built, which combined one of the numbers of axes with one of the sampling rates to classify activities into sedentary, household, walking, and running. Classification accuracy was high irrespective of the number of axes for data collected at 80 Hz (96.93% ± 0.97%), 40 Hz (97.4% ± 0.73%), 20 Hz (96.86% ± 1.12%), and 10 Hz (97.01% ± 1.01%) but dropped for data collected at 5 Hz (94.98% ± 1.36%). Sampling frequencies >10 Hz and/or more than one axis of measurement were not associated with greater classification accuracy. Lower sampling rates and measurement of a single axis would result in a lower data load, longer battery life, and higher efficiency of data processing. Further research should investigate whether a lower sampling rate and a single axis affects classification accuracy when considering a wider range of activities.
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
Accurate crop classification using hierarchical genetic fuzzy rule-based systems
NASA Astrophysics Data System (ADS)
Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.
2014-10-01
This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications that the competitors. Moreover, the runtime requirements for producing the thematic map was orders of magnitude lower than the respective for the competitors.
The effect of superior auditory skills on vocal accuracy
NASA Astrophysics Data System (ADS)
Amir, Ofer; Amir, Noam; Kishon-Rabin, Liat
2003-02-01
The relationship between auditory perception and vocal production has been typically investigated by evaluating the effect of either altered or degraded auditory feedback on speech production in either normal hearing or hearing-impaired individuals. Our goal in the present study was to examine this relationship in individuals with superior auditory abilities. Thirteen professional musicians and thirteen nonmusicians, with no vocal or singing training, participated in this study. For vocal production accuracy, subjects were presented with three tones. They were asked to reproduce the pitch using the vowel /a/. This procedure was repeated three times. The fundamental frequency of each production was measured using an autocorrelation pitch detection algorithm designed for this study. The musicians' superior auditory abilities (compared to the nonmusicians) were established in a frequency discrimination task reported elsewhere. Results indicate that (a) musicians had better vocal production accuracy than nonmusicians (production errors of 1/2 a semitone compared to 1.3 semitones, respectively); (b) frequency discrimination thresholds explain 43% of the variance of the production data, and (c) all subjects with superior frequency discrimination thresholds showed accurate vocal production; the reverse relationship, however, does not hold true. In this study we provide empirical evidence to the importance of auditory feedback on vocal production in listeners with superior auditory skills.
NASA Technical Reports Server (NTRS)
Spruce, J. P.; Smoot, James; Ellis, Jean; Hilbert, Kent; Swann, Roberta
2012-01-01
This paper discusses the development and implementation of a geospatial data processing method and multi-decadal Landsat time series for computing general coastal U.S. land-use and land-cover (LULC) classifications and change products consisting of seven classes (water, barren, upland herbaceous, non-woody wetland, woody upland, woody wetland, and urban). Use of this approach extends the observational period of the NOAA-generated Coastal Change and Analysis Program (C-CAP) products by almost two decades, assuming the availability of one cloud free Landsat scene from any season for each targeted year. The Mobile Bay region in Alabama was used as a study area to develop, demonstrate, and validate the method that was applied to derive LULC products for nine dates at approximate five year intervals across a 34-year time span, using single dates of data for each classification in which forests were either leaf-on, leaf-off, or mixed senescent conditions. Classifications were computed and refined using decision rules in conjunction with unsupervised classification of Landsat data and C-CAP value-added products. Each classification's overall accuracy was assessed by comparing stratified random locations to available reference data, including higher spatial resolution satellite and aerial imagery, field survey data, and raw Landsat RGBs. Overall classification accuracies ranged from 83 to 91% with overall Kappa statistics ranging from 0.78 to 0.89. The accuracies are comparable to those from similar, generalized LULC products derived from C-CAP data. The Landsat MSS-based LULC product accuracies are similar to those from Landsat TM or ETM+ data. Accurate classifications were computed for all nine dates, yielding effective results regardless of season. This classification method yielded products that were used to compute LULC change products via additive GIS overlay techniques.
ERIC Educational Resources Information Center
Guiberson, Mark; Rodriguez, Barbara L.; Dale, Philip S.
2011-01-01
Purpose: The purpose of the current study was to examine the concurrent validity and classification accuracy of 3 parent report measures of language development in Spanish-speaking toddlers. Method: Forty-five Spanish-speaking parents and their 2-year-old children participated. Twenty-three children had expressive language delays (ELDs) as…
ERIC Educational Resources Information Center
Guiberson, Mark; Rodriguez, Barbara L.
2010-01-01
Purpose: To describe the concurrent validity and classification accuracy of 2 Spanish parent surveys of language development, the Spanish Ages and Stages Questionnaire (ASQ; Squires, Potter, & Bricker, 1999) and the Pilot Inventario-III (Pilot INV-III; Guiberson, 2008a). Method: Forty-eight Spanish-speaking parents of preschool-age children…
ERIC Educational Resources Information Center
Zytowski, Donald G.
1972-01-01
Owing to the uncertainty concerning the concurrent validity of the SVIB and the KOIS, a test of accuracy of classification of men in the occupations common to both inventories was undertaken. The results suggest that neither show any less validity than had been shown in separate studies previously. (Author)
ERIC Educational Resources Information Center
Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.
2016-01-01
In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…
Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance
ERIC Educational Resources Information Center
Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James
2017-01-01
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…
Factors Affecting the Item Parameter Estimation and Classification Accuracy of the DINA Model
ERIC Educational Resources Information Center
de la Torre, Jimmy; Hong, Yuan; Deng, Weiling
2010-01-01
To better understand the statistical properties of the deterministic inputs, noisy "and" gate cognitive diagnosis (DINA) model, the impact of several factors on the quality of the item parameter estimates and classification accuracy was investigated. Results of the simulation study indicate that the fully Bayes approach is most accurate when the…
NASA Astrophysics Data System (ADS)
Hänsch, Ronny; Hellwich, Olaf
2018-04-01
Random Forests have continuously proven to be one of the most accurate, robust, as well as efficient methods for the supervised classification of images in general and polarimetric synthetic aperture radar data in particular. While the majority of previous work focus on improving classification accuracy, we aim for accelerating the training of the classifier as well as its usage during prediction while maintaining its accuracy. Unlike other approaches we mainly consider algorithmic changes to stay as much as possible independent of platform and programming language. The final model achieves an approximately 60 times faster training and a 500 times faster prediction, while the accuracy is only marginally decreased by roughly 1 %.
How reliable and accurate is the AO/OTA comprehensive classification for adult long-bone fractures?
Meling, Terje; Harboe, Knut; Enoksen, Cathrine H; Aarflot, Morten; Arthursson, Astvaldur J; Søreide, Kjetil
2012-07-01
Reliable classification of fractures is important for treatment allocation and study comparisons. The overall accuracy of scoring applied to a general population of fractures is little known. This study aimed to investigate the accuracy and reliability of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for adult long-bone fractures and identify factors associated with poor coding agreement. Adults (>16 years) with long-bone fractures coded in a Fracture and Dislocation Registry at the Stavanger University Hospital during the fiscal year 2008 were included. An unblinded reference code dataset was generated for the overall accuracy assessment by two experienced orthopedic trauma surgeons. Blinded analysis of intrarater reliability was performed by rescoring and of interrater reliability by recoding of a randomly selected fracture sample. Proportion of agreement (PA) and kappa (κ) statistics are presented. Uni- and multivariate logistic regression analyses of factors predicting accuracy were performed. During the study period, 949 fractures were included and coded by 26 surgeons. For the intrarater analysis, overall agreements were κ = 0.67 (95% confidence interval [CI]: 0.64-0.70) and PA 69%. For interrater assessment, κ = 0.67 (95% CI: 0.62-0.72) and PA 69%. The accuracy of surgeons' blinded recoding was κ = 0.68 (95% CI: 0.65- 0.71) and PA 68%. Fracture type, frequency of the fracture, and segment fractured significantly influenced accuracy whereas the coder's experience did not. Both the reliability and accuracy of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for long-bone fractures ranged from substantial to excellent. Variations in coding accuracy seem to be related more to the fracture itself than the surgeon. Diagnostic study, level I.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.
NASA Astrophysics Data System (ADS)
Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.
2017-09-01
Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Rollinvariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, target orientation diversity effect makes PolSAR images understanding and interpretation difficult. Only using the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the followed classification accuracy. To address this problem, this work firstly focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combing both the selected new hidden polarimetric features in rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. For AIRSAR data, the overall classification accuracy with the proposed classification scheme is 94.91 %, while that with the conventional classification scheme is 93.70 %. Moreover, for multi-temporal UAVSAR data, the averaged overall classification accuracy with the proposed classification scheme is up to 97.08 %, which is much higher than the 87.79 % from the conventional classification scheme. Furthermore, for multitemporal PolSAR data, the proposed classification scheme can achieve better robustness. The comparison studies also clearly demonstrate that mining and utilization of hidden polarimetric features and information in the rotation domain can gain the added benefits for PolSAR land cover classification and provide a new vision for PolSAR image interpretation and application.
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech based data in the classification of Parkinson disease (PD) has been shown to provide an effect, non-invasive mode of classification in recent years. Thus, there has been an increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classifications is to reduce noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effect, the ability to invoke instance selection has been seldomly examined. In this study, a PD classification algorithm was proposed and examined that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied for selecting optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to generate trained samples from the collected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. This proposed method was examined using a more recently deposited public datasets and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the highest degree of improved classification accuracy (29.44%) compared with the other algorithm that was examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit a higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method could improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
Validity of an Integrative Method for Processing Physical Activity Data.
Ellingson, Laura D; Schwabacher, Isaac J; Kim, Youngwon; Welk, Gregory J; Cook, Dane B
2016-08-01
Accurate assessments of both physical activity and sedentary behaviors are crucial to understand the health consequences of movement patterns and to track changes over time and in response to interventions. The study evaluates the validity of an integrative, machine learning method for processing activity monitor data in relation to a portable metabolic analyzer (Oxycon mobile [OM]) and direct observation (DO). Forty-nine adults (age 18-40 yr) each completed 5-min bouts of 15 activities ranging from sedentary to vigorous intensity in a laboratory setting while wearing ActiGraph (AG) on the hip, activPAL on the thigh, and OM. Estimates of energy expenditure (EE) and categorization of activity intensity were obtained from the AG processed with Lyden's sojourn (SOJ) method and from our new sojourns including posture (SIP) method, which integrates output from the AG and activPAL. Classification accuracy and estimates of EE were then compared with criterion measures (OM and DO) using confusion matrices and comparisons of the mean absolute error of log-transformed data (MAE ln Q). The SIP method had a higher overall classification agreement (79%, 95% CI = 75%-82%) than the SOJ (56%, 95% CI = 52%-59%) based on DO. Compared with OM, estimates of EE from SIP had lower mean absolute error of log-transformed data than SOJ for light-intensity (0.21 vs 0.27), moderate-intensity (0.33 vs 0.42), and vigorous-intensity (0.16 vs 0.35) activities. The SIP method was superior to SOJ for distinguishing between sedentary and light activities as well as estimating EE at higher intensities. Thus, SIP is recommended for research in which accuracy of measurement across the full range of activity intensities is of interest.
Tahir, Muhammad; Jan, Bismillah; Hayat, Maqsood; Shah, Shakir Ullah; Amin, Muhammad
2018-04-01
Discriminative and informative feature extraction is the core requirement for accurate and efficient classification of protein subcellular localization images so that drug development could be more effective. The objective of this paper is to propose a novel modification in the Threshold Adjacency Statistics technique and enhance its discriminative power. In this work, we utilized Threshold Adjacency Statistics from a novel perspective to enhance its discrimination power and efficiency. In this connection, we utilized seven threshold ranges to produce seven distinct feature spaces, which are then used to train seven SVMs. The final prediction is obtained through the majority voting scheme. The proposed ETAS-SubLoc system is tested on two benchmark datasets using 5-fold cross-validation technique. We observed that our proposed novel utilization of TAS technique has improved the discriminative power of the classifier. The ETAS-SubLoc system has achieved 99.2% accuracy, 99.3% sensitivity and 99.1% specificity for Endogenous dataset outperforming the classical Threshold Adjacency Statistics technique. Similarly, 91.8% accuracy, 96.3% sensitivity and 91.6% specificity values are achieved for Transfected dataset. Simulation results validated the effectiveness of ETAS-SubLoc that provides superior prediction performance compared to the existing technique. The proposed methodology aims at providing support to pharmaceutical industry as well as research community towards better drug designing and innovation in the fields of bioinformatics and computational biology. The implementation code for replicating the experiments presented in this paper is available at: https://drive.google.com/file/d/0B7IyGPObWbSqRTRMcXI2bG5CZWs/view?usp=sharing. Copyright © 2018 Elsevier B.V. All rights reserved.
Calès, P; Boursier, J; Lebigot, J; de Ledinghen, V; Aubé, C; Hubert, I; Oberti, F
2017-04-01
In chronic hepatitis C, the European Association for the Study of the Liver and the Asociacion Latinoamericana para el Estudio del Higado recommend performing transient elastography plus a blood test to diagnose significant fibrosis; test concordance confirms the diagnosis. To validate this rule and improve it by combining a blood test, FibroMeter (virus second generation, Echosens, Paris, France) and transient elastography (constitutive tests) into a single combined test, as suggested by the American Association for the Study of Liver Diseases and the Infectious Diseases Society of America. A total of 1199 patients were included in an exploratory set (HCV, n = 679) or in two validation sets (HCV ± HIV, HBV, n = 520). Accuracy was mainly evaluated by correct diagnosis rate for severe fibrosis (pathological Metavir F ≥ 3, primary outcome) by classical test scores or a fibrosis classification, reflecting Metavir staging, as a function of test concordance. Score accuracy: there were no significant differences between the blood test (75.7%), elastography (79.1%) and the combined test (79.4%) (P = 0.066); the score accuracy of each test was significantly (P < 0.001) decreased in discordant vs. concordant tests. Classification accuracy: combined test accuracy (91.7%) was significantly (P < 0.001) increased vs. the blood test (84.1%) and elastography (88.2%); accuracy of each constitutive test was significantly (P < 0.001) decreased in discordant vs. concordant tests but not with combined test: 89.0 vs. 92.7% (P = 0.118). Multivariate analysis for accuracy showed an interaction between concordance and fibrosis level: in the 1% of patients with full classification discordance and severe fibrosis, non-invasive tests were unreliable. The advantage of combined test classification was confirmed in the validation sets. The concordance recommendation is validated. A combined test, expressed in classification instead of score, improves this rule and validates the recommendation of a combined test, avoiding 99% of biopsies, and offering precise staging. © 2017 John Wiley & Sons Ltd.
Simulation of seagrass bed mapping by satellite images based on the radiative transfer model
NASA Astrophysics Data System (ADS)
Sagawa, Tatsuyuki; Komatsu, Teruhisa
2015-06-01
Seagrass and seaweed beds play important roles in coastal marine ecosystems. They are food sources and habitats for many marine organisms, and influence the physical, chemical, and biological environment. They are sensitive to human impacts such as reclamation and pollution. Therefore, their management and preservation are necessary for a healthy coastal environment. Satellite remote sensing is a useful tool for mapping and monitoring seagrass beds. The efficiency of seagrass mapping, seagrass bed classification in particular, has been evaluated by mapping accuracy using an error matrix. However, mapping accuracies are influenced by coastal environments such as seawater transparency, bathymetry, and substrate type. Coastal management requires sufficient accuracy and an understanding of mapping limitations for monitoring coastal habitats including seagrass beds. Previous studies are mainly based on case studies in specific regions and seasons. Extensive data are required to generalise assessments of classification accuracy from case studies, which has proven difficult. This study aims to build a simulator based on a radiative transfer model to produce modelled satellite images and assess the visual detectability of seagrass beds under different transparencies and seagrass coverages, as well as to examine mapping limitations and classification accuracy. Our simulations led to the development of a model of water transparency and the mapping of depth limits and indicated the possibility for seagrass density mapping under certain ideal conditions. The results show that modelling satellite images is useful in evaluating the accuracy of classification and that establishing seagrass bed monitoring by remote sensing is a reliable tool.
Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.
2013-01-01
Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of over fitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how does classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters, however does not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based frame work for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. These tools in integration can provide an efficient approach for getting information about improvements in the classification accuracy and helps in refining training data set. We have illustrated framework for investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images. Techniki klasyfikacji są szeroko wykorzystywane w różnych aplikacjach teledetekcyjnych, w których poprawna klasyfikacja pikseli stanowi poważne wyzwanie. Podejście tradycyjne wykorzystujące różnego rodzaju parametry statystyczne nie zapewnia efektywnej wizualizacji. Wielce obiecujące wydaje się zastosowanie do klasyfikacji narzędzi do eksploracji danych. W artykule zaproponowano podejście bazujące na wizualnej analizie eksploracyjnej, wykorzystujące takie narzędzia typu open source jak WEKA i PREFUSE. Wymienione narzędzia ułatwiają korektę pół treningowych i efektywnie wspomagają poprawę dokładności klasyfikacji. Działanie metody sprawdzono wykorzystując wpływ różnych metod resampling na zachowanie dokładności radiometrycznej i uzyskując najlepsze wyniki dla metody bilinearnej (BL).
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.
1982-01-01
An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.
NASA Technical Reports Server (NTRS)
Lillesand, T. M.; Werth, L. F. (Principal Investigator)
1980-01-01
A 25% improvement in average classification accuracy was realized by processing double-date vs. single-date data. Under the spectrally and spatially complex site conditions characterizing the geographical area used, further improvement in wetland classification accuracy is apparently precluded by the spectral and spatial resolution restrictions of the LANDSAT MSS. Full scene analysis of scanning densitometer data extracted from scale infrared photography failed to permit discrimination of many wetland and nonwetland cover types. When classification of photographic data was limited to wetland areas only, much more detailed and accurate classification could be made. The integration of conventional image interpretation (to simply delineate wetland boundaries) and machine assisted classification (to discriminate among cover types present within the wetland areas) appears to warrant further research to study the feasibility and cost of extending this methodology over a large area using LANDSAT and/or small scale photography.
An SVM-Based Solution for Fault Detection in Wind Turbines
Santos, Pedro; Villa, Luisa F.; Reñones, Aníbal; Bustillo, Andres; Maudes, Jesús
2015-01-01
Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets. PMID:25760051
Characterization and delineation of caribou habitat on Unimak Island using remote sensing techniques
NASA Astrophysics Data System (ADS)
Atkinson, Brain M.
The assessment of herbivore habitat quality is traditionally based on quantifying the forages available to the animal across their home range through ground-based techniques. While these methods are highly accurate, they can be time-consuming and highly expensive, especially for herbivores that occupy vast spatial landscapes. The Unimak Island caribou herd has been decreasing in the last decade at rates that have prompted discussion of management intervention. Frequent inclement weather in this region of Alaska has provided for little opportunity to study the caribou forage habitat on Unimak Island. The overall objectives of this study were two-fold 1) to assess the feasibility of using high-resolution color and near-infrared aerial imagery to map the forage distribution of caribou habitat on Unimak Island and 2) to assess the use of a new high-resolution multispectral satellite imagery platform, RapidEye, and use of the "red-edge" spectral band on vegetation classification accuracy. Maximum likelihood classification algorithms were used to create land cover maps in aerial and satellite imagery. Accuracy assessments and transformed divergence values were produced to assess vegetative spectral information and classification accuracy. By using RapidEye and aerial digital imagery in a hierarchical supervised classification technique, we were able to produce a high resolution land cover map of Unimak Island. We obtained overall accuracy rates of 71.4 percent which are comparable to other land cover maps using RapidEye imagery. The "red-edge" spectral band included in the RapidEye imagery provides additional spectral information that allows for a more accurate overall classification, raising overall accuracy 5.2 percent.
Texture as a basis for acoustic classification of substrate in the nearshore region
NASA Astrophysics Data System (ADS)
Dennison, A.; Wattrus, N. J.
2016-12-01
Segmentation and classification of substrate type from two locations in Lake Superior, are predicted using multivariate statistical processing of textural measures derived from shallow-water, high-resolution multibeam bathymetric data. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical characteristic of a sonar backscatter mosaic is dependent on substrate type. While classifying the bottom-type on the basis on backscatter alone can accurately predict and map bottom-type, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing can capture the pertinent details about the bottom-type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features, and provide the basis for an accurate classification scheme. Preliminary results from an analysis of bathymetric data and ground-truth samples collected from the Amnicon River, Superior, Wisconsin, and the Lester River, Duluth, Minnesota, demonstrate the ability to process and develop a novel classification scheme of the bottom type in two geomorphologically distinct areas.
CW-SSIM kernel based random forest for image classification
NASA Astrophysics Data System (ADS)
Fan, Guangzhe; Wang, Zhou; Wang, Jiheng
2010-07-01
Complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use hand-written digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and the inner product kernels. Empirical evidences show that the proposed method is superior in its classification power. We also compared our proposed approach with the direct random forest method without kernel and the popular kernel-learning method support vector machine. Our test results based on both simulated and realworld data suggest that the proposed approach works superior to traditional methods without the feature selection procedure.
NASA Astrophysics Data System (ADS)
Susanti, Yuliana; Zukhronah, Etik; Pratiwi, Hasih; Respatiwulan; Sri Sulistijowati, H.
2017-11-01
To achieve food resilience in Indonesia, food diversification by exploring potentials of local food is required. Corn is one of alternating staple food of Javanese society. For that reason, corn production needs to be improved by considering the influencing factors. CHAID and CRT are methods of data mining which can be used to classify the influencing variables. The present study seeks to dig up information on the potentials of local food availability of corn in regencies and cities in Java Island. CHAID analysis yields four classifications with accuracy of 78.8%, while CRT analysis yields seven classifications with accuracy of 79.6%.
Deep multi-scale convolutional neural network for hyperspectral image classification
NASA Astrophysics Data System (ADS)
Zhang, Feng-zhe; Yang, Xia
2018-04-01
In this paper, we proposed a multi-scale convolutional neural network for hyperspectral image classification task. Firstly, compared with conventional convolution, we utilize multi-scale convolutions, which possess larger respective fields, to extract spectral features of hyperspectral image. We design a deep neural network with a multi-scale convolution layer which contains 3 different convolution kernel sizes. Secondly, to avoid overfitting of deep neural network, dropout is utilized, which randomly sleeps neurons, contributing to improve the classification accuracy a bit. In addition, new skills like ReLU in deep learning is utilized in this paper. We conduct experiments on University of Pavia and Salinas datasets, and obtained better classification accuracy compared with other methods.
Rey, Sergio J.; Stephens, Philip A.; Laura, Jason R.
2017-01-01
Large data contexts present a number of challenges to optimal choropleth map classifiers. Application of optimal classifiers to a sample of the attribute space is one proposed solution. The properties of alternative sampling-based classification methods are examined through a series of Monte Carlo simulations. The impacts of spatial autocorrelation, number of desired classes, and form of sampling are shown to have significant impacts on the accuracy of map classifications. Tradeoffs between improved speed of the sampling approaches and loss of accuracy are also considered. The results suggest the possibility of guiding the choice of classification scheme as a function of the properties of large data sets.
Decoding Multiple Sound Categories in the Human Temporal Cortex Using High Resolution fMRI
Zhang, Fengqing; Wang, Ji-Ping; Kim, Jieun; Parrish, Todd; Wong, Patrick C. M.
2015-01-01
Perception of sound categories is an important aspect of auditory perception. The extent to which the brain’s representation of sound categories is encoded in specialized subregions or distributed across the auditory cortex remains unclear. Recent studies using multivariate pattern analysis (MVPA) of brain activations have provided important insights into how the brain decodes perceptual information. In the large existing literature on brain decoding using MVPA methods, relatively few studies have been conducted on multi-class categorization in the auditory domain. Here, we investigated the representation and processing of auditory categories within the human temporal cortex using high resolution fMRI and MVPA methods. More importantly, we considered decoding multiple sound categories simultaneously through multi-class support vector machine-recursive feature elimination (MSVM-RFE) as our MVPA tool. Results show that for all classifications the model MSVM-RFE was able to learn the functional relation between the multiple sound categories and the corresponding evoked spatial patterns and classify the unlabeled sound-evoked patterns significantly above chance. This indicates the feasibility of decoding multiple sound categories not only within but across subjects. However, the across-subject variation affects classification performance more than the within-subject variation, as the across-subject analysis has significantly lower classification accuracies. Sound category-selective brain maps were identified based on multi-class classification and revealed distributed patterns of brain activity in the superior temporal gyrus and the middle temporal gyrus. This is in accordance with previous studies, indicating that information in the spatially distributed patterns may reflect a more abstract perceptual level of representation of sound categories. Further, we show that the across-subject classification performance can be significantly improved by averaging the fMRI images over items, because the irrelevant variations between different items of the same sound category are reduced and in turn the proportion of signals relevant to sound categorization increases. PMID:25692885
Decoding multiple sound categories in the human temporal cortex using high resolution fMRI.
Zhang, Fengqing; Wang, Ji-Ping; Kim, Jieun; Parrish, Todd; Wong, Patrick C M
2015-01-01
Perception of sound categories is an important aspect of auditory perception. The extent to which the brain's representation of sound categories is encoded in specialized subregions or distributed across the auditory cortex remains unclear. Recent studies using multivariate pattern analysis (MVPA) of brain activations have provided important insights into how the brain decodes perceptual information. In the large existing literature on brain decoding using MVPA methods, relatively few studies have been conducted on multi-class categorization in the auditory domain. Here, we investigated the representation and processing of auditory categories within the human temporal cortex using high resolution fMRI and MVPA methods. More importantly, we considered decoding multiple sound categories simultaneously through multi-class support vector machine-recursive feature elimination (MSVM-RFE) as our MVPA tool. Results show that for all classifications the model MSVM-RFE was able to learn the functional relation between the multiple sound categories and the corresponding evoked spatial patterns and classify the unlabeled sound-evoked patterns significantly above chance. This indicates the feasibility of decoding multiple sound categories not only within but across subjects. However, the across-subject variation affects classification performance more than the within-subject variation, as the across-subject analysis has significantly lower classification accuracies. Sound category-selective brain maps were identified based on multi-class classification and revealed distributed patterns of brain activity in the superior temporal gyrus and the middle temporal gyrus. This is in accordance with previous studies, indicating that information in the spatially distributed patterns may reflect a more abstract perceptual level of representation of sound categories. Further, we show that the across-subject classification performance can be significantly improved by averaging the fMRI images over items, because the irrelevant variations between different items of the same sound category are reduced and in turn the proportion of signals relevant to sound categorization increases.
A new self-report inventory of dyslexia for students: criterion and construct validity.
Tamboer, Peter; Vorst, Harrie C M
2015-02-01
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. All together, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Less than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.
Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...
Using New Models to Analyze Complex Regularities of the World: Commentary on Musso et al. (2013)
ERIC Educational Resources Information Center
Nokelainen, Petri; Silander, Tomi
2014-01-01
This commentary to the recent article by Musso et al. (2013) discusses issues related to model fitting, comparison of classification accuracy of generative and discriminative models, and two (or more) cultures of data modeling. We start by questioning the extremely high classification accuracy with an empirical data from a complex domain. There is…
ERIC Educational Resources Information Center
Ball, Carrie R.; O'Connor, Edward
2016-01-01
This study examined the predictive validity and classification accuracy of two commonly used universal screening measures relative to a statewide achievement test. Results indicated that second-grade performance on oral reading fluency and the Measures of Academic Progress (MAP), together with special education status, explained 68% of the…
Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2013-01-01
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
Subject-Adaptive Real-Time Sleep Stage Classification Based on Conditional Random Field
Luo, Gang; Min, Wanli
2007-01-01
Sleep staging is the pattern recognition task of classifying sleep recordings into sleep stages. This task is one of the most important steps in sleep analysis. It is crucial for the diagnosis and treatment of various sleep disorders, and also relates closely to brain-machine interfaces. We report an automatic, online sleep stager using electroencephalogram (EEG) signal based on a recently-developed statistical pattern recognition method, conditional random field, and novel potential functions that have explicit physical meanings. Using sleep recordings from human subjects, we show that the average classification accuracy of our sleep stager almost approaches the theoretical limit and is about 8% higher than that of existing systems. Moreover, for a new subject snew with limited training data Dnew, we perform subject adaptation to improve classification accuracy. Our idea is to use the knowledge learned from old subjects to obtain from Dnew a regulated estimate of CRF’s parameters. Using sleep recordings from human subjects, we show that even without any Dnew, our sleep stager can achieve an average classification accuracy of 70% on snew. This accuracy increases with the size of Dnew and eventually becomes close to the theoretical limit. PMID:18693884
NASA Astrophysics Data System (ADS)
Zhang, Zhiming; de Wulf, Robert R.; van Coillie, Frieke M. B.; Verbeke, Lieven P. C.; de Clercq, Eva M.; Ou, Xiaokun
2011-01-01
Mapping of vegetation using remote sensing in mountainous areas is considerably hampered by topographic effects on the spectral response pattern. A variety of topographic normalization techniques have been proposed to correct these illumination effects due to topography. The purpose of this study was to compare six different topographic normalization methods (Cosine correction, Minnaert correction, C-correction, Sun-canopy-sensor correction, two-stage topographic normalization, and slope matching technique) for their effectiveness in enhancing vegetation classification in mountainous environments. Since most of the vegetation classes in the rugged terrain of the Lancang Watershed (China) did not feature a normal distribution, artificial neural networks (ANNs) were employed as a classifier. Comparing the ANN classifications, none of the topographic correction methods could significantly improve ETM+ image classification overall accuracy. Nevertheless, at the class level, the accuracy of pine forest could be increased by using topographically corrected images. On the contrary, oak forest and mixed forest accuracies were significantly decreased by using corrected images. The results also showed that none of the topographic normalization strategies was satisfactorily able to correct for the topographic effects in severely shadowed areas.
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide although a significant proportion of it can be cured if it is detected early. In recent decades, technology called microarray takes an important role in the diagnosis of cancer. By using data mining technique, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. The characteristic of microarray data is small sample but it has huge dimension. Since that, there is a challenge for researcher to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the usage of Principal Component Analysis (PCA) as a dimension reduction method along with Support Vector Method (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation and then evaluation and analysis conducted on term of both accuracy and running time. The result showed that the scheme can obtained 100% accuracy for Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In term of running time, PCA greatly reduced the running time for every data sets.
Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong
2018-01-01
The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.
NASA Astrophysics Data System (ADS)
Huang, Xin; Chen, Huijun; Gong, Jianya
2018-01-01
Spaceborne multi-angle images with a high-resolution are capable of simultaneously providing spatial details and three-dimensional (3D) information to support detailed and accurate classification of complex urban scenes. In recent years, satellite-derived digital surface models (DSMs) have been increasingly utilized to provide height information to complement spectral properties for urban classification. However, in such a way, the multi-angle information is not effectively exploited, which is mainly due to the errors and difficulties of the multi-view image matching and the inaccuracy of the generated DSM over complex and dense urban scenes. Therefore, it is still a challenging task to effectively exploit the available angular information from high-resolution multi-angle images. In this paper, we investigate the potential for classifying urban scenes based on local angular properties characterized from high-resolution ZY-3 multi-view images. Specifically, three categories of angular difference features (ADFs) are proposed to describe the angular information at three levels (i.e., pixel, feature, and label levels): (1) ADF-pixel: the angular information is directly extrapolated by pixel comparison between the multi-angle images; (2) ADF-feature: the angular differences are described in the feature domains by comparing the differences between the multi-angle spatial features (e.g., morphological attribute profiles (APs)). (3) ADF-label: label-level angular features are proposed based on a group of urban primitives (e.g., buildings and shadows), in order to describe the specific angular information related to the types of primitive classes. In addition, we utilize spatial-contextual information to refine the multi-level ADF features using superpixel segmentation, for the purpose of alleviating the effects of salt-and-pepper noise and representing the main angular characteristics within a local area. The experiments on ZY-3 multi-angle images confirm that the proposed ADF features can effectively improve the accuracy of urban scene classification, with a significant increase in overall accuracy (3.8-11.7%) compared to using the spectral bands alone. Furthermore, the results indicated the superiority of the proposed ADFs in distinguishing between the spectrally similar and complex man-made classes, including roads and various types of buildings (e.g., high buildings, urban villages, and residential apartments).
Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi
2015-01-01
The main purpose of traditional classification algorithms on bioinformatics application is to acquire better classification accuracy. However, these algorithms cannot meet the requirement that minimises the average misclassification cost. In this paper, a new algorithm of cost-sensitive regularised extreme learning machine (CS-RELM) was proposed by using probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy of a group of small sample which higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into CS-RELM algorithm to further reduce the average misclassification cost. By using Colon Tumour dataset and SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, cost-sensitive support vector machine (SVM). The results of experiments show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and made more credible classification decision than others.
Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and objectoriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. ?? 2008 by MDPI.
Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757
Delineation of marsh types of the Texas coast from Corpus Christi Bay to the Sabine River in 2010
Enwright, Nicholas M.; Hartley, Stephen B.; Brasher, Michael G.; Visser, Jenneke M.; Mitchell, Michael K.; Ballard, Bart M.; Parr, Mark W.; Couvillion, Brady R.; Wilson, Barry C.
2014-01-01
Coastal zone managers and researchers often require detailed information regarding emergent marsh vegetation types for modeling habitat capacities and needs of marsh-reliant wildlife (such as waterfowl and alligator). Detailed information on the extent and distribution of marsh vegetation zones throughout the Texas coast has been historically unavailable. In response, the U.S. Geological Survey, in cooperation and collaboration with the U.S. Fish and Wildlife Service via the Gulf Coast Joint Venture, Texas A&M University-Kingsville, the University of Louisiana-Lafayette, and Ducks Unlimited, Inc., has produced a classification of marsh vegetation types along the middle and upper Texas coast from Corpus Christi Bay to the Sabine River. This study incorporates approximately 1,000 ground reference locations collected via helicopter surveys in coastal marsh areas and about 2,000 supplemental locations from fresh marsh, water, and “other” (that is, nonmarsh) areas. About two-thirds of these data were used for training, and about one-third were used for assessing accuracy. Decision-tree analyses using Rulequest See5 were used to classify emergent marsh vegetation types by using these data, multitemporal satellite-based multispectral imagery from 2009 to 2011, a bare-earth digital elevation model (DEM) based on airborne light detection and ranging (lidar), alternative contemporary land cover classifications, and other spatially explicit variables believed to be important for delineating the extent and distribution of marsh vegetation communities. Image objects were generated from segmentation of high-resolution airborne imagery acquired in 2010 and were used to refine the classification. The classification is dated 2010 because the year is both the midpoint of the multitemporal satellite-based imagery (2009–11) classified and the date of the high-resolution airborne imagery that was used to develop image objects. Overall accuracy corrected for bias (accuracy estimate incorporates true marginal proportions) was 91 percent (95 percent confidence interval [CI]: 89.2–92.8), with a kappa statistic of 0.79 (95 percent CI: 0.77–0.81). The classification performed best for saline marsh (user’s accuracy 81.5 percent; producer’s accuracy corrected for bias 62.9 percent) but showed a lesser ability to discriminate intermediate marsh (user’s accuracy 47.7 percent; producer’s accuracy corrected for bias 49.5 percent). Because of confusion in intermediate and brackish marsh classes, an alternative classification containing only three marsh types was created in which intermediate and brackish marshes were combined into a single class. Image objects were reattributed by using this alternative three-marsh-type classification. Overall accuracy, corrected for bias, of this more general classification was 92.4 percent (95 percent CI: 90.7–94.2), and the kappa statistic was 0.83 (95 percent CI: 0.81–0.85). Mean user’s accuracy for marshes within the four-marsh-type and three-marsh-type classifications was 65.4 percent and 75.6 percent, respectively, whereas mean producer’s accuracy was 56.7 percent and 65.1 percent, respectively. This study provides a more objective and repeatable method for classifying marsh types of the middle and upper Texas coast at an extent and greater level of detail than previously available for the study area. The seamless classification produced through this work is now available to help State agencies (such as the Texas Parks and Wildlife Department) and landscape-scale conservation partnerships (such as the Gulf Coast Prairie Landscape Conservation Cooperative and the Gulf Coast Joint Venture) to develop and (or) refine conservation plans targeting priority natural resources. Moreover, these data may improve projections of landscape change and serve as a baseline for monitoring future changes resulting from chronic and episodic stressors.
DOT National Transportation Integrated Search
2014-09-01
Vehicle classification is an important traffic parameter for transportation planning and infrastructure : management. Length-based vehicle classification from dual loop detectors is among the lowest cost : technologies commonly used for collecting th...
Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System.
de Moura, Karina de O A; Balbinot, Alexandre
2018-05-01
A few prosthetic control systems in the scientific literature obtain pattern recognition algorithms adapted to changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, which has as its fundamental objective to estimate unavailable measures based on other available measures, is being used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of the movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system to maintain the classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Results of movement classification were presented comparing the usual classification techniques with the method of the degraded signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. The proposed system without using classifier retraining techniques recovered of mean classification accuracy was of 4% to 38% for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was the classification using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method recovered the classification accuracy after the degradations, reaching an average of 5.7% below the classification of the clean signal, that is the signal without the contaminants or the original signal. Moreover, the proposed intelligent technique minimizes the impact of the motion classification caused by signal contamination related to degrading events over time. There are improvements in the virtual sensor model and in the algorithm optimization that need further development to provide an increase the clinical application of myoelectric prostheses but already presents robust results to enable research with virtual sensors on biological signs with stochastic behavior.
Virtual Sensor of Surface Electromyography in a New Extensive Fault-Tolerant Classification System
Balbinot, Alexandre
2018-01-01
A few prosthetic control systems in the scientific literature obtain pattern recognition algorithms adapted to changes that occur in the myoelectric signal over time and, frequently, such systems are not natural and intuitive. These are some of the several challenges for myoelectric prostheses for everyday use. The concept of the virtual sensor, which has as its fundamental objective to estimate unavailable measures based on other available measures, is being used in other fields of research. The virtual sensor technique applied to surface electromyography can help to minimize these problems, typically related to the degradation of the myoelectric signal that usually leads to a decrease in the classification accuracy of the movements characterized by computational intelligent systems. This paper presents a virtual sensor in a new extensive fault-tolerant classification system to maintain the classification accuracy after the occurrence of the following contaminants: ECG interference, electrode displacement, movement artifacts, power line interference, and saturation. The Time-Varying Autoregressive Moving Average (TVARMA) and Time-Varying Kalman filter (TVK) models are compared to define the most robust model for the virtual sensor. Results of movement classification were presented comparing the usual classification techniques with the method of the degraded signal replacement and classifier retraining. The experimental results were evaluated for these five noise types in 16 surface electromyography (sEMG) channel degradation case studies. The proposed system without using classifier retraining techniques recovered of mean classification accuracy was of 4% to 38% for electrode displacement, movement artifacts, and saturation noise. The best mean classification considering all signal contaminants and channel combinations evaluated was the classification using the retraining method, replacing the degraded channel by the virtual sensor TVARMA model. This method recovered the classification accuracy after the degradations, reaching an average of 5.7% below the classification of the clean signal, that is the signal without the contaminants or the original signal. Moreover, the proposed intelligent technique minimizes the impact of the motion classification caused by signal contamination related to degrading events over time. There are improvements in the virtual sensor model and in the algorithm optimization that need further development to provide an increase the clinical application of myoelectric prostheses but already presents robust results to enable research with virtual sensors on biological signs with stochastic behavior. PMID:29723994
Modeling misregistration and related effects on multispectral classification
NASA Technical Reports Server (NTRS)
Billingsley, F. C.
1981-01-01
The effects of misregistration on the multispectral classification accuracy when the scene registration accuracy is relaxed from 0.3 to 0.5 pixel are investigated. Noise, class separability, spatial transient response, and field size are considered simultaneously with misregistration in their effects on accuracy. Any noise due to the scene, sensor, or to the analog/digital conversion, causes a finite fraction of the measurements to fall outside of the classification limits, even within nominally uniform fields. Misregistration causes field borders in a given band or set of bands to be closer than expected to a given pixel, causing additional pixels to be misclassified due to the mixture of materials in the pixel. Simplified first order models of the various effects are presented, and are used to estimate the performance to be expected.
Delavarian, Mona; Towhidkhah, Farzad; Gharibzadeh, Shahriar; Dibajnia, Parvin
2011-07-12
Automatic classification of different behavioral disorders with many similarities (e.g. in symptoms) by using an automated approach will help psychiatrists to concentrate on correct disorder and its treatment as soon as possible, to avoid wasting time on diagnosis, and to increase the accuracy of diagnosis. In this study, we tried to differentiate and classify (diagnose) 306 children with many similar symptoms and different behavioral disorders such as ADHD, depression, anxiety, comorbid depression and anxiety and conduct disorder with high accuracy. Classification was based on the symptoms and their severity. With examining 16 different available classifiers, by using "Prtools", we have proposed nearest mean classifier as the most accurate classifier with 96.92% accuracy in this research. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer assisted malaria parasite characterization and classification using machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker controlled watershed transformation and subsequently total ninety six features describing shape-size and texture of erythrocytes are extracted in respect to the parasitemia infected versus non-infected cells. Ninety four features are found to be statistically significant in discriminating six classes. Here a feature selection-cum-classification scheme has been devised by combining F-statistic, statistical learning techniques i.e., Bayesian learning and support vector machine (SVM) in order to provide the higher classification accuracy using best set of discriminating features. Results show that Bayesian approach provides the highest accuracy i.e., 84% for malaria classification by selecting 19 most significant features while SVM provides highest accuracy i.e., 83.5% with 9 most significant features. Finally, the performance of these two classifiers under feature selection framework has been compared toward malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Hammann, Mark Gregory
The fusion of electro-optical (EO) multi-spectral satellite imagery with Synthetic Aperture Radar (SAR) data was explored with the working hypothesis that the addition of multi-band SAR will increase the land-cover (LC) classification accuracy compared to EO alone. Three satellite sources for SAR imagery were used: X-band from TerraSAR-X, C-band from RADARSAT-2, and L-band from PALSAR. Images from the RapidEye satellites were the source of the EO imagery. Imagery from the GeoEye-1 and WorldView-2 satellites aided the selection of ground truth. Three study areas were chosen: Wad Medani, Sudan; Campinas, Brazil; and Fresno- Kings Counties, USA. EO imagery were radiometrically calibrated, atmospherically compensated, orthorectifed, co-registered, and clipped to a common area of interest (AOI). SAR imagery were radiometrically calibrated, and geometrically corrected for terrain and incidence angle by converting to ground range and Sigma Naught (?0). The original SAR HH data were included in the fused image stack after despeckling with a 3x3 Enhanced Lee filter. The variance and Gray-Level-Co-occurrence Matrix (GLCM) texture measures of contrast, entropy, and correlation were derived from the non-despeckled SAR HH bands. Data fusion was done with layer stacking and all data were resampled to a common spatial resolution. The Support Vector Machine (SVM) decision rule was used for the supervised classifications. Similar LC classes were identified and tested for each study area. For Wad Medani, nine classes were tested: low and medium intensity urban, sparse forest, water, barren ground, and four agriculture classes (fallow, bare agricultural ground, green crops, and orchards). For Campinas, Brazil, five generic classes were tested: urban, agriculture, forest, water, and barren ground. For the Fresno-Kings Counties location 11 classes were studied: three generic classes (urban, water, barren land), and eight specific crops. In all cases the addition of SAR to EO resulted in higher overall classification accuracies. In many cases using more than a single SAR band also improved the classification accuracy. There was no single best SAR band for all cases; for specific study areas or LC classes, different SAR bands were better. For Wad Medani, the overall accuracy increased nearly 25% over EO by using all three SAR bands and GLCM texture. For Campinas, the improvement over EO was 4.3%; the large areas of vegetation were classified by EO with good accuracy. At Fresno-Kings Counties, EO+SAR fusion improved the overall classification accuracy by 7%. For times or regions where EO is not available due to extended cloud cover, classification with SAR is often the only option; note that SAR alone typically results in lower classification accuracies than when using EO or EO-SAR fusion. Fusion of EO and SAR was especially important to improve the separability of orchards from other crops, and separating urban areas with buildings from bare soil; those classes are difficult to accurately separate with EO. The outcome of this dissertation contributes to the understanding of the benefits of combining data from EO imagery with different SAR bands and SAR derived texture data to identify different LC classes. In times of increased public and private budget constraints and industry consolidation, this dissertation provides insight as to which band packages could be most useful for increased accuracy in LC classification.
Tissue classification for laparoscopic image understanding based on multispectral texture analysis
NASA Astrophysics Data System (ADS)
Zhang, Yan; Wirkert, Sebastian J.; Iszatt, Justin; Kenngott, Hannes; Wagner, Martin; Mayer, Benjamin; Stock, Christian; Clancy, Neil T.; Elson, Daniel S.; Maier-Hein, Lena
2016-03-01
Intra-operative tissue classification is one of the prerequisites for providing context-aware visualization in computer-assisted minimally invasive surgeries. As many anatomical structures are difficult to differentiate in conventional RGB medical images, we propose a classification method based on multispectral image patches. In a comprehensive ex vivo study we show (1) that multispectral imaging data is superior to RGB data for organ tissue classification when used in conjunction with widely applied feature descriptors and (2) that combining the tissue texture with the reflectance spectrum improves the classification performance. Multispectral tissue analysis could thus evolve as a key enabling technique in computer-assisted laparoscopy.
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P.; Parsey, Ramin V.; Laine, Andrew F.
2013-01-01
Multimodality classification of Alzheimer’s disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%). PMID:24576927
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P; Parsey, Ramin V; Laine, Andrew F
2012-01-01
Multimodality classification of Alzheimer's disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%).
A neural network approach for enhancing information extraction from multispectral image data
Liu, J.; Shao, G.; Zhu, H.; Liu, S.
2005-01-01
A back-propagation artificial neural network (ANN) was applied to classify multispectral remote sensing imagery data. The classification procedure included four steps: (i) noisy training that adds minor random variations to the sampling data to make the data more representative and to reduce the training sample size; (ii) iterative or multi-tier classification that reclassifies the unclassified pixels by making a subset of training samples from the original training set, which means the neural model can focus on fewer classes; (iii) spectral channel selection based on neural network weights that can distinguish the relative importance of each channel in the classification process to simplify the ANN model; and (iv) voting rules that adjust the accuracy of classification and produce outputs of different confidence levels. The Purdue Forest, located west of Purdue University, West Lafayette, Indiana, was chosen as the test site. The 1992 Landsat thematic mapper imagery was used as the input data. High-quality airborne photographs of the same Lime period were used for the ground truth. A total of 11 land use and land cover classes were defined, including water, broadleaved forest, coniferous forest, young forest, urban and road, and six types of cropland-grassland. The experiment, indicated that the back-propagation neural network application was satisfactory in distinguishing different land cover types at US Geological Survey levels II-III. The single-tier classification reached an overall accuracy of 85%. and the multi-tier classification an overall accuracy of 95%. For the whole test, region, the final output of this study reached an overall accuracy of 87%. ?? 2005 CASI.
Crabtree, Nathaniel M; Moore, Jason H; Bowyer, John F; George, Nysia I
2017-01-01
A computational evolution system (CES) is a knowledge discovery engine that can identify subtle, synergistic relationships in large datasets. Pareto optimization allows CESs to balance accuracy with model complexity when evolving classifiers. Using Pareto optimization, a CES is able to identify a very small number of features while maintaining high classification accuracy. A CES can be designed for various types of data, and the user can exploit expert knowledge about the classification problem in order to improve discrimination between classes. These characteristics give CES an advantage over other classification and feature selection algorithms, particularly when the goal is to identify a small number of highly relevant, non-redundant biomarkers. Previously, CESs have been developed only for binary class datasets. In this study, we developed a multi-class CES. The multi-class CES was compared to three common feature selection and classification algorithms: support vector machine (SVM), random k-nearest neighbor (RKNN), and random forest (RF). The algorithms were evaluated on three distinct multi-class RNA sequencing datasets. The comparison criteria were run-time, classification accuracy, number of selected features, and stability of selected feature set (as measured by the Tanimoto distance). The performance of each algorithm was data-dependent. CES performed best on the dataset with the smallest sample size, indicating that CES has a unique advantage since the accuracy of most classification methods suffer when sample size is small. The multi-class extension of CES increases the appeal of its application to complex, multi-class datasets in order to identify important biomarkers and features.
A novel artificial immune clonal selection classification and rule mining with swarm learning model
NASA Astrophysics Data System (ADS)
Al-Sheshtawi, Khaled A.; Abdul-Kader, Hatem M.; Elsisi, Ashraf B.
2013-06-01
Metaheuristic optimisation algorithms have become popular choice for solving complex problems. By integrating Artificial Immune clonal selection algorithm (CSA) and particle swarm optimisation (PSO) algorithm, a novel hybrid Clonal Selection Classification and Rule Mining with Swarm Learning Algorithm (CS2) is proposed. The main goal of the approach is to exploit and explore the parallel computation merit of Clonal Selection and the speed and self-organisation merits of Particle Swarm by sharing information between clonal selection population and particle swarm. Hence, we employed the advantages of PSO to improve the mutation mechanism of the artificial immune CSA and to mine classification rules within datasets. Consequently, our proposed algorithm required less training time and memory cells in comparison to other AIS algorithms. In this paper, classification rule mining has been modelled as a miltiobjective optimisation problem with predictive accuracy. The multiobjective approach is intended to allow the PSO algorithm to return an approximation to the accuracy and comprehensibility border, containing solutions that are spread across the border. We compared our proposed algorithm classification accuracy CS2 with five commonly used CSAs, namely: AIRS1, AIRS2, AIRS-Parallel, CLONALG, and CSCA using eight benchmark datasets. We also compared our proposed algorithm classification accuracy CS2 with other five methods, namely: Naïve Bayes, SVM, MLP, CART, and RFB. The results show that the proposed algorithm is comparable to the 10 studied algorithms. As a result, the hybridisation, built of CSA and PSO, can develop respective merit, compensate opponent defect, and make search-optimal effect and speed better.
Armutlu, Pelin; Ozdemir, Muhittin E; Uney-Yuksektepe, Fadime; Kavakli, I Halil; Turkay, Metin
2008-10-03
A priori analysis of the activity of drugs on the target protein by computational approaches can be useful in narrowing down drug candidates for further experimental tests. Currently, there are a large number of computational methods that predict the activity of drugs on proteins. In this study, we approach the activity prediction problem as a classification problem and, we aim to improve the classification accuracy by introducing an algorithm that combines partial least squares regression with mixed-integer programming based hyper-boxes classification method, where drug molecules are classified as low active or high active regarding their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules. We first apply our approach by analyzing the activities of widely known inhibitor datasets including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR), Cyclooxygenase-2 (COX-2) with known IC50 values. The results at this stage proved that our approach consistently gives better classification accuracies compared to 63 other reported classification methods such as SVM, Naïve Bayes, where we were able to predict the experimentally determined IC50 values with a worst case accuracy of 96%. To further test applicability of this approach we first created dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy. Our results indicate that this approach can be utilized to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance drug discovery process, but also save time and resources committed.
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Qin, Yu; Zhan, Yu; Wang, Changming; Zhang, Jiacai; Yao, Li; Guo, Xiaojuan; Wu, Xia; Hu, Bin
2016-08-01
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72 %, which is substantially better than that achieved with any single ERP component feature (55.07 % for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47, 4.06, and 16.90 % higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
Orhan, Umut; Erdogmus, Deniz; Roark, Brian; Purwar, Shalini; Hild, Kenneth E.; Oken, Barry; Nezamfar, Hooman; Fried-Oken, Melanie
2013-01-01
Event related potentials (ERP) corresponding to a stimulus in electroencephalography (EEG) can be used to detect the intent of a person for brain computer interfaces (BCI). This paradigm is widely utilized to build letter-by-letter text input systems using BCI. Nevertheless using a BCI-typewriter depending only on EEG responses will not be sufficiently accurate for single-trial operation in general, and existing systems utilize many-trial schemes to achieve accuracy at the cost of speed. Hence incorporation of a language model based prior or additional evidence is vital to improve accuracy and speed. In this paper, we study the effects of Bayesian fusion of an n-gram language model with a regularized discriminant analysis ERP detector for EEG-based BCIs. The letter classification accuracies are rigorously evaluated for varying language model orders as well as number of ERP-inducing trials. The results demonstrate that the language models contribute significantly to letter classification accuracy. Specifically, we find that a BCI-speller supported by a 4-gram language model may achieve the same performance using 3-trial ERP classification for the initial letters of the words and using single trial ERP classification for the subsequent ones. Overall, fusion of evidence from EEG and language models yields a significant opportunity to increase the word rate of a BCI based typing system. PMID:22255652
Ozcift, Akin; Gulten, Arif
2011-12-01
Improving accuracies of machine learning algorithms is vital in designing high performance computer-aided diagnosis (CADx) systems. Researches have shown that a base classifier performance might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performances using Parkinson's, diabetes and heart diseases from literature. While making experiments, first the feature dimension of three datasets is reduced using correlation based feature selection (CFS) algorithm. Second, classification performances of 30 machine learning algorithms are calculated for three datasets. Third, 30 classifier ensembles are constructed based on RF algorithm to assess performances of respective classifiers with the same disease data. All the experiments are carried out with leave-one-out validation strategy and the performances of the 60 algorithms are evaluated using three metrics; classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers succeeded 72.15%, 77.52% and 84.43% average accuracies for diabetes, heart and Parkinson's datasets, respectively. As for RF classifier ensembles, they produced average accuracies of 74.47%, 80.49% and 87.13% for respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve accuracy of miscellaneous machine learning algorithms to design advanced CADx systems. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
With the rapid developments of the sensor technology, high spatial resolution imagery and airborne Lidar point clouds can be captured nowadays, which make classification, extraction, evaluation and analysis of a broad range of object features available. High resolution imagery, Lidar dataset and parcel map can be widely used for classification as information carriers. Therefore, refinement of objects classification is made possible for the urban land cover. The paper presents an approach to object based image analysis (OBIA) combing high spatial resolution imagery and airborne Lidar point clouds. The advanced workflow for urban land cover is designed with four components. Firstly, colour-infrared TrueOrtho photo and laser point clouds were pre-processed to derive the parcel map of water bodies and nDSM respectively. Secondly, image objects are created via multi-resolution image segmentation integrating scale parameter, the colour and shape properties with compactness criterion. Image can be subdivided into separate object regions. Thirdly, image objects classification is performed on the basis of segmentation and a rule set of knowledge decision tree. These objects imagery are classified into six classes such as water bodies, low vegetation/grass, tree, low building, high building and road. Finally, in order to assess the validity of the classification results for six classes, accuracy assessment is performed through comparing randomly distributed reference points of TrueOrtho imagery with the classification results, forming the confusion matrix and calculating overall accuracy and Kappa coefficient. The study area focuses on test site Vaihingen/Enz and a patch of test datasets comes from the benchmark of ISPRS WG III/4 test project. The classification results show higher overall accuracy for most types of urban land cover. Overall accuracy is 89.5% and Kappa coefficient equals to 0.865. The OBIA approach provides an effective and convenient way to combine high resolution imagery and Lidar ancillary data for classification of urban land cover.
Iacucci, Marietta; Trovato, Cristina; Daperno, Marco; Akinola, Oluseyi; Greenwald, David; Gross, Seth A; Hoffman, Arthur; Lee, Jeffrey; Lethebe, Brendan C; Lowerison, Mark; Nayor, Jennifer; Neumann, Helmut; Rath, Timo; Sanduleanu, Silvia; Sharma, Prateek; Kiesslich, Ralf; Ghosh, Subrata; Saltzman, John R
2018-03-23
Prediction of histology of small polyps facilitates colonoscopic treatment. The aims of this study were: 1) to develop a simplified polyp classification, 2) to evaluate its performance in predicting polyp histology, and 3) to evaluate the reproducibility of the classification by trainees using multiplatform endoscopic systems. In phase 1, a new simplified endoscopic classification for polyps - Simplified Identification Method for Polyp Labeling during Endoscopy (SIMPLE) - was created, using the new I-SCAN OE system (Pentax, Tokyo, Japan), by eight international experts. In phase 2, the accuracy, level of confidence, and interobserver agreement to predict polyp histology before and after training, and univariable/multivariable analysis of the endoscopic features, were performed. In phase 3, the reproducibility of SIMPLE by trainees using different endoscopy platforms was evaluated. Using the SIMPLE classification, the accuracy of experts in predicting polyps was 83 % (95 % confidence interval [CI] 77 % - 88 %) before and 94 % (95 %CI 89 % - 97 %) after training ( P = 0.002). The sensitivity, specificity, positive predictive value, and negative predictive value after training were 97 %, 88 %, 95 %, and 91 %. The interobserver agreement of polyp diagnosis improved from 0.46 (95 %CI 0.30 - 0.64) before to 0.66 (95 %CI 0.48 - 0.82) after training. The trainees demonstrated that the SIMPLE classification is applicable across endoscopy platforms, with similar post-training accuracies for narrow-band imaging NBI classification (0.69; 95 %CI 0.64 - 0.73) and SIMPLE (0.71; 95 %CI 0.67 - 0.75). Using the I-SCAN OE system, the new SIMPLE classification demonstrated a high degree of accuracy for adenoma diagnosis, meeting the ASGE PIVI recommendations. We demonstrated that SIMPLE may be used with either I-SCAN OE or NBI. © Georg Thieme Verlag KG Stuttgart · New York.
NASA Astrophysics Data System (ADS)
Suiter, Ashley Elizabeth
Multi-spectral imagery provides a robust and low-cost dataset for assessing wetland extent and quality over broad regions and is frequently used for wetland inventories. However in forested wetlands, hydrology is obscured by tree canopy making it difficult to detect with multi-spectral imagery alone. Because of this, classification of forested wetlands often includes greater errors than that of other wetlands types. Elevation and terrain derivatives have been shown to be useful for modelling wetland hydrology. But, few studies have addressed the use of LiDAR intensity data detecting hydrology in forested wetlands. Due the tendency of LiDAR signal to be attenuated by water, this research proposed the fusion of LiDAR intensity data with LiDAR elevation, terrain data, and aerial imagery, for the detection of forested wetland hydrology. We examined the utility of LiDAR intensity data and determined whether the fusion of Lidar derived data with multispectral imagery increased the accuracy of forested wetland classification compared with a classification performed with only multi-spectral image. Four classifications were performed: Classification A -- All Imagery, Classification B -- All LiDAR, Classification C -- LiDAR without Intensity, and Classification D -- Fusion of All Data. These classifications were performed using random forest and each resulted in a 3-foot resolution thematic raster of forested upland and forested wetland locations in Vermilion County, Illinois. The accuracies of these classifications were compared using Kappa Coefficient of Agreement. Importance statistics produced within the random forest classifier were evaluated in order to understand the contribution of individual datasets. Classification D, which used the fusion of LiDAR and multi-spectral imagery as input variables, had moderate to strong agreement between reference data and classification results. It was found that Classification A performed using all the LiDAR data and its derivatives (intensity, elevation, slope, aspect, curvatures, and Topographic Wetness Index) was the most accurate classification with Kappa: 78.04%, indicating moderate to strong agreement. However, Classification C, performed with LiDAR derivative without intensity data had less agreement than would be expected by chance, indicating that LiDAR contributed significantly to the accuracy of Classification B.
Classification of weld defect based on information fusion technology for radiographic testing system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Hongquan; Liang, Zeming, E-mail: heavenlzm@126.com; Gao, Jianmin
Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster–Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defectmore » feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.« less
NASA Technical Reports Server (NTRS)
Joyce, A. T.
1974-01-01
Significant progress has been made in the classification of surface conditions (land uses) with computer-implemented techniques based on the use of ERTS digital data and pattern recognition software. The supervised technique presently used at the NASA Earth Resources Laboratory is based on maximum likelihood ratioing with a digital table look-up approach to classification. After classification, colors are assigned to the various surface conditions (land uses) classified, and the color-coded classification is film recorded on either positive or negative 9 1/2 in. film at the scale desired. Prints of the film strips are then mosaicked and photographed to produce a land use map in the format desired. Computer extraction of statistical information is performed to show the extent of each surface condition (land use) within any given land unit that can be identified in the image. Evaluations of the product indicate that classification accuracy is well within the limits for use by land resource managers and administrators. Classifications performed with digital data acquired during different seasons indicate that the combination of two or more classifications offer even better accuracy.
Jiang, Hongquan; Liang, Zeming; Gao, Jianmin; Dang, Changying
2016-03-01
Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster-Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
A Unified Classification Framework for FP, DP and CP Data at X-Band in Southern China
NASA Astrophysics Data System (ADS)
Xie, Lei; Zhang, Hong; Li, Hhongzhong; Wang, Chao
2015-04-01
The main objective of this paper is to introduce an unified framework for crop classification in Southern China using data in fully polarimetric (FP), dual-pol (DP) and compact polarimetric (CP) modes. The TerraSAR-X data acquired over the Leizhou Peninsula, South China are used in our experiments. The study site involves four main crops (rice, banana, sugarcane eucalyptus). Through exploring the similarities between data in these three modes, a knowledge-based characteristic space is created and the unified framework is presented. The overall classification accuracies for data in the FP, coherent HH/VV are about 95%, and is about 91% in CP modes, which suggests that the proposed classification scheme is effective and promising. Compared with the Wishart Maximum Likelihood (ML) classifier, the proposed method exhibits higher classification accuracy.
NASA Astrophysics Data System (ADS)
Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore
2017-10-01
This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor, and Random Forest (RF) classifiers. The results demonstrate that the accuracy of SVM and KNN classifiers are the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for minimizing model complexity and interpretation.
NASA Astrophysics Data System (ADS)
Dou, P.
2017-12-01
Guangzhou has experienced a rapid urbanization period called "small change in three years and big change in five years" since the reform of China, resulting in significant land use/cover changes(LUC). To overcome the disadvantages of single classifier for remote sensing image classification accuracy, a multiple classifier system (MCS) is proposed to improve the quality of remote sensing image classification. The new method combines advantages of different learning algorithms, and achieves higher accuracy (88.12%) than any single classifier did. With the proposed MCS, land use/cover (LUC) on Landsat images from 1987 to 2015 was obtained, and the LUCs were used on three watersheds (Shijing river, Chebei stream, and Shahe stream) to estimate the impact of urbanization on water flood. The results show that with the high accuracy LUC, the uncertainty in flood simulations are reduced effectively (for Shijing river, Chebei stream, and Shahe stream, the uncertainty reduced 15.5%, 17.3% and 19.8% respectively).
SVM classifier on chip for melanoma detection.
Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak
2017-07-01
Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a medical low-cost handheld device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented onto a recent FPGA platform using the latest design methodology to be embedded into the proposed device for realizing online efficient melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 from equivalent software implementation on an embedded processor, with 34% of resources utilization and 2 watts for power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resources utilization and power consumption, while achieving high classification accuracy.
Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches.
Çelik, Ufuk; Yurtay, Nilüfer; Koç, Emine Rabia; Tepe, Nermin; Güllüoğlu, Halil; Ertaş, Mustafa
2015-01-01
The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. After we took this main objective into consideration, three different neurologists were required to fill in the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms that are inspired by the biological immune system mechanism that involves significant and distinct capabilities. These algorithms simulate the specialties of the immune system such as discrimination, learning, and the memorizing process in order to be used for classification, optimization, or pattern recognition. According to the results, the accuracy level of the classifier used in this study reached a success continuum ranging from 95% to 99%, except for the inconvenient one that yielded 71% accuracy.
ERIC Educational Resources Information Center
Decker, Dawn M.; Hixson, Michael D.; Shaw, Amber; Johnson, Gloria
2014-01-01
The purpose of this study was to examine whether using a multiple-measure framework yielded better classification accuracy than oral reading fluency (ORF) or maze alone in predicting pass/fail rates for middle-school students on a large-scale reading assessment. Participants were 178 students in Grades 7 and 8 from a Midwestern school district.…
Application of LANDSAT-2 to the management of Delaware's marine and wetland resources
NASA Technical Reports Server (NTRS)
Klemas, V. (Principal Investigator); Bartlett, D.; Philpot, W.; Davis, G.
1976-01-01
The author has identified the following significant results. Digital multispectral classification techniques can be used to discriminate coastal land use and vegetation with 87% to 94% categorization accuracy. Wetlands plant species, representing more detail than U.S.G.S. classification system level 2 categories can be discriminated using LANDSAT data with 85% to 88% accuracy at scales up to 1:24,000.
The SCHEIE Visual Field Grading System
Sankar, Prithvi S.; O’Keefe, Laura; Choi, Daniel; Salowe, Rebecca; Miller-Ellis, Eydie; Lehman, Amanda; Addis, Victoria; Ramakrishnan, Meera; Natesh, Vikas; Whitehead, Gideon; Khachatryan, Naira; O’Brien, Joan
2017-01-01
Objective No method of grading visual field (VF) defects has been widely accepted throughout the glaucoma community. The SCHEIE (Systematic Classification of Humphrey visual fields-Easy Interpretation and Evaluation) grading system for glaucomatous visual fields was created to convey qualitative and quantitative information regarding visual field defects in an objective, reproducible, and easily applicable manner for research purposes. Methods The SCHEIE grading system is composed of a qualitative and quantitative score. The qualitative score consists of designation in one or more of the following categories: normal, central scotoma, paracentral scotoma, paracentral crescent, temporal quadrant, nasal quadrant, peripheral arcuate defect, expansive arcuate, or altitudinal defect. The quantitative component incorporates the Humphrey visual field index (VFI), location of visual defects for superior and inferior hemifields, and blind spot involvement. Accuracy and speed at grading using the qualitative and quantitative components was calculated for non-physician graders. Results Graders had a median accuracy of 96.67% for their qualitative scores and a median accuracy of 98.75% for their quantitative scores. Graders took a mean of 56 seconds per visual field to assign a qualitative score and 20 seconds per visual field to assign a quantitative score. Conclusion The SCHEIE grading system is a reproducible tool that combines qualitative and quantitative measurements to grade glaucomatous visual field defects. The system aims to standardize clinical staging and to make specific visual field defects more easily identifiable. Specific patterns of visual field loss may also be associated with genetic variants in future genetic analysis. PMID:28932621
[Object-oriented aquatic vegetation extracting approach based on visible vegetation indices.
Jing, Ran; Deng, Lei; Zhao, Wen Ji; Gong, Zhao Ning
2016-05-01
Using the estimation of scale parameters (ESP) image segmentation tool to determine the ideal image segmentation scale, the optimal segmented image was created by the multi-scale segmentation method. Based on the visible vegetation indices derived from mini-UAV imaging data, we chose a set of optimal vegetation indices from a series of visible vegetation indices, and built up a decision tree rule. A membership function was used to automatically classify the study area and an aquatic vegetation map was generated. The results showed the overall accuracy of image classification using the supervised classification was 53.7%, and the overall accuracy of object-oriented image analysis (OBIA) was 91.7%. Compared with pixel-based supervised classification method, the OBIA method improved significantly the image classification result and further increased the accuracy of extracting the aquatic vegetation. The Kappa value of supervised classification was 0.4, and the Kappa value based OBIA was 0.9. The experimental results demonstrated that using visible vegetation indices derived from the mini-UAV data and OBIA method extracting the aquatic vegetation developed in this study was feasible and could be applied in other physically similar areas.
Mohammed, Ameer; Zamani, Majid; Bayford, Richard; Demosthenous, Andreas
2017-12-01
In Parkinson's disease (PD), on-demand deep brain stimulation is required so that stimulation is regulated to reduce side effects resulting from continuous stimulation and PD exacerbation due to untimely stimulation. Also, the progressive nature of PD necessitates the use of dynamic detection schemes that can track the nonlinearities in PD. This paper proposes the use of dynamic feature extraction and dynamic pattern classification to achieve dynamic PD detection taking into account the demand for high accuracy, low computation, and real-time detection. The dynamic feature extraction and dynamic pattern classification are selected by evaluating a subset of feature extraction, dimensionality reduction, and classification algorithms that have been used in brain-machine interfaces. A novel dimensionality reduction technique, the maximum ratio method (MRM) is proposed, which provides the most efficient performance. In terms of accuracy and complexity for hardware implementation, a combination having discrete wavelet transform for feature extraction, MRM for dimensionality reduction, and dynamic k-nearest neighbor for classification was chosen as the most efficient. It achieves a classification accuracy of 99.29%, an F1-score of 97.90%, and a choice probability of 99.86%.