On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
To address the problem of on-line measurement for algae classification, this work studied a method of algae classification and concentration determination based on discrete three-dimensional fluorescence spectra. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, discrete three-dimensional standard spectra of the five categories were built, and the recognition, classification and concentration prediction of algae categories were realized by coupling the discrete three-dimensional fluorescence spectra with non-negative weighted least squares linear regression analysis. The results show that similarities between the discrete three-dimensional standard spectra of different categories were reduced and that the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. Compared with the chlorophyll a fluorescence excitation spectra method, the recognition accuracy rate in pure samples by discrete three-dimensional fluorescence spectra improved by 1.38%, and the recovery rate and classification accuracy in pure diatom samples improved by 34.1% and 46.8%, respectively; the recognition accuracy rate of mixed samples improved by 26.1%, the recovery rate of mixed samples with Chlorophyta by 37.8%, and the classification accuracy of mixed samples with diatoms by 54.6%.
Compensatory neurofuzzy model for discrete data classification in biomedical
NASA Astrophysics Data System (ADS)
Ceylan, Rahime
2015-03-01
Biomedical data fall into two main groups: signals and discrete data. Studies in this area therefore address either biomedical signal classification or biomedical discrete data classification. There are artificial intelligence models for the classification of ECG, EMG or EEG signals. Likewise, many models exist in the literature for the classification of discrete data, taken as sample values such as the results of blood analysis or biopsy in a medical process. Not every algorithm achieves a high accuracy rate in classifying both signals and discrete data. In this study, a compensatory neurofuzzy network model is presented for the classification of discrete data in the biomedical pattern recognition area. The compensatory neurofuzzy network is a hybrid, binary classifier. In this system, the parameters of the fuzzy systems are updated by the backpropagation algorithm. The classifier model was applied to two benchmark datasets (the Wisconsin Breast Cancer dataset and the Pima Indian Diabetes dataset). Experimental studies show that the compensatory neurofuzzy network model achieved a 96.11% accuracy rate on the breast cancer dataset and a 69.08% accuracy rate on the diabetes dataset with only 10 iterations.
ERIC Educational Resources Information Center
Daniels, Brian; Volpe, Robert J.; Fabiano, Gregory A.; Briesch, Amy M.
2017-01-01
This study examines the classification accuracy and teacher acceptability of a problem-focused screener for academic and disruptive behavior problems, which is directly linked to evidence-based intervention. Participants included 39 classroom teachers from 2 public school districts in the Northeastern United States. Teacher ratings were obtained…
Activity classification using the GENEA: optimum sampling frequency and number of axes.
Zhang, Shaoyan; Murray, Peter; Zillmer, Ruediger; Eston, Roger G; Catt, Michael; Rowlands, Alex V
2012-11-01
The GENEA shows high accuracy for classification of sedentary, household, walking, and running activities when sampling at 80 Hz on three axes. It is not known whether it is possible to decrease this sampling frequency and/or the number of axes without detriment to classification accuracy. The purpose of this study was to compare the classification rate of activities on the basis of data from a single axis, two axes, and three axes, with sampling rates ranging from 5 to 80 Hz. Sixty participants (age, 49.4 yr (6.5 yr); BMI, 24.6 kg·m⁻² (3.4 kg·m⁻²)) completed 10-12 semistructured activities in the laboratory and outdoor environment while wearing a GENEA accelerometer on the right wrist. We analyzed data from single axis, dual axes, and three axes at sampling rates of 5, 10, 20, 40, and 80 Hz. Mathematical models based on features extracted from mean, SD, fast Fourier transform, and wavelet decomposition were built, which combined one of the numbers of axes with one of the sampling rates to classify activities into sedentary, household, walking, and running. Classification accuracy was high irrespective of the number of axes for data collected at 80 Hz (96.93% ± 0.97%), 40 Hz (97.4% ± 0.73%), 20 Hz (96.86% ± 1.12%), and 10 Hz (97.01% ± 1.01%) but dropped for data collected at 5 Hz (94.98% ± 1.36%). Sampling frequencies >10 Hz and/or more than one axis of measurement were not associated with greater classification accuracy. Lower sampling rates and measurement of a single axis would result in a lower data load, longer battery life, and higher efficiency of data processing. Further research should investigate whether a lower sampling rate and a single axis affects classification accuracy when considering a wider range of activities.
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys.
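The gap between resubstitution and cross-validation accuracy noted in the abstract above can be illustrated with a small sketch. To stay dependency-free it swaps in a 1-nearest-neighbour classifier for the classification trees used in the study, and the six data points are invented for illustration; none of this reproduces the authors' models or data.

```python
def nearest_label(x, points):
    """Label of the training point closest to x (1-nearest-neighbour)."""
    return min(points, key=lambda p: abs(p[0] - x))[1]


def resubstitution_accuracy(data):
    """Accuracy when the model is scored on the data it was built from."""
    return sum(nearest_label(x, data) == y for x, y in data) / len(data)


def loo_cv_accuracy(data):
    """Leave-one-out cross-validation: each point is predicted from the rest."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += nearest_label(x, rest) == y
    return hits / len(data)


# Hypothetical (predictor value, species presence) pairs.
data = [
    (0.0, "absent"), (1.0, "absent"), (2.1, "present"),
    (3.0, "absent"), (4.2, "present"), (5.0, "present"),
]
print(resubstitution_accuracy(data), loo_cv_accuracy(data))
```

In this toy case every point is its own nearest neighbour, so the resubstitution accuracy is perfect, while leave-one-out scoring misclassifies the two points that sit next to the opposite class. That is the kind of optimism the abstract warns about.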
Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2013-01-01
Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…
Di-codon Usage for Gene Classification
NASA Astrophysics Data System (ADS)
Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
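As a rough illustration of the di-codon usage feature described in the abstract above, the following sketch counts in-frame codon pairs in a coding sequence and converts them to relative frequencies. The function name `dicodon_usage`, the overlapping pair counting and the toy sequence are assumptions made for illustration; the abstract does not give the paper's exact feature definition.

```python
from collections import Counter


def dicodon_usage(seq: str) -> dict[str, float]:
    """Relative frequency of each in-frame codon pair (6-mer) in seq.

    Assumption: pairs overlap, i.e. codons (1,2), (2,3), (3,4), ...
    """
    seq = seq.upper()
    # Split into complete codons, dropping any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    pairs = Counter(a + b for a, b in zip(codons, codons[1:]))
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()} if total else {}


# Toy sequence with codons ATG GCC ATG GCC TAA.
usage = dicodon_usage("ATGGCCATGGCCTAA")
print(usage)
```

The resulting frequency dictionary could then be turned into a fixed-length feature vector (one slot per possible codon pair) before being passed to a classifier such as an SVM.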
Optimization of the ANFIS using a genetic algorithm for physical work rate classification.
Habibi, Ehsanollah; Salehi, Mina; Yadegarfar, Ghasem; Taheri, Ali
2018-03-13
Recently, a new method was proposed for physical work rate classification based on an adaptive neuro-fuzzy inference system (ANFIS). This study aims to present a genetic algorithm (GA)-optimized ANFIS model for a highly accurate classification of physical work rate. Thirty healthy men participated in this study. Directly measured heart rate and oxygen consumption of the participants in the laboratory were used for training the ANFIS classifier model in MATLAB version 8.0.0 using a hybrid algorithm. A similar process was done using the GA as an optimization technique. The accuracy, sensitivity and specificity of the ANFIS classifier model were increased successfully. The mean accuracy of the model was increased from 92.95 to 97.92%. Also, the calculated root mean square error of the model was reduced from 5.4186 to 3.1882. The maximum estimation error of the optimized ANFIS during the network testing process was ± 5%. The GA can be effectively used for ANFIS optimization and leads to an accurate classification of physical work rate. In addition to high accuracy, simple implementation and inter-individual variability consideration are two other advantages of the presented model.
Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms
Zhang, Zhiwen; Duan, Feng; Zhou, Xin; Meng, Zixuan
2017-01-01
Motor imagery (MI) electroencephalograph (EEG) signals are widely applied in brain-computer interface (BCI). However, classified MI states are limited, and their classification accuracy rates are low because of the characteristics of nonlinearity and nonstationarity. This study proposes a novel MI pattern recognition system that is based on complex algorithms for classifying MI EEG signals. In electrooculogram (EOG) artifact preprocessing, band-pass filtering is performed to obtain the frequency band of MI-related signals, and then, canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) is used for EOG artifact preprocessing. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction by incorporating the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches is used to classify four anisomerous states, namely, imaginary movements with the left hand, right foot, and right shoulder and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex algorithm identification method can significantly improve the identification rate of the minority samples and the overall classification performance. PMID:28874909
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classify patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
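The reweighting and weighted majority vote described in the abstract above can be sketched as follows. The update shown is the common AdaBoost-style rule for labels in {-1, +1}; the abstract does not name the boosting variant used, so this choice, the function names and the four-subject example are all assumptions.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost-style reweighting step for labels in {-1, +1}.

    weights must sum to 1. Returns (alpha, new_weights), where alpha is
    the vote weight given to the tree that produced y_pred.
    """
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified subjects are up-weighted, correct ones down-weighted.
    raw = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


def weighted_vote(alphas, predictions):
    """Combine per-tree predictions (each -1 or +1) for one subject."""
    score = sum(a * p for a, p in zip(alphas, predictions))
    return 1 if score >= 0 else -1


y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]  # the second subject is misclassified
alpha, new_w = adaboost_round([0.25] * 4, y_true, y_pred)
print(alpha, new_w)
```

After the update the single misclassified subject carries half of the total weight, so the next tree in the sequence is pushed to classify it correctly.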
AVNM: A Voting based Novel Mathematical Rule for Image Classification.
Vidyarthi, Ankit; Mittal, Namita
2016-12-01
In machine learning, the accuracy of a system depends on its classification results, and classification accuracy plays an imperative role in various domains. Non-parametric classifiers such as K-Nearest Neighbor (KNN) are the most widely used classifiers for pattern analysis. Despite its ease of use, simplicity and effectiveness, the main problem with the KNN classifier is selecting the number of nearest neighbors, "k", used in the computation. At present, it is hard to find, using any statistical algorithm, an optimal value of "k" that gives perfect accuracy in terms of a low misclassification error rate. Motivated by this problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. Like KNN, the proposed AVNM rule is non-parametric. AVNM uses a weighted voting mechanism with sample space reduction to learn and predict the class label of an unidentified sample. AVNM is free from the initial selection of a predefined variable and the neighbor selection found in the KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments were run on 10 standard datasets taken from the UCI database and one manually created dataset. The experimental results show that the proposed AVNM rule outperforms the KNN classifier and its variants, and results based on the confusion-matrix accuracy parameter show higher accuracy values with the AVNM rule. The proposed AVNM rule is based on a sample space reduction mechanism for identifying an optimal number of nearest neighbors. AVNM results in better classification accuracy and a lower error rate than the state-of-the-art algorithm, KNN, and its variants. The proposed rule automates nearest neighbor selection and improves the classification rate for the UCI datasets and the manually created dataset.
Madison, Matthew J; Bradshaw, Laine P
2015-06-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other multidimensional measurement models. A priori specifications of which latent characteristics or attributes are measured by each item are a core element of the diagnostic assessment design. This item-attribute alignment, expressed in a Q-matrix, precedes and supports any inference resulting from the application of the diagnostic classification model. This study investigates the effects of Q-matrix design on classification accuracy for the log-linear cognitive diagnosis model. Results indicate that classification accuracy, reliability, and convergence rates improve when the Q-matrix contains isolated information from each measured attribute.
PCA based feature reduction to improve the accuracy of decision tree c4.5 classification
NASA Astrophysics Data System (ADS)
Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.
2018-03-01
Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the construction of the decision tree in terms of removing irrelevant features. A major problem in the decision tree classification process is over-fitting, which results from noisy data and irrelevant features. In turn, over-fitting creates misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is an important issue in classification models; it is intended to remove irrelevant data in order to improve accuracy. A feature reduction framework is used to simplify high-dimensional data to low-dimensional data with non-correlated attributes. In this research, we propose a framework for selecting relevant and non-correlated feature subsets. We consider principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for classification. In experiments conducted on the cervical cancer data set from the UCI repository, with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that the proposed framework enhances classification accuracy robustly, with an accuracy rate of 90.70%.
Rifai Chai; Naik, Ganesh R; Tran, Yvonne; Sai Ho Ling; Craig, Ashley; Nguyen, Hung T
2015-08-01
An electroencephalography (EEG)-based counter measure device could be used for fatigue detection during driving. This paper explores the classification of fatigue and alert states using power spectral density (PSD) as a feature extractor and fuzzy swarm based-artificial neural network (ANN) as a classifier. An independent component analysis of entropy rate bound minimization (ICA-ERBM) is investigated as a novel source separation technique for fatigue classification using EEG analysis. A comparison of the classification accuracy of source separator versus no source separator is presented. Classification performance based on 43 participants without the inclusion of the source separator resulted in an overall sensitivity of 71.67%, a specificity of 75.63% and an accuracy of 73.65%. However, these results were improved after the inclusion of a source separator module, resulting in an overall sensitivity of 78.16%, a specificity of 79.60% and an accuracy of 78.88% (p < 0.05).
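For readers comparing the three figures reported above, a minimal sketch of how sensitivity, specificity and accuracy follow from a binary confusion matrix is given here. The counts are hypothetical and are not taken from the study; treating "fatigue" as the positive class is also an assumption.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 100 fatigue segments and 100 alert segments.
m = binary_metrics(tp=78, fn=22, tn=80, fp=20)
print(m)
```

When the two classes are the same size, accuracy is the mean of sensitivity and specificity. The reported values are consistent with such a balanced design, since (71.67 + 75.63) / 2 = 73.65 and (78.16 + 79.60) / 2 = 78.88.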
ERIC Educational Resources Information Center
Decker, Dawn M.; Hixson, Michael D.; Shaw, Amber; Johnson, Gloria
2014-01-01
The purpose of this study was to examine whether using a multiple-measure framework yielded better classification accuracy than oral reading fluency (ORF) or maze alone in predicting pass/fail rates for middle-school students on a large-scale reading assessment. Participants were 178 students in Grades 7 and 8 from a Midwestern school district.…
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
The Effect of Normalization in Violence Video Classification Performance
NASA Astrophysics Data System (ADS)
Ali, Ashikin; Senan, Norhalina
2017-08-01
Data pre-processing is an important part of data mining. Normalization is a pre-processing stage for any type of problem statement, especially in video classification. Video classification is challenging because of heterogeneous content, large variations in video quality and the complex semantic meanings of the concepts involved. Therefore, a thorough pre-processing stage that includes normalization is expected to aid the robustness of classification performance. Normalization scales all numeric variables into a certain range to make them more meaningful for the subsequent phases of the available data mining techniques. This paper examines the effect of two normalization techniques, min-max normalization and Z-score, on the classification rate in violence video classification using a multi-layer perceptron (MLP) classifier. With min-max normalization to the range [0,1] the accuracy is almost 98%, whereas with min-max normalization to the range [-1,1] the accuracy is 59% and with Z-score the accuracy is 50%.
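The three scalings compared in the abstract above can be sketched in a few lines. The function names, the handling of constant features and the use of the population standard deviation are assumptions for illustration; the abstract does not state which conventions the authors used.

```python
import statistics


def min_max(values, lo=0.0, hi=1.0):
    """Rescale linearly so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        return [lo for _ in values]  # constant feature: map everything to lo
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


def z_score(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


feature = [2.0, 4.0, 6.0, 8.0]
unit = min_max(feature)               # range [0, 1]
signed = min_max(feature, -1.0, 1.0)  # range [-1, 1]
standard = z_score(feature)           # zero mean, unit variance
print(unit, signed, standard)
```

All three are affine transformations of the same feature, so they differ only in the offset and scale that the downstream classifier sees.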
Ensemble of classifiers for confidence-rated classification of NDE signal
NASA Astrophysics Data System (ADS)
Banerjee, Portia; Safdarnejad, Seyed; Udpa, Lalita; Udpa, Satish
2016-02-01
An ensemble of classifiers, in general, aims to improve classification accuracy by combining results from multiple weak hypotheses into a single strong classifier through weighted majority voting. Improved versions of ensembles of classifiers generate self-rated confidence scores, which estimate the reliability of each prediction, and boost the classifier using these confidence-rated predictions. However, such a confidence metric is based only on the rate of correct classification. Although ensembles of classifiers have been widely used in computational intelligence, existing works largely overlook the effect of all factors of unreliability on the confidence of classification. In NDE, classification results are affected by the inherent ambiguity of classification, non-discriminative features, inadequate training samples and measurement noise. In this paper, we extend the existing ensemble classification by maximizing the confidence of every classification decision in addition to minimizing the classification error. Initial results of the approach on data from eddy current inspection show improvement in the classification performance of defect and non-defect indications.
NASA Astrophysics Data System (ADS)
Kurniawan, Dian; Suparti; Sugito
2018-05-01
The population of Indonesia has grown every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia had reached 237.6 million people. To control the population growth rate, the government therefore runs the Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to achieve a prosperous society by controlling births while ensuring control of population growth. The data used in this study are the updated 2016 family data for Semarang city, collected by the National Family Planning Coordinating Board (BKKBN). From these data, classifiers are obtained by kernel discriminant analysis, together with their classification accuracy. The analysis showed that normal kernel discriminant analysis gives 71.05% classification accuracy with 28.95% classification error, whereas triweight kernel discriminant analysis gives 73.68% classification accuracy with 26.32% classification error. For the 2016 data on family planning participation of couples of childbearing age in Semarang City, the triweight kernel discriminant can therefore be considered better than the normal kernel discriminant.
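A minimal one-dimensional sketch of kernel discriminant analysis with the two kernels named in the abstract above is shown here. Each class density is estimated with a kernel density estimate and an observation is assigned to the class with the largest prior times density. The training values, the bandwidth `h=1.0` and the class labels are hypothetical, and the study's actual predictors and bandwidth selection are not given in the abstract.

```python
import math


def normal_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def triweight_kernel(u):
    """Triweight kernel: (35/32)(1 - u^2)^3 on [-1, 1], zero elsewhere."""
    return 35 / 32 * (1 - u * u) ** 3 if abs(u) <= 1 else 0.0


def kde(x, sample, h, kernel):
    """Kernel density estimate at x from a 1-D sample with bandwidth h."""
    return sum(kernel((x - xi) / h) for xi in sample) / (len(sample) * h)


def classify(x, classes, h, kernel):
    """Assign x to the class with the largest prior * estimated density.

    classes maps a label to its list of training values; priors are the
    class proportions in the training data.
    """
    n = sum(len(s) for s in classes.values())
    scores = {label: len(s) / n * kde(x, s, h, kernel)
              for label, s in classes.items()}
    return max(scores, key=scores.get)


# Hypothetical one-predictor training data for two groups.
train = {"user": [1.0, 1.5, 2.0, 2.5], "non-user": [4.0, 4.5, 5.0, 5.5]}
label_normal = classify(2.2, train, h=1.0, kernel=normal_kernel)
label_triweight = classify(4.8, train, h=1.0, kernel=triweight_kernel)
print(label_normal, label_triweight)
```

The triweight kernel has bounded support, so training points farther than one bandwidth from the query contribute nothing, whereas the normal kernel gives every point some weight.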
Characterization and delineation of caribou habitat on Unimak Island using remote sensing techniques
NASA Astrophysics Data System (ADS)
Atkinson, Brain M.
The assessment of herbivore habitat quality is traditionally based on quantifying the forages available to the animal across their home range through ground-based techniques. While these methods are highly accurate, they can be time-consuming and highly expensive, especially for herbivores that occupy vast spatial landscapes. The Unimak Island caribou herd has been decreasing in the last decade at rates that have prompted discussion of management intervention. Frequent inclement weather in this region of Alaska has provided for little opportunity to study the caribou forage habitat on Unimak Island. The overall objectives of this study were two-fold 1) to assess the feasibility of using high-resolution color and near-infrared aerial imagery to map the forage distribution of caribou habitat on Unimak Island and 2) to assess the use of a new high-resolution multispectral satellite imagery platform, RapidEye, and use of the "red-edge" spectral band on vegetation classification accuracy. Maximum likelihood classification algorithms were used to create land cover maps in aerial and satellite imagery. Accuracy assessments and transformed divergence values were produced to assess vegetative spectral information and classification accuracy. By using RapidEye and aerial digital imagery in a hierarchical supervised classification technique, we were able to produce a high resolution land cover map of Unimak Island. We obtained overall accuracy rates of 71.4 percent which are comparable to other land cover maps using RapidEye imagery. The "red-edge" spectral band included in the RapidEye imagery provides additional spectral information that allows for a more accurate overall classification, raising overall accuracy 5.2 percent.
Boursier, Jérôme; Bertrais, Sandrine; Oberti, Frédéric; Gallois, Yves; Fouchard-Hubert, Isabelle; Rousselet, Marie-Christine; Zarski, Jean-Pierre; Calès, Paul
2011-11-30
Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10⁻³) in single expert pathologist. Significant discrepancy (≥ 2FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10⁻³). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10⁻³) or Fibrotest (0.84 ± 0.80, p < 10⁻³). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10⁻³ (p < 10⁻³).
Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10⁻³ (p < 10⁻³). The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test.
2011-01-01
Background: Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods: Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results: In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10⁻³) in single expert pathologist. Significant discrepancy (≥ 2 FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10⁻³). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10⁻³) or Fibrotest (0.84 ± 0.80, p < 10⁻³). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10⁻³ (p < 10⁻³). Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10⁻³ (p < 10⁻³). Conclusions: The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test. PMID:22129438
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
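The information transfer rate quoted in this abstract is commonly computed with the Wolpaw definition. The abstract does not say which definition or trial length the authors used, so the following is only a minimal sketch under that assumption; the function names and the `trial_seconds` parameter are hypothetical.

```python
import math


def wolpaw_bits_per_trial(n_classes: int, accuracy: float) -> float:
    """Bits conveyed by one selection under the Wolpaw ITR definition (assumed here)."""
    if n_classes < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_classes)
    # Each term is skipped when its probability is zero, since p*log2(p) -> 0.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes: int, accuracy: float, trial_seconds: float) -> float:
    """Scale bits per trial to bits per minute; trial_seconds is a hypothetical input."""
    return wolpaw_bits_per_trial(n_classes, accuracy) * 60.0 / trial_seconds
```

Because the reported ITR is an average over subjects, applying this formula to the mean accuracy would not necessarily reproduce the 4.70 bits/minute figure.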
Performance of Activity Classification Algorithms in Free-living Older Adults
Sasaki, Jeffer Eidi; Hickey, Amanda; Staudenmayer, John; John, Dinesh; Kent, Jane A.; Freedson, Patty S.
2015-01-01
Purpose: To compare activity type classification rates of machine learning algorithms trained on laboratory versus free-living accelerometer data in older adults. Methods: Thirty-five older adults (21 F and 14 M; 70.8 ± 4.9 y) performed selected activities in the laboratory while wearing three ActiGraph GT3X+ activity monitors (dominant hip, wrist, and ankle). Monitors were initialized to collect raw acceleration data at a sampling rate of 80 Hz. Fifteen of the participants also wore the GT3X+ in free-living settings and were directly observed for 2-3 hours. Time- and frequency-domain features from acceleration signals of each monitor were used to train Random Forest (RF) and Support Vector Machine (SVM) models to classify five activity types: sedentary, standing, household, locomotion, and recreational activities. All algorithms were trained on lab data (RFLab and SVMLab) and free-living data (RFFL and SVMFL) using 20 s signal sampling windows. Classification accuracy rates of both types of algorithms were tested on free-living data using a leave-one-out technique. Results: Overall classification accuracy rates for the algorithms developed from lab data ranged from 49% (wrist) to 55% (ankle) for the SVMLab algorithms, and from 49% (wrist) to 54% (ankle) for the RFLab algorithms. The classification accuracy rates for SVMFL and RFFL algorithms ranged from 58% (wrist) to 69% (ankle) and from 61% (wrist) to 67% (ankle), respectively. Conclusion: Our algorithms developed on free-living accelerometer data were more accurate in classifying activity type in free-living older adults than our algorithms developed on laboratory accelerometer data. Future studies should consider using free-living accelerometer data to train machine-learning algorithms in older adults. PMID:26673129
Performance of Activity Classification Algorithms in Free-Living Older Adults.
Sasaki, Jeffer Eidi; Hickey, Amanda M; Staudenmayer, John W; John, Dinesh; Kent, Jane A; Freedson, Patty S
2016-05-01
The objective of this study is to compare activity type classification rates of machine learning algorithms trained on laboratory versus free-living accelerometer data in older adults. Thirty-five older adults (21 females and 14 males, 70.8 ± 4.9 yr) performed selected activities in the laboratory while wearing three ActiGraph GT3X+ activity monitors (on the dominant hip, wrist, and ankle; ActiGraph, LLC, Pensacola, FL). Monitors were initialized to collect raw acceleration data at a sampling rate of 80 Hz. Fifteen of the participants also wore the GT3X+ in free-living settings and were directly observed for 2-3 h. Time- and frequency-domain features from acceleration signals of each monitor were used to train random forest (RF) and support vector machine (SVM) models to classify five activity types: sedentary, standing, household, locomotion, and recreational activities. All algorithms were trained on laboratory data (RFLab and SVMLab) and free-living data (RFFL and SVMFL) using 20-s signal sampling windows. Classification accuracy rates of both types of algorithms were tested on free-living data using a leave-one-out technique. Overall classification accuracy rates for the algorithms developed from laboratory data ranged from 49% (wrist) to 55% (ankle) for the SVMLab algorithms and from 49% (wrist) to 54% (ankle) for the RFLab algorithms. The classification accuracy rates for SVMFL and RFFL algorithms ranged from 58% (wrist) to 69% (ankle) and from 61% (wrist) to 67% (ankle), respectively. Our algorithms developed on free-living accelerometer data were more accurate in classifying the activity type in free-living older adults than our algorithms developed on laboratory accelerometer data. Future studies should consider using free-living accelerometer data to train machine learning algorithms in older adults.
Detection of epileptic seizure in EEG signals using linear least squares preprocessing.
Roshan Zamir, Z
2016-09-01
An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be obstructed by detection of electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time-consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been based on artificial neural networks, genetic programming, and wavelet transforms. Although the highest achieved accuracy for classification is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigation into performance consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with the classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. Numerical results suggest that these models are robust and efficient for detecting epileptic seizure. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Situmorang, B. H.; Setiawan, M. P.; Tosida, E. T.
2017-01-01
Refractive errors are abnormalities in the refraction of light such that images do not focus precisely on the retina, resulting in blurred vision [1]. Patients with refractive errors must wear glasses or contact lenses to restore normal vision. The appropriate glasses or contact lenses differ from person to person, depending on patient age, tear production, vision prescription, and astigmatism. Because the eye is a vital organ, accuracy in determining which glasses or contact lenses to use is required. This research aims to develop a decision support system that recommends the right contact lenses for patients with refractive errors with 100% accuracy. The Iterative Dichotomiser 3 (ID3) classification method computes gain and entropy values for the attributes, which include sample code, patient age, astigmatism, tear production rate, vision prescription, and class, and these values determine the resulting decision tree. Evaluation by an eye specialist on the training data gave an accuracy rate of 96.7% and an error rate of 3.3%; evaluation using a confusion matrix gave an accuracy rate of 96.1% and an error rate of 3.1%; on the testing data, the accuracy rate was 100% and the error rate 0%.
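The entropy and gain values that ID3 relies on can be sketched as follows. This is a minimal illustration, not the authors' system: the four toy records and the attribute names (`tear_rate`, `astigmatic`, `lens`) are hypothetical and only mirror the kinds of attributes the abstract lists.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records, invented for illustration only.
records = [
    {"tear_rate": "reduced", "astigmatic": "no", "lens": "none"},
    {"tear_rate": "reduced", "astigmatic": "yes", "lens": "none"},
    {"tear_rate": "normal", "astigmatic": "no", "lens": "soft"},
    {"tear_rate": "normal", "astigmatic": "yes", "lens": "hard"},
]

# ID3 places the attribute with the largest gain at the root of the tree.
best = max(["tear_rate", "astigmatic"],
           key=lambda a: information_gain(records, a, "lens"))
```

ID3 then repeats the same selection recursively on each subset until the subsets are pure or no attributes remain.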
EEG-based classification of imaginary left and right foot movements using beta rebound.
Hashimoto, Yasunari; Ushiba, Junichi
2013-11-01
The purpose of this study was to investigate cortical lateralization of event-related (de)synchronization during left and right foot motor imagery tasks and to determine classification accuracy of the two imaginary movements in a brain-computer interface (BCI) paradigm. We recorded 31-channel scalp electroencephalograms (EEGs) from nine healthy subjects during brisk imagery tasks of left and right foot movements. EEG was analyzed with time-frequency maps and topographies, and the accuracy rate of classification between left and right foot movements was calculated. Beta rebound at the end of imagination (increase of EEG beta rhythm amplitude) was identified from the two EEGs derived from the right-shift and left-shift bipolar pairs at the vertex. This process enabled discrimination between right or left foot imagery at a high accuracy rate (maximum 81.6% in single trial analysis). These data suggest that foot motor imagery has potential to elicit left-right differences in EEG, while BCI using the unilateral foot imagery can achieve high classification accuracy, similar to ordinary BCI, based on hand motor imagery. By combining conventional discrimination techniques, the left-right discrimination of unilateral foot motor imagery provides a novel BCI system that could control a foot neuroprosthesis or a robotic foot. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.
Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin
2017-04-01
As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only way to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their ability to classify with high accuracy. The accuracy of classification algorithms depends on the use of appropriate feature selection algorithms to reduce the dimension of datasets. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. Two essential types of feature selection methods, namely wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, a classifier subset evaluator with a greedy stepwise search engine and a wrapper subset evaluator with the Best First search engine were used. In the filter approach, a correlation feature selection subset evaluator with a greedy stepwise search engine and a filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier using the filtered subset evaluator with the Best First search engine had a higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease than the other selected methods.
Retinal vasculature classification using novel multifractal features
NASA Astrophysics Data System (ADS)
Ding, Y.; Ward, W. O. C.; Duan, Jinming; Auer, D. P.; Gowland, Penny; Bai, L.
2015-11-01
Retinal blood vessels have been implicated in a large number of diseases including diabetic retinopathy and cardiovascular diseases, which cause damage to retinal blood vessels. The availability of retinal vessel imaging provides an excellent opportunity for monitoring and diagnosis of retinal diseases, and automatic analysis of retinal vessels will help with these processes. However, state-of-the-art vascular analysis methods such as counting the number of branches or measuring the curvature and diameter of individual vessels are unsuitable for the microvasculature. There has been published research using fractal analysis to calculate fractal dimensions of retinal blood vessels, but so far there has been no systematic research extracting discriminant features from retinal vessels for classification. This paper introduces new methods for feature extraction from multifractal spectra of retinal vessels for classification. Two publicly available retinal vascular image databases are used for the experiments, and the proposed methods have produced accuracies of 85.5% and 77% for classification of healthy and diabetic retinal vasculatures. Experiments show that classification with multiple fractal features produces better rates compared with methods using a single fractal dimension value. In addition, experiments show that classification accuracy can be affected by the accuracy of vessel segmentation algorithms.
[Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].
Zhou, Jinzhi; Tang, Xiaofang
2015-08-01
To improve classification accuracy with a small amount of motor imagery training data in the development of brain-computer interface (BCI) systems, we propose a method that automatically selects characteristic parameters based on correlation coefficient analysis. Using the five sample data sets of dataset IVa from the 2005 BCI Competition, we applied the short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the dimensionality of the raw electroencephalogram, then performed feature extraction based on common spatial pattern (CSP) and classified with linear discriminant analysis (LDA). Simulation results showed that the average classification accuracy was higher with the correlation coefficient feature selection method than without it. Compared with the support vector machine (SVM) feature optimization algorithm, correlation coefficient analysis selects better parameters and thereby improves classification accuracy.
An Extreme Learning Machine-Based Neuromorphic Tactile Sensing System for Texture Recognition.
Rasouli, Mahdi; Chen, Yi; Basu, Arindam; Kukreja, Sunil L; Thakor, Nitish V
2018-04-01
Despite significant advances in computational algorithms and development of tactile sensors, artificial tactile sensing is strikingly less efficient and capable than the human tactile perception. Inspired by efficiency of biological systems, we aim to develop a neuromorphic system for tactile pattern recognition. We particularly target texture recognition as it is one of the most necessary and challenging tasks for artificial sensory systems. Our system consists of a piezoresistive fabric material as the sensor to emulate skin, an interface that produces spike patterns to mimic neural signals from mechanoreceptors, and an extreme learning machine (ELM) chip to analyze spiking activity. Benefiting from intrinsic advantages of biologically inspired event-driven systems and massively parallel and energy-efficient processing capabilities of the ELM chip, the proposed architecture offers a fast and energy-efficient alternative for processing tactile information. Moreover, it provides the opportunity for the development of low-cost tactile modules for large-area applications by integration of sensors and processing circuits. We demonstrate the recognition capability of our system in a texture discrimination task, where it achieves a classification accuracy of 92% for categorization of ten graded textures. Our results confirm that there exists a tradeoff between response time and classification accuracy (and information transfer rate). A faster decision can be achieved at early time steps or by using a shorter time window. This, however, results in deterioration of the classification accuracy and information transfer rate. We further observe that there exists a tradeoff between the classification accuracy and the input spike rate (and thus energy consumption). Our work substantiates the importance of development of efficient sparse codes for encoding sensory data to improve the energy efficiency. 
These results are significant for a wide range of wearable, robotic, prosthetic, and industrial applications.
Acosta-Mesa, Héctor-Gabriel; Rechy-Ramírez, Fernando; Mezura-Montes, Efrén; Cruz-Ramírez, Nicandro; Hernández Jiménez, Rodolfo
2014-06-01
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals in which the length and amplitude of the time series should be compressed, preserving the important information for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function which considers three criteria: the entropy regarding the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated using time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known time series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
Statistical sensor fusion of ECG data using automotive-grade sensors
NASA Astrophysics Data System (ADS)
Koenig, A.; Rehg, T.; Rasshofer, R.
2015-11-01
Driver states such as fatigue, stress, aggression, distraction or even medical emergencies continue to lead to severe driving mistakes and promote accidents. A pathway towards improving driver state assessment can be found in psycho-physiological measures that directly quantify the driver's state from physiological recordings. Although heart rate is a well-established physiological variable that reflects cognitive stress, obtaining heart rate reliably and without contact is a challenging task in an automotive environment. Our aim was to investigate how sensor fusion of two automotive-grade sensors would influence the accuracy of automatic classification of cognitive stress levels. We induced cognitive stress in subjects and estimated stress levels from their heart rate signals, acquired from automotive-ready ECG sensors. Using signal quality indices and Kalman filters, we were able to decrease the Root Mean Squared Error (RMSE) of heart rate recordings by 10 beats per minute. We then trained a neural network to classify the cognitive workload state of subjects from heart rate and compared classification performance for ground truth, the individual sensors and the fused heart rate signal. Fusing the signals gave 5% more correct classifications than the individual sensors, staying only 4% below the maximally possible classification accuracy from ground truth. These results are a first step towards real-world applications of psycho-physiological measurements in vehicle settings. Future implementations of driver state modeling will be able to draw from a larger pool of data sources, such as additional physiological values or vehicle-related data, which can be expected to drive classification accuracy to significantly higher values.
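The abstract does not describe the filter design, so the following is only a minimal sketch of how two heart-rate streams could be fused with a scalar Kalman filter. It assumes a random-walk model for heart rate and fixed measurement variances `r1` and `r2`; all names and default values are hypothetical. A signal quality index could, for instance, be used to inflate a sensor's variance when its signal is poor, which is not shown here.

```python
def fuse_heart_rate(z1, z2, r1, r2, q=1.0, x0=70.0, p0=100.0):
    """Fuse two heart-rate series (bpm) with a scalar random-walk Kalman filter.

    z1, z2: equally long measurement sequences from the two sensors.
    r1, r2: assumed measurement noise variances of the two sensors.
    q:      assumed process noise variance per step.
    x0, p0: initial state estimate and its variance.
    """
    x, p = x0, p0
    fused = []
    for a, b in zip(z1, z2):
        # Predict: the state is carried over, uncertainty grows by q.
        p += q
        # Update sequentially with each sensor's measurement.
        for z, r in ((a, r1), (b, r2)):
            k = p / (p + r)      # Kalman gain
            x += k * (z - x)     # correct the estimate towards the measurement
            p *= (1.0 - k)       # shrink the estimate variance
        fused.append(x)
    return fused
```

With a smaller variance assigned to the more trustworthy sensor, its measurements pull the fused estimate more strongly.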
De Nunzio, Cosimo; Pastore, Antonio Luigi; Lombardo, Riccardo; Simone, Giuseppe; Leonardo, Costantino; Mastroianni, Riccardo; Collura, Devis; Muto, Giovanni; Gallucci, Michele; Carbone, Antonio; Fuschi, Andrea; Dutto, Lorenzo; Witt, Joern Heinrich; De Dominicis, Carlo; Tubaro, Andrea
2018-06-01
To evaluate the differences between the old and the new Gleason score classification systems in upgrading and downgrading rates. Between 2012 and 2015, we identified 9703 patients treated with retropubic radical prostatectomy (RP) in four tertiary centers. Biopsy specimens as well as radical prostatectomy specimens were graded according to both the 2005 Gleason and the 2014 ISUP five-tier Gleason grading system (five-tier GG system). Upgrading and downgrading rates on radical prostatectomy were first recorded for both classifications and then compared. The accuracy of the biopsy for each histological classification was determined by using the kappa coefficient of agreement and by assessing sensitivity, specificity, positive and negative predictive value. The five-tier GG system presented a lower clinically significant upgrading rate (1895/9703: 19.5% vs 2332/9703: 24.0%; p = .001) and a similar clinically significant downgrading rate (756/9703: 7.7% vs 779/9703: 8%; p = .267) when compared to the 2005 ISUP classification. When evaluating their accuracy, the new five-tier GG system presented a better specificity (91% vs 83%) and a better negative predictive value (78% vs 60%). The kappa-statistics measures of agreement between needle biopsy and radical prostatectomy specimens were poor and good respectively for the five-tier GG system and for the 2005 Gleason score (k = 0.360 ± 0.007 vs k = 0.426 ± 0.007). The new Epstein classification significantly reduces upgrading events. The implementation of this new classification could better define prostate cancer aggressiveness with important clinical implications, particularly in prostate cancer management. Copyright © 2018 Elsevier Ltd, BASO ~ The Association for Cancer Surgery, and the European Society of Surgical Oncology. All rights reserved.
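The agreement and accuracy measures named in this abstract can be computed from a cross-tabulation of biopsy grade against prostatectomy grade. The sketch below assumes the unweighted Cohen's kappa (the abstract does not say whether a weighted variant was used), and the function names are illustrative.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square agreement table.

    table[i][j] = number of cases graded i on biopsy and j on prostatectomy.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1.0 - expected)


def binary_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

For a 2x2 table laid out as `[[tp, fp], [fn, tn]]`, both functions can be applied to the same counts.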
Classification of weld defect based on information fusion technology for radiographic testing system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Hongquan; Liang, Zeming, E-mail: heavenlzm@126.com; Gao, Jianmin
Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster–Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
Jiang, Hongquan; Liang, Zeming; Gao, Jianmin; Dang, Changying
2016-03-01
Improving the efficiency and accuracy of weld defect classification is an important technical problem in developing the radiographic testing system. This paper proposes a novel weld defect classification method based on information fusion technology, Dempster-Shafer evidence theory. First, to characterize weld defects and improve the accuracy of their classification, 11 weld defect features were defined based on the sub-pixel level edges of radiographic images, four of which are presented for the first time in this paper. Second, we applied information fusion technology to combine different features for weld defect classification, including a mass function defined based on the weld defect feature information and the quartile-method-based calculation of standard weld defect class which is to solve a sample problem involving a limited number of training samples. A steam turbine weld defect classification case study is also presented herein to illustrate our technique. The results show that the proposed method can increase the correct classification rate with limited training samples and address the uncertainties associated with weld defect classification.
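Dempster-Shafer evidence theory fuses mass functions with Dempster's rule of combination. The sketch below shows that rule only; the two defect classes and the mass values are hypothetical, and the paper's own mass function built from weld defect features is not reproduced here.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical evidence from two features about two defect classes.
CRACK = frozenset({"crack"})
PORE = frozenset({"porosity"})
EITHER = CRACK | PORE  # ignorance: mass not committed to a single class

m_feature_a = {CRACK: 0.6, PORE: 0.1, EITHER: 0.3}
m_feature_b = {CRACK: 0.5, PORE: 0.3, EITHER: 0.2}
fused = dempster_combine(m_feature_a, m_feature_b)
```

Combining more than two features amounts to applying the same function repeatedly to the running result.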
Detection of stress factors in crop and weed species using hyperspectral remote sensing reflectance
NASA Astrophysics Data System (ADS)
Henry, William Brien
The primary objective of this work was to determine if stress factors such as moisture stress or herbicide injury stress limit the ability to distinguish between weeds and crops using remotely sensed data. Additional objectives included using hyperspectral reflectance data to measure moisture content within a species, and to measure crop injury in response to drift rates of non-selective herbicides. Moisture stress did not reduce the ability to discriminate between species. Regardless of analysis technique, the trend was that as moisture stress increased, so too did the ability to distinguish between species. Signature amplitudes (SA) of the top 5 bands, discrete wavelet transforms (DWT), and multiple indices were promising analysis techniques. Discriminant models created from one year's data set and validated on additional data sets provided, on average, approximately 80% accurate classification among weeds and crop. This suggests that these models are relatively robust and could potentially be used across environmental conditions in field scenarios. Distinguishing between leaves grown at high-moisture stress and no-stress was met with limited success, primarily because there was substantial variation among samples within the treatments. Leaf water potential (LWP) was measured, and these were classified into three categories using indices. Classification accuracies were as high as 68%. The 10 bands most highly correlated to LWP were selected; however, there were no obvious trends or patterns in these top 10 bands with respect to time, species or moisture level, suggesting that LWP is an elusive parameter to quantify spectrally. In order to address herbicide injury stress and its impact on species discrimination, discriminant models were created from combinations of multiple indices. 
The model created from the second experimental run's data set and validated on the first experimental run's data provided an average of 97% correct classification of soybean and an overall average classification accuracy of 65% for all species. This suggests that these models are relatively robust and could potentially be used across a wide range of herbicide applications in field scenarios. From the pooled data set, a single discriminant model was created with multiple indices that discriminated soybean from weeds 88%, on average, regardless of herbicide, rate or species. Several analysis techniques including multiple indices, signature amplitude with spectral bands as features, and wavelet analysis were employed to distinguish between herbicide-treated and nontreated plants. Classification accuracy using signature amplitude (SA) analysis of paraquat injury on soybean was better than 75% for both 1/2 and 1/8X rates at 1, 4, and 7 DAA. Classification accuracy of paraquat injury on corn was better than 72% for the 1/2X rate at 1, 4, and 7 DAA. These data suggest that hyperspectral reflectance may be used to distinguish between healthy plants and injured plants to which herbicides have been applied; however, the classification accuracies remained at 75% or higher only when the higher rates of herbicide were applied. (Abstract shortened by UMI.)
Improved classification accuracy by feature extraction using genetic algorithms
NASA Astrophysics Data System (ADS)
Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.
2003-05-01
A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.
Rajasekaran, S; Bhushan, Manindra; Aiyer, Siddharth; Kanna, Rishi; Shetty, Ajoy Prasad
2018-01-09
To develop a classification based on the technical complexity encountered during pedicle screw insertion and to evaluate the performance of AIRO ® CT navigation system based on this classification, in the clinical scenario of complex spinal deformity. 31 complex spinal deformity correction surgeries were prospectively analyzed for performance of AIRO ® mobile CT-based navigation system. Pedicles were classified according to complexity of insertion into five types. Analysis was performed to estimate the accuracy of screw placement and time for screw insertion. Breach greater than 2 mm was considered for analysis. 452 pedicle screws were inserted (T1-T6: 116; T7-T12: 171; L1-S1: 165). The average Cobb angle was 68.3° (range 60°-104°). We had 242 grade 2 pedicles, 133 grade 3, and 77 grade 4, and 44 pedicles were unfit for pedicle screw insertion. We noted 27 pedicle screw breaches (medial: 10; lateral: 16; anterior: 1). Among the lateral breaches (n = 16), ten screws were planned for in-out-in pedicle screw insertion. Average screw insertion time was 1.76 ± 0.89 min. After accounting for planned breach, the effective breach rate was 3.8% resulting in 96.2% accuracy for pedicle screw placement. This classification helps compare the accuracy of screw insertion across a range of conditions by considering the complexity of screw insertion. Considering the clinical scenario of complex pedicle anatomy in spinal deformity AIRO ® navigation showed an excellent accuracy rate of 96.2%.
Learning to Classify with Possible Sensor Failures
2014-05-04
SVMs), have demonstrated good classification performance when the training data is representative of the test data [1, 2, 3]. However, in many real...Detection of people and animals using non-imaging sensors,” Information Fusion (FUSION), 2011 Proceedings of the 14th International Conference on, pp...classification methods in terms of both classification accuracy and anomaly detection rate using
Zhang, Fan; Zhang, Xinhong
2011-01-01
Most classification, quality evaluation and grading of flue-cured tobacco leaves is done manually; it relies on the judgment and experience of experts and is inevitably limited by personal, physical and environmental factors. The classification and the quality evaluation are therefore subjective and experientially based. In this paper, an automatic classification method of tobacco leaves based on the digital image processing and the fuzzy sets theory is presented. A grading system based on image processing techniques was developed for automatically inspecting and grading flue-cured tobacco leaves. This system uses machine vision for the extraction and analysis of color, size, shape and surface texture. Fuzzy comprehensive evaluation provides a high level of confidence in decision making based on the fuzzy logic. The neural network is used to estimate and forecast the membership function of the features of tobacco leaves in the fuzzy sets. The experimental results of the two-level fuzzy comprehensive evaluation (FCE) show that the accuracy rate of classification is about 94% for the trained tobacco leaves, and the accuracy rate of the non-trained tobacco leaves is about 72%. We believe that the fuzzy comprehensive evaluation is a viable way for the automatic classification and quality evaluation of the tobacco leaves. PMID:22163744
A Spiking Neural Network in sEMG Feature Extraction.
Lobov, Sergey; Mironov, Vasiliy; Kastalskiy, Innokentiy; Kazantsev, Victor
2015-11-03
We have developed a novel algorithm for sEMG feature extraction and classification. It is based on a hybrid network composed of spiking and artificial neurons. The spiking neuron layer with mutual inhibition was assigned as feature extractor. We demonstrate that the classification accuracy of the proposed model could reach high values comparable with existing sEMG interface systems. Moreover, the sensitivity of the algorithm to the characteristics of different sEMG acquisition systems was estimated. Results showed roughly equal accuracy, despite a significant sampling rate difference. The proposed algorithm was successfully tested for mobile robot control.
Identifying Wrist Fracture Patients with High Accuracy by Automatic Categorization of X-ray Reports
de Bruijn, Berry; Cranney, Ann; O’Donnell, Siobhan; Martin, Joel D.; Forster, Alan J.
2006-01-01
The authors performed this study to determine the accuracy of several text classification methods to categorize wrist x-ray reports. We randomly sampled 751 textual wrist x-ray reports. Two expert reviewers rated the presence (n = 301) or absence (n = 450) of an acute fracture of wrist. We developed two information retrieval (IR) text classification methods and a machine learning method using a support vector machine (TC-1). In cross-validation on the derivation set (n = 493), TC-1 outperformed the two IR based methods and six benchmark classifiers, including Naive Bayes and a Neural Network. In the validation set (n = 258), TC-1 demonstrated consistent performance with 93.8% accuracy; 95.5% sensitivity; 92.9% specificity; and 87.5% positive predictive value. TC-1 was easy to implement and superior in performance to the other classification methods. PMID:16929046
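The four figures reported for TC-1 all derive from a single 2 × 2 confusion matrix. The sketch below shows the arithmetic. The counts are an assumption, not values given in the abstract: they were chosen because they sum to the validation set size of 258 and reproduce the reported percentages after rounding.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity and PPV from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct calls / all reports
        "sensitivity": tp / (tp + fn),      # fractures correctly flagged
        "specificity": tn / (tn + fp),      # non-fractures correctly cleared
        "ppv": tp / (tp + fp),              # flagged reports that are true fractures
    }


# Hypothetical counts (assumption): 88 fracture and 170 non-fracture reports.
metrics = binary_metrics(tp=84, fn=4, tn=158, fp=12)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
print(percentages)
```

Other count combinations may also be consistent with the rounded percentages, so this should be read as an illustration of the definitions rather than a reconstruction of the study data.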
Tuberculosis disease diagnosis using artificial immune recognition system.
Shamshirband, Shahaboddin; Hessam, Somayeh; Javidnia, Hossein; Amiribesheli, Mohsen; Vahdat, Shaghayegh; Petković, Dalibor; Gani, Abdullah; Kiah, Miss Laiha Mat
2014-01-01
There is a high risk of tuberculosis (TB) disease diagnosis among conventional methods. This study is aimed at diagnosing TB using hybrid machine learning approaches. Patient epicrisis reports obtained from the Pasteur Laboratory in the north of Iran were used. All 175 samples have twenty features. The features are classified based on incorporating a fuzzy logic controller and artificial immune recognition system. The features are normalized through a fuzzy rule based on a labeling system. The labeled features are categorized into normal and tuberculosis classes using the Artificial Immune Recognition Algorithm. Overall, the highest classification accuracy reached was for the 0.8 learning rate (α) values. The artificial immune recognition system (AIRS) classification approaches using fuzzy logic also yielded better diagnosis results in terms of detection accuracy compared to other empirical methods. Classification accuracy was 99.14%, sensitivity 87.00%, and specificity 86.12%.
NASA Astrophysics Data System (ADS)
Wu, Jie; Besnehard, Quentin; Marchessoux, Cédric
2011-03-01
Clinical studies for the validation of new medical imaging devices require hundreds of images. An important step in creating and tuning the study protocol is the classification of images into "difficult" and "easy" cases. This consists of classifying the image based on features like the complexity of the background, the visibility of the disease (lesions). Therefore, an automatic medical background classification tool for mammograms would help for such clinical studies. This classification tool is based on a multi-content analysis framework (MCA) which was first developed to recognize image content of computer screen shots. With the implementation of new texture features and a defined breast density scale, the MCA framework is able to automatically classify digital mammograms with a satisfying accuracy. BI-RADS (Breast Imaging Reporting Data System) density scale is used for grouping the mammograms, which standardizes the mammography reporting terminology and assessment and recommendation categories. Selected features are input into a decision tree classification scheme in MCA framework, which is the so-called "weak classifier" (any classifier with a global error rate below 50%). With the AdaBoost iteration algorithm, these "weak classifiers" are combined into a "strong classifier" (a classifier with a low global error rate) for classifying one category. The results of classification for one "strong classifier" show good accuracy with high true positive rates. For the four categories the results are: TP = 90.38%, TN = 67.88%, FP = 32.12% and FN = 9.62%.
An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices
ERIC Educational Resources Information Center
Wyse, Adam E.; Hao, Shiqi
2012-01-01
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Sex estimation standards for medieval and contemporary Croats
Bašić, Željana; Kružić, Ivana; Jerković, Ivan; Anđelinović, Deny; Anđelinović, Šimun
2017-01-01
Aim To develop discriminant functions for sex estimation on medieval Croatian population and test their application on contemporary Croatian population. Methods From a total of 519 skeletons, we chose 84 adult excellently preserved skeletons free of antemortem and postmortem changes and took all standard measurements. Sex was estimated/determined using standard anthropological procedures and ancient DNA (amelogenin analysis) where pelvis was insufficiently preserved or where sex morphological indicators were not consistent. We explored which measurements showed sexual dimorphism and used them for developing univariate and multivariate discriminant functions for sex estimation. We included only those functions that reached accuracy rate ≥80%. We tested the applicability of developed functions on modern Croatian sample (n = 37). Results From 69 standard skeletal measurements used in this study, 56 of them showed statistically significant sexual dimorphism (74.7%). We developed five univariate discriminant functions with classification rate 80.6%-85.2% and seven multivariate discriminant functions with an accuracy rate of 81.8%-93.0%. When tested on the modern population functions showed classification rates 74.1%-100%, and ten of them reached aimed accuracy rate. Females showed higher classification rates in the medieval populations, whereas males were better classified in the modern populations. Conclusion Developed discriminant functions are sufficiently accurate for reliable sex estimation in both medieval Croatian population and modern Croatian samples and may be used in forensic settings. The methodological issues that emerged regarding the importance of considering external factors in development and application of discriminant functions for sex estimation should be further explored. PMID:28613039
Research on aviation unsafe incidents classification with improved TF-IDF algorithm
NASA Astrophysics Data System (ADS)
Wang, Yanhua; Zhang, Zhiyuan; Huo, Weigang
2016-05-01
The text content of Aviation Safety Confidential Reports contains a large amount of valuable information. The term frequency-inverse document frequency (TF-IDF) algorithm is commonly used in text analysis, but it does not take into account the sequential relationship of the words in the text and its role in semantic expression. To address these shortcomings, and following the seven category labels of civil aviation unsafe incidents, this paper improves the TF-IDF algorithm using a co-occurrence network and establishes feature word extraction and word sequential relations for the classified incidents. An aviation domain lexicon was used to improve the classification accuracy rate. A feature word network model was designed for multi-document unsafe incident classification and was used in the experiment. Finally, the classification accuracy of the improved algorithm was verified experimentally.
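For orientation, the baseline that the paper sets out to improve can be sketched in a few lines. This is the textbook TF-IDF weighting only, assuming pre-tokenized documents; it does not include the co-occurrence network or domain lexicon described in the abstract, and the function name and sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical incident reports, already tokenized.
reports = [
    ["engine", "failure", "runway"],
    ["runway", "incursion"],
    ["engine", "fire", "engine"],
]
weights = tf_idf(reports)
```

Because each term is weighted independently, word order has no effect on the result, which is the limitation the abstract points to.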
Gómez-Valdés, Jorge A; Menéndez Garmendia, Antinea; García-Barzola, Lizbeth; Sánchez-Mejorada, Gabriela; Karam, Carlos; Baraybar, José Pablo; Klales, Alexandra
2017-03-01
The aim of this study was to test the accuracy of the Klales et al. (2012) equation for sex estimation in a contemporary Mexican population. Our investigation was carried out on a sample of 203 left innominates of identified adult skeletons from the UNAM-Collection and the Santa María Xigui Cemetery, in Central Mexico. The original Klales equation produces a sex bias in sex estimation against males (86-92% accuracy versus 100% accuracy in females). Based on these results, the Klales et al. (2012) method was recalibrated with a new cut-off point for sex estimation in contemporary Mexican populations. The results show cross-validated classification accuracy rates as high as 100% after recalibrating the original logistic regression equation. Recalibration improved classification accuracy and eliminated sex bias. This new formula will improve sex estimation for Mexican contemporary populations. © 2017 Wiley Periodicals, Inc.
Targeting an efficient target-to-target interval for P300 speller brain–computer interfaces
Sellers, Eric W.; Wang, Xingyu
2013-01-01
Longer target-to-target intervals (TTI) produce greater P300 event-related potential amplitude, which can increase brain–computer interface (BCI) classification accuracy and decrease the number of flashes needed for accurate character classification. However, longer TTIs require more time for each trial, which will decrease the information transfer rate of BCI. In this paper, a P300 BCI using a 7 × 12 matrix explored new flash patterns (16-, 18- and 21-flash pattern) with different TTIs to assess the effects of TTI on P300 BCI performance. The new flash patterns were designed to minimize TTI, decrease repetition blindness, and examine the temporal relationship between each flash of a given stimulus by placing a minimum of one (16-flash pattern), two (18-flash pattern), or three (21-flash pattern) non-target flashes between target flashes. Online results showed that the 16-flash pattern yielded the lowest classification accuracy among the three patterns. The results also showed that the 18-flash pattern provides a significantly higher information transfer rate (ITR) than the 21-flash pattern; both patterns provide high ITR and high accuracy for all subjects. PMID:22350331
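The trade-off between accuracy and trial time can be made concrete with the Wolpaw definition of information transfer rate, which is widely used in BCI work. The abstract does not say which ITR definition the authors used, so the formula here is an assumption, as are the function names; the 84 choices correspond to the 7 × 12 matrix.

```python
import math


def bits_per_selection(n_choices, accuracy):
    """Wolpaw bits per selection for n equally likely choices and accuracy P."""
    if n_choices < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_choices >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_choices)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        # Errors are assumed to be spread evenly over the other choices.
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_choices - 1))
    return bits


def itr_bits_per_minute(n_choices, accuracy, seconds_per_selection):
    """Scale bits per selection by the number of selections made per minute."""
    return bits_per_selection(n_choices, accuracy) * 60.0 / seconds_per_selection


perfect = bits_per_selection(84, 1.0)   # upper bound for a 7 x 12 matrix
chance = bits_per_selection(84, 1 / 84)  # guessing carries no information
```

Under this definition a pattern with more flashes per selection has to gain enough accuracy to offset the longer `seconds_per_selection`, which is the comparison the abstract reports between the 18- and 21-flash patterns.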
An embedded implementation based on adaptive filter bank for brain-computer interface systems.
Belwafi, Kais; Romain, Olivier; Gannouni, Sofien; Ghaffari, Fakhreddine; Djemal, Ridha; Ouni, Bouraoui
2018-07-15
Brain-computer interface (BCI) is a new communication pathway for users with neurological deficiencies. The implementation of a BCI system requires complex electroencephalography (EEG) signal processing including filtering, feature extraction and classification algorithms. Most current BCI systems are implemented on personal computers. Therefore, there is a great interest in implementing BCI on embedded platforms to meet system specifications in terms of time response, cost effectiveness, power consumption, and accuracy. This article presents an embedded-BCI (EBCI) system based on a Stratix-IV field programmable gate array. The proposed system relies on the weighted overlap-add (WOLA) algorithm to perform dynamic filtering of EEG-signals by analyzing the event-related desynchronization/synchronization (ERD/ERS). The EEG-signals are classified, using the linear discriminant analysis algorithm, based on their spatial features. The proposed system performs fast classification within a time delay of 0.430 s/trial, achieving an average accuracy of 76.80% according to an offline approach and 80.25% using our own recording. The estimated power consumption of the prototype is approximately 0.7 W. Results show that the proposed EBCI system reduces the overall classification error rate for the three datasets of the BCI-competition by 5% compared to other similar implementations. Moreover, experiments show that the proposed system maintains a high accuracy rate with a short processing time, a low power consumption, and a low cost. Performing dynamic filtering of EEG-signals using WOLA increases the recognition rate of ERD/ERS patterns of motor imagery brain activity. This approach allows the development of a complete prototype of an EBCI system that achieves excellent accuracy rates. Copyright © 2018 Elsevier B.V. All rights reserved.
Development and Psychometric Evaluation of the Brief Adolescent Gambling Screen (BAGS)
Stinchfield, Randy; Wynne, Harold; Wiebe, Jamie; Tremblay, Joel
2017-01-01
The purpose of this study was to develop and evaluate the initial reliability, validity and classification accuracy of a new brief screen for adolescent problem gambling. The three-item Brief Adolescent Gambling Screen (BAGS) was derived from the nine-item Gambling Problem Severity Subscale (GPSS) of the Canadian Adolescent Gambling Inventory (CAGI) using a secondary analysis of existing CAGI data. The sample of 105 adolescents included 49 females and 56 males from Canada who completed the CAGI, a self-administered measure of DSM-IV diagnostic criteria for Pathological Gambling, and a clinician-administered diagnostic interview including the DSM-IV diagnostic criteria for Pathological Gambling (both of which were adapted to yield DSM-5 Gambling Disorder diagnosis). A stepwise multivariate discriminant function analysis selected three GPSS items as the best predictors of a diagnosis of Gambling Disorder. The BAGS demonstrated satisfactory estimates of reliability, validity and classification accuracy and was equivalent to the nine-item GPSS of the CAGI and the BAGS was more accurate than the SOGS-RA. The BAGS estimates of classification accuracy include hit rate = 0.95, sensitivity = 0.88, specificity = 0.98, false positive rate = 0.02, and false negative rate = 0.12. Since these classification estimates are preliminary, derived from a relatively small sample size, and based upon the same sample from which the items were selected, it will be important to cross-validate the BAGS with larger and more diverse samples. The BAGS should be evaluated for use as a screening tool in both clinical and school settings as well as epidemiological surveys. PMID:29312064
Orhan, Umut; Erdogmus, Deniz; Roark, Brian; Purwar, Shalini; Hild, Kenneth E.; Oken, Barry; Nezamfar, Hooman; Fried-Oken, Melanie
2013-01-01
Event related potentials (ERP) corresponding to a stimulus in electroencephalography (EEG) can be used to detect the intent of a person for brain computer interfaces (BCI). This paradigm is widely utilized to build letter-by-letter text input systems using BCI. Nevertheless using a BCI-typewriter depending only on EEG responses will not be sufficiently accurate for single-trial operation in general, and existing systems utilize many-trial schemes to achieve accuracy at the cost of speed. Hence incorporation of a language model based prior or additional evidence is vital to improve accuracy and speed. In this paper, we study the effects of Bayesian fusion of an n-gram language model with a regularized discriminant analysis ERP detector for EEG-based BCIs. The letter classification accuracies are rigorously evaluated for varying language model orders as well as number of ERP-inducing trials. The results demonstrate that the language models contribute significantly to letter classification accuracy. Specifically, we find that a BCI-speller supported by a 4-gram language model may achieve the same performance using 3-trial ERP classification for the initial letters of the words and using single trial ERP classification for the subsequent ones. Overall, fusion of evidence from EEG and language models yields a significant opportunity to increase the word rate of a BCI based typing system. PMID:22255652
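The fusion step described above amounts to Bayes' rule: the language model supplies a prior over the next letter, the ERP detector supplies a likelihood, and their normalized product is the posterior. The sketch below illustrates only that rule. The function name and all numbers are hypothetical, and the abstract's regularized discriminant analysis detector and n-gram model are replaced by hand-written scores.

```python
def fuse(prior, likelihood):
    """Posterior over letters: prior(letter) * likelihood(letter), renormalized."""
    unnormalized = {
        letter: prior[letter] * likelihood.get(letter, 0.0) for letter in prior
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no letter has both non-zero prior and likelihood")
    return {letter: value / total for letter, value in unnormalized.items()}


# Hypothetical values: the language model favours "e", the EEG evidence favours "a".
lm_prior = {"e": 0.6, "a": 0.3, "q": 0.1}
eeg_likelihood = {"e": 0.2, "a": 0.5, "q": 0.3}
posterior = fuse(lm_prior, eeg_likelihood)
best_letter = max(posterior, key=posterior.get)
```

When the prior is sharp, as it tends to be after the first letters of a word, fewer EEG trials are needed to reach a confident posterior, which matches the abstract's finding that later letters can be classified from a single trial.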
NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization could achieve a classification accuracy of 88.16 % in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performs better than SVM and ANN, and it is more capable of handling multidimensional feature variables. The RF method combined with object-based analysis approach could improve the classification accuracy further. The multiresolution segmentation approach on the basis of ESP scale parameter optimization was used for obtaining six scales to execute image segmentation; when the segmentation scale was 49, the classification accuracy reached the highest value of 89.58 %. The classification accuracy of object-based RF classification was 1.42 percentage points higher than that of pixel-based classification (88.16 %). Therefore, the RF classification method combined with object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and theoretical reference for remotely sensed monitoring land reclamation.
Integrating human and machine intelligence in galaxy morphology classification tasks
NASA Astrophysics Data System (ADS)
Beck, Melanie R.; Scarlata, Claudia; Fortson, Lucy F.; Lintott, Chris J.; Simmons, B. D.; Galloway, Melanie A.; Willett, Kyle W.; Dickinson, Hugh; Masters, Karen L.; Marshall, Philip J.; Wright, Darryl
2018-06-01
Quantifying galaxy morphology is a challenging yet scientifically rewarding task. As the scale of data continues to increase with upcoming surveys, traditional classification methods will struggle to handle the load. We present a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme, we increase the classification rate nearly 5-fold classifying 226 124 galaxies in 92 d of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7 per cent accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides at least a factor of 8 increase in the classification rate, classifying 210 803 galaxies in just 32 d of GZ2 project time with 93.1 per cent accuracy. As the Random Forest algorithm requires a minimal amount of computational cost, this result has important implications for galaxy morphology identification tasks in the era of Euclid and other large-scale surveys.
NASA Astrophysics Data System (ADS)
Erener, A.
2013-04-01
Automatic extraction of urban features from high resolution satellite images is one of the main applications in remote sensing. It is useful for wide scale applications, namely: urban planning, urban mapping, disaster management, GIS (geographic information systems) updating, and military target detection. One common approach to detecting urban features from high resolution images is to use automatic classification methods. This paper has four main objectives with respect to detecting buildings. The first objective is to compare the performance of the most notable supervised classification algorithms, including the maximum likelihood classifier (MLC) and the support vector machine (SVM). In this experiment the primary consideration is the impact of kernel configuration on the performance of the SVM. The second objective of the study is to explore the suitability of integrating additional bands, namely first principal component (1st PC) and the intensity image, for original data for multi classification approaches. The performance evaluation of classification results is done using two different accuracy assessment methods: pixel based and object based approaches, which reflect the third aim of the study. The objective here is to demonstrate the differences in the evaluation of accuracies of classification methods. Considering consistency, the same set of ground truth data which is produced by labeling the building boundaries in the GIS environment is used for accuracy assessment. Lastly, the fourth aim is to experimentally evaluate variation in the accuracy of classifiers for six different real situations in order to identify the impact of spatial and spectral diversity on results. The method is applied to Quickbird images for various urban complexity levels, extending from simple to complex urban patterns. The simple surface type includes a regular urban area with low density and systematic buildings with brick rooftops. 
The complex surface type involves almost all kinds of challenges, such as highly dense built-up areas, regions with bare soil, and small and large buildings with different rooftops, such as concrete, brick, and metal. Using the pixel based accuracy assessment it was shown that the percent building detection (PBD) and quality percent (QP) of the MLC and SVM depend on the complexity and texture variation of the region. Generally, PBD values range between 70% and 90% for the MLC and SVM, respectively. No substantial improvements were observed when the SVM and MLC classifications were developed by the addition of more variables, instead of the use of only four bands. In the evaluation of object based accuracy assessment, it was demonstrated that while MLC and SVM provide higher rates of correct detection, they also provide higher rates of false alarms.
Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai
2017-12-26
Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.
Optical signal processing using photonic reservoir computing
NASA Astrophysics Data System (ADS)
Salehi, Mohammad Reza; Dehyadegari, Louiza
2014-10-01
As a new approach to recognition and classification problems, photonic reservoir computing has such advantages as parallel information processing, power efficiency and high speed. In this paper, a photonic structure has been proposed for reservoir computing which is investigated using a simple, yet, non-partial noisy time series prediction task. This study includes the application of a suitable topology with self-feedbacks in a network of SOAs - which lends the system a strong memory - and leads to adjusting adequate parameters resulting in perfect recognition accuracy (100%) for noise-free time series, which shows a 3% improvement over previous results. For the classification of noisy time series, the rate of accuracy showed a 4% increase and amounted to 96%. Furthermore, an analytical approach was suggested to solve rate equations which led to a substantial decrease in the simulation time, which is an important parameter in classification of large signals such as speech recognition, and better results came up compared with previous works.
NASA Astrophysics Data System (ADS)
Selwyn, Ebenezer Juliet; Florinabel, D. Jemi
2018-04-01
Compound image segmentation plays a vital role in the compression of computer screen images. Computer screen images are images which are mixed with textual, graphical, or pictorial contents. In this paper, we present a comparison of two transform based block classification of compound images based on metrics like speed of classification, precision and recall rate. Block based classification approaches normally divide the compound images into fixed-size, non-overlapping blocks. Then a frequency transform such as the Discrete Cosine Transform (DCT) or Discrete Wavelet Transform (DWT) is applied over each block. Mean and standard deviation are computed for each 8 × 8 block and are used as the feature set to classify the compound images into text/graphics and picture/background blocks. The classification accuracy of block classification based segmentation techniques is measured by evaluation metrics like precision and recall rate. Compound images of smooth background and complex background images containing text of varying size, colour and orientation are considered for testing. Experimental evidence shows that the DWT based segmentation improves recall rate and precision rate by approximately 2.3% over DCT based segmentation, at the cost of an increase in block classification time, for both smooth and complex background images.
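The per-block feature step can be sketched as follows. This is a minimal illustration that assumes the transform coefficients are already available as a 2-D list; the DCT or DWT itself and the classifier that consumes the features are omitted, and the function name is hypothetical. The population standard deviation is used here, which the abstract does not specify.

```python
from statistics import fmean, pstdev


def block_features(coeffs, block=8):
    """Return (row, col, mean, std) for each non-overlapping block x block tile."""
    rows, cols = len(coeffs), len(coeffs[0])
    features = []
    # Step by the block size so tiles never overlap; partial edge tiles are skipped.
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            values = [
                coeffs[i][j]
                for i in range(r, r + block)
                for j in range(c, c + block)
            ]
            features.append((r, c, fmean(values), pstdev(values)))
    return features


# Hypothetical 16 x 16 input with a flat value, giving four identical tiles.
flat = [[5.0] * 16 for _ in range(16)]
tiles = block_features(flat)
```

A flat tile has zero standard deviation, while a tile containing text edges has a large one, which is why this pair of statistics can separate text/graphics blocks from picture/background blocks.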
Electromyogram whitening for improved classification accuracy in upper limb prosthesis control.
Liu, Lukai; Liu, Pu; Clancy, Edward A; Scheme, Erik; Englehart
2013-09-01
Time and frequency domain features of the surface electromyogram (EMG) signal acquired from multiple channels have frequently been investigated for use in controlling upper-limb prostheses. A common control method is EMG-based motion classification. We propose the use of EMG signal whitening as a preprocessing step in EMG-based motion classification. Whitening decorrelates the EMG signal and has been shown to be advantageous in other EMG applications including EMG amplitude estimation and EMG-force processing. In a study of ten intact subjects and five amputees with up to 11 motion classes and ten electrode channels, we found that the coefficient of variation of time domain features (mean absolute value, average signal length and normalized zero crossing rate) was significantly reduced due to whitening. When using these features along with autoregressive power spectrum coefficients, whitening added approximately five percentage points to classification accuracy when small window lengths were considered.
ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.
Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won
2016-07-01
In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolaemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide possibility of the future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.
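The idea of classifying by first predicting blood loss can be sketched as a simple thresholding step applied to the regression output. The cut-offs below are the commonly cited ATLS bands (class I below 15%, class II 15 to 30%, class III 30 to 40%, class IV above 40%); whether the study used exactly these boundaries for its rat model, and how boundary values were assigned, is an assumption here. The function name `atls_class` is hypothetical.

```python
def atls_class(blood_loss_percent):
    """Map a predicted blood loss (percent of blood volume) to an ATLS class.

    Thresholds follow the commonly cited ATLS bands; the handling of the
    exact boundary values (15, 30, 40) is an assumption of this sketch.
    """
    if blood_loss_percent < 15:
        return 1
    if blood_loss_percent < 30:
        return 2
    if blood_loss_percent <= 40:
        return 3
    return 4

def classification_accuracy(predicted_losses, true_classes):
    """Fraction of cases whose thresholded prediction matches the true class."""
    hits = sum(1 for loss, cls in zip(predicted_losses, true_classes)
               if atls_class(loss) == cls)
    return hits / len(true_classes)

# Illustrative values only, not data from the study.
example_accuracy = classification_accuracy([10.0, 22.5, 35.0, 28.0], [1, 2, 3, 3])
```

With this design the regression model only has to produce one continuous output, and the multicategory decision is reduced to fixed thresholds.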
NASA Astrophysics Data System (ADS)
Yao, C.; Zhang, Y.; Zhang, Y.; Liu, H.
2017-09-01
With the rapid development of Precision Agriculture (PA) promoted by high-resolution remote sensing, crop classification of high-resolution remote sensing images has become important for agricultural management and estimation. Because features and their surroundings are complex and fragmented at high resolution, the accuracy of traditional classification methods does not meet the requirements of agricultural applications. This paper therefore proposes a classification method for high-resolution agricultural remote sensing images based on convolutional neural networks (CNN). For training, a large number of samples were produced from panchromatic images of China's GF-1 high-resolution satellite. In the experiment, the CNN was trained and tested with a MATLAB deep learning toolbox, and after gradual tuning of the parameters during training the crop classification reached a correct rate of 99.66%. By improving the accuracy of image classification and recognition, the application of CNN provides a reference for remote sensing in PA.
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan; Liu, Jingjing
2018-01-18
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33-100%, and ELM, with an accuracy rate of 98.01-100%. For level assessment, the R² related to the training set was above 0.97 and the R² related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016-0.3494, lower than the error of 0.5-1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.
Calès, P; Boursier, J; Lebigot, J; de Ledinghen, V; Aubé, C; Hubert, I; Oberti, F
2017-04-01
In chronic hepatitis C, the European Association for the Study of the Liver and the Asociacion Latinoamericana para el Estudio del Higado recommend performing transient elastography plus a blood test to diagnose significant fibrosis; test concordance confirms the diagnosis. To validate this rule and improve it by combining a blood test, FibroMeter (virus second generation, Echosens, Paris, France) and transient elastography (constitutive tests) into a single combined test, as suggested by the American Association for the Study of Liver Diseases and the Infectious Diseases Society of America. A total of 1199 patients were included in an exploratory set (HCV, n = 679) or in two validation sets (HCV ± HIV, HBV, n = 520). Accuracy was mainly evaluated by correct diagnosis rate for severe fibrosis (pathological Metavir F ≥ 3, primary outcome) by classical test scores or a fibrosis classification, reflecting Metavir staging, as a function of test concordance. Score accuracy: there were no significant differences between the blood test (75.7%), elastography (79.1%) and the combined test (79.4%) (P = 0.066); the score accuracy of each test was significantly (P < 0.001) decreased in discordant vs. concordant tests. Classification accuracy: combined test accuracy (91.7%) was significantly (P < 0.001) increased vs. the blood test (84.1%) and elastography (88.2%); accuracy of each constitutive test was significantly (P < 0.001) decreased in discordant vs. concordant tests but not with combined test: 89.0 vs. 92.7% (P = 0.118). Multivariate analysis for accuracy showed an interaction between concordance and fibrosis level: in the 1% of patients with full classification discordance and severe fibrosis, non-invasive tests were unreliable. The advantage of combined test classification was confirmed in the validation sets. The concordance recommendation is validated. 
A combined test, expressed in classification instead of score, improves this rule and validates the recommendation of a combined test, avoiding 99% of biopsies, and offering precise staging. © 2017 John Wiley & Sons Ltd.
Study on bayes discriminant analysis of EEG data.
Shi, Yuan; He, DanDan; Qin, Fang
2014-01-01
In this paper, we apply Bayes discriminant analysis to objectively recorded EEG data of experimental subjects in order to arrive at a relatively accurate method for feature extraction and classification decisions. According to the strength of the α wave, the head electrodes are divided into four classes. Using part of the 21-electrode EEG data of 63 people, we applied Bayes discriminant analysis to the EEG data of six subjects. Results: using part of the EEG data of 63 people, Bayes discriminant analysis gave an electrode classification accuracy rate of 64.4%. Bayes discriminant analysis has higher prediction accuracy, and EEG features (mainly the α wave) are extracted more accurately. Bayes discriminant analysis is well suited to feature extraction and classification decisions for EEG data.
Mehrang, Saeed; Pietilä, Julia; Korhonen, Ilkka
2018-02-22
Wrist-worn sensors have better compliance for activity monitoring compared to hip, waist, ankle or chest positions. However, wrist-worn activity monitoring is challenging due to the wide degree of freedom for the hand movements, as well as similarity of hand movements in different activities such as varying intensities of cycling. To strengthen the ability of wrist-worn sensors in detecting human activities more accurately, motion signals can be complemented by physiological signals such as optical heart rate (HR) based on photoplethysmography. In this paper, an activity monitoring framework using an optical HR sensor and a triaxial wrist-worn accelerometer is presented. We investigated a range of daily life activities including sitting, standing, household activities and stationary cycling with two intensities. A random forest (RF) classifier was exploited to detect these activities based on the wrist motions and optical HR. The highest overall accuracy of 89.6 ± 3.9% was achieved with a forest of a size of 64 trees and 13-s signal segments with 90% overlap. Removing the HR-derived features decreased the classification accuracy of high-intensity cycling by almost 7%, but did not affect the classification accuracies of other activities. A feature reduction utilizing the feature importance scores of RF was also carried out and resulted in a shrunken feature set of only 21 features. The overall accuracy of the classification utilizing the shrunken feature set was 89.4 ± 4.2%, which is almost equivalent to the above-mentioned peak overall accuracy.
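The segmentation into 13-s windows with 90% overlap can be sketched as below. The sampling rate is not given in the abstract, so the 50 Hz used in the example is purely an assumption, and `window_bounds` is a hypothetical helper name.

```python
def window_bounds(n_samples, fs_hz, window_s=13.0, overlap=0.9):
    """Return (start, stop) sample indices of overlapping signal segments.

    window_s and overlap default to the values quoted in the abstract;
    fs_hz must be supplied by the caller.
    """
    win = int(round(window_s * fs_hz))
    # Step between consecutive window starts; at least one sample.
    step = max(1, int(round(win * (1.0 - overlap))))
    return [(start, start + win)
            for start in range(0, n_samples - win + 1, step)]

# 26 s of signal at an assumed 50 Hz sampling rate.
segments = window_bounds(n_samples=1300, fs_hz=50)
```

Each returned segment would then be turned into one feature vector (motion and heart-rate features) for the random forest classifier. With 90% overlap the step is one tenth of the window length, so consecutive feature vectors are strongly correlated.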
Mumtaz, Wajid; Ali, Syed Saad Azhar; Yasin, Mohd Azhar Mohd; Malik, Aamir Saeed
2018-02-01
Major depressive disorder (MDD), a debilitating mental illness, can cause functional disabilities and become a social problem. Accurate and early diagnosis of depression can be challenging. This paper proposes a machine learning framework that uses EEG-derived synchronization likelihood (SL) features as input data for automatic diagnosis of MDD. It was hypothesized that EEG-based SL features could discriminate between MDD patients and healthy controls with acceptable accuracy, better than measures such as interhemispheric coherence and mutual information. In this work, classification models such as support vector machine (SVM), logistic regression (LR) and Naïve Bayesian (NB) were employed to model the relationship between the EEG features and the study groups (MDD patients and healthy controls) and ultimately to discriminate between study participants. The results indicated that the classification rates were better than chance. More specifically, the study yielded SVM classification accuracy = 98%, sensitivity = 99.9%, specificity = 95% and f-measure = 0.97; LR classification accuracy = 91.7%, sensitivity = 86.66%, specificity = 96.6% and f-measure = 0.90; NB classification accuracy = 93.6%, sensitivity = 100%, specificity = 87.9% and f-measure = 0.95. In conclusion, SL could be a promising method for diagnosing depression. The findings could be generalized to develop a robust CAD-based tool that may help for clinical purposes.
NASA Astrophysics Data System (ADS)
Chan, Heang-Ping; Helvie, Mark A.; Petrick, Nicholas; Sahiner, Berkman; Adler, Dorit D.; Blane, Caroline E.; Joynt, Lynn K.; Paramagul, Chintana; Roubidoux, Marilyn A.; Wilson, Todd E.; Hadjiiski, Lubomir M.; Goodsitt, Mitchell M.
1999-05-01
A receiver operating characteristic (ROC) experiment was conducted to evaluate the effects of pixel size on the characterization of mammographic microcalcifications. Digital mammograms were obtained by digitizing screen-film mammograms with a laser film scanner. One hundred twelve two-view mammograms with biopsy-proven microcalcifications were digitized at a pixel size of 35 micrometers × 35 micrometers. A region of interest (ROI) containing the microcalcifications was extracted from each image. ROI images with pixel sizes of 70 micrometers, 105 micrometers, and 140 micrometers were derived from the ROI of 35 micrometer pixel size by averaging 2 × 2, 3 × 3, and 4 × 4 neighboring pixels, respectively. The ROI images were printed on film with a laser imager. Seven MQSA-approved radiologists participated as observers. The likelihood of malignancy of the microcalcifications was rated on a 10-point confidence rating scale and analyzed with ROC methodology. The classification accuracy was quantified by the area, Az, under the ROC curve. The statistical significance of the differences in the Az values for different pixel sizes was estimated with the Dorfman-Berbaum-Metz (DBM) method for multi-reader, multi-case ROC data. It was found that five of the seven radiologists demonstrated a higher classification accuracy with the 70 micrometer or 105 micrometer images. The average Az also showed a higher classification accuracy in the range of 70 to 105 micrometer pixel size. However, the differences in Az between different pixel sizes did not achieve statistical significance. The low specificity of image features of microcalcifications and the large interobserver and intraobserver variabilities may have contributed to the relatively weak dependence of classification accuracy on pixel size.
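The derivation of coarser ROI images by averaging k × k neighboring pixels can be sketched as below. This is a minimal illustration; cropping rows and columns that do not fill a complete k × k block is an assumption of the sketch, and `average_pool` is a hypothetical name.

```python
def average_pool(image, k):
    """Downsample a 2-D image (list of rows) by averaging k x k pixel blocks.

    Trailing rows/columns that do not fill a whole block are dropped,
    which is an assumption made for this sketch.
    """
    rows = len(image) // k
    cols = len(image[0]) // k
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block_sum = sum(image[r * k + i][c * k + j]
                            for i in range(k) for j in range(k))
            out_row.append(block_sum / (k * k))
        out.append(out_row)
    return out

# A 4 x 4 ROI at the native pixel size, reduced with 2 x 2 averaging.
roi = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
roi_2x = average_pool(roi, 2)  # [[3.5, 5.5], [11.5, 13.5]]
```

With a 35 micrometer native pixel, k = 2, 3 and 4 correspond to the 70, 105 and 140 micrometer images described above.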
SeqRate: sequence-based protein folding type classification and rates prediction
2010-01-01
Background Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. And most methods do not distinguish the different kinetic nature (two-state folding or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (sec-1) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification and its accuracy of fold rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions Both the web server and software of predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html. PMID:20438647
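The two evaluation figures quoted for folding-rate prediction, the Pearson correlation coefficient and the mean absolute difference on the base-10 logarithmic scale, can be computed as in the sketch below. The input values are illustrative placeholders, not SeqRate results, and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_abs_diff(x, y):
    """Mean absolute difference between predicted and reference values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Folding rates are assumed to be given already as log10(rate in 1/s).
predicted_log_rates = [1.0, 2.0, 3.0]
experimental_log_rates = [2.0, 4.0, 6.0]
r = pearson_r(predicted_log_rates, experimental_log_rates)
mad = mean_abs_diff(predicted_log_rates, experimental_log_rates)
```

The example also shows why both numbers are reported: the predictions are perfectly correlated with the reference values yet still differ from them by two log units on average.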
Adaptive sleep-wake discrimination for wearable devices.
Karlen, Walter; Floreano, Dario
2011-04-01
Sleep/wake classification systems that rely on physiological signals suffer from intersubject differences that make accurate classification with a single, subject-independent model difficult. To overcome the limitations of intersubject variability, we suggest a novel online adaptation technique that updates the sleep/wake classifier in real time. The objective of the present study was to evaluate the performance of a newly developed adaptive classification algorithm that was embedded on a wearable sleep/wake classification system called SleePic. The algorithm processed ECG and respiratory effort signals for the classification task and applied behavioral measurements (obtained from accelerometer and press-button data) for the automatic adaptation task. When trained as a subject-independent classifier algorithm, the SleePic device was only able to correctly classify 74.94 ± 6.76% of the human-rated sleep/wake data. By using the suggested automatic adaptation method, the mean classification accuracy could be significantly improved to 92.98 ± 3.19%. A subject-independent classifier based on activity data only showed a comparable accuracy of 90.44 ± 3.57%. We demonstrated that subject-independent models used for online sleep-wake classification can successfully be adapted to previously unseen subjects without the intervention of human experts or off-line calibration.
Assessment of various supervised learning algorithms using different performance metrics
NASA Astrophysics Data System (ADS)
Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.
2017-11-01
Our work brings out comparison based on the performance of supervised machine learning algorithms on a binary classification task. The supervised machine learning algorithms which are taken into consideration in the following work are namely Support Vector Machine(SVM), Decision Tree(DT), K Nearest Neighbour (KNN), Naïve Bayes(NB) and Random Forest(RF). This paper mostly focuses on comparing the performance of above mentioned algorithms on one binary classification task by analysing the Metrics such as Accuracy, F-Measure, G-Measure, Precision, Misclassification Rate, False Positive Rate, True Positive Rate, Specificity, Prevalence.
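The metrics listed in this abstract can all be derived from the four cells of a binary confusion matrix, as in the sketch below. The abstract gives no definitions, so standard ones are assumed; in particular, G-measure is taken here as the geometric mean of precision and recall, although some authors use sensitivity and specificity instead. The counts are illustrative.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Common binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # true positive rate (sensitivity)
    return {
        "accuracy": ratio(tp + tn, total),
        "misclassification_rate": ratio(fp + fn, total),
        "precision": precision,
        "true_positive_rate": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "specificity": ratio(tn, tn + fp),
        "prevalence": ratio(tp + fn, total),
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),
        # Assumed definition: geometric mean of precision and recall.
        "g_measure": math.sqrt(precision * recall),
    }

metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting several of these together matters because accuracy alone can look good on imbalanced data even when recall or specificity is poor.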
Bauer, Robert; Fels, Meike; Royter, Vladislav; Raco, Valerio; Gharabaghi, Alireza
2016-09-01
Considering self-rated mental effort during neurofeedback may improve training of brain self-regulation. Twenty-one healthy, right-handed subjects performed kinesthetic motor imagery of opening their left hand, while threshold-based classification of beta-band desynchronization resulted in proprioceptive robotic feedback. The experiment consisted of two blocks in a cross-over design. The participants rated their perceived mental effort nine times per block. In the adaptive block, the threshold was adjusted on the basis of these ratings whereas adjustments were carried out at random in the other block. Electroencephalography was used to examine the cortical activation patterns during the training sessions. The perceived mental effort was correlated with the difficulty threshold of neurofeedback training. Adaptive threshold-setting reduced mental effort and increased the classification accuracy and positive predictive value. This was paralleled by an inter-hemispheric cortical activation pattern in low frequency bands connecting the right frontal and left parietal areas. Optimal balance of mental effort was achieved at thresholds significantly higher than maximum classification accuracy. Rating of mental effort is a feasible approach for effective threshold-adaptation during neurofeedback training. Closed-loop adaptation of the neurofeedback difficulty level facilitates reinforcement learning of brain self-regulation. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Approximated mutual information training for speech recognition using myoelectric signals.
Guo, Hua J; Chan, A D C
2006-01-01
A new training algorithm called the approximated maximum mutual information (AMMI) is proposed to improve the accuracy of myoelectric speech recognition using hidden Markov models (HMMs). Previous studies have demonstrated that automatic speech recognition can be performed using myoelectric signals from articulatory muscles of the face. Classification of facial myoelectric signals can be performed using HMMs that are trained using the maximum likelihood (ML) algorithm; however, this algorithm maximizes the likelihood of the observations in the training sequence, which is not directly associated with optimal classification accuracy. The AMMI training algorithm attempts to maximize the mutual information, thereby training the HMMs to optimize their parameters for discrimination. Our results show that AMMI training consistently reduces the error rates compared with those of ML training, increasing the accuracy by approximately 3% on average.
Three-Class Mammogram Classification Based on Descriptive CNN Features
Jadoon, M Mohsin; Zhang, Qianni; Haq, Ihsan Ul; Butt, Sharjeel; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we have presented two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed as its four subbands by means of two-dimensional discrete wavelet transform (2D-DWT), while in the second method discrete curvelet transform (DCT) is used. In both methods, dense scale invariant feature (DSIFT) for all subbands is extracted. Input data matrix containing these subband features of all the mammogram patches is created that is processed as input to convolutional neural network (CNN). Softmax layer and support vector machine (SVM) layer are used to train CNN for classification. Proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT have achieved accuracy rate of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques. PMID:28191461
Kim, Eun Young; Lee, Min Young; Kim, Se Hyun; Ha, Kyooseob; Kim, Kwang Pyo; Ahn, Yong Min
2017-06-02
Major depressive disorder (MDD) is a systemic and multifactorial disorder that involves abnormalities in multiple biochemical pathways and the autonomic nervous system. This study applied a machine-learning method to classify MDD and control groups by incorporating data from serum proteomic analysis and heart rate variability (HRV) analysis for the identification of novel peripheral biomarkers. The study subjects consisted of 25 drug-free female MDD patients and 25 age- and sex-matched healthy controls. First, quantitative serum proteome profiles were analyzed by liquid chromatography-tandem mass spectrometry using pooled serum samples from 10 patients and 10 controls. Next, candidate proteins were quantified with multiple reaction monitoring (MRM) in 50 subjects. We also analyzed 22 linear and nonlinear HRV parameters in 50 subjects. Finally, we identified a combined biomarker panel consisting of proteins and HRV indexes using a support vector machine with recursive feature elimination. A separation between MDD and control groups was achieved using five parameters (apolipoprotein B, group-specific component, ceruloplasmin, RMSSD, and SampEn) at 80.1% classification accuracy. A combination of HRV and proteomic data achieved better classification accuracy. A high classification accuracy can be achieved by combining multimodal information from heart rate dynamics and serum proteomics in MDD. Our approach can be helpful for accurate clinical diagnosis of MDD. Further studies using larger, independent cohorts are needed to verify the role of these candidate biomarkers for MDD diagnosis. Copyright © 2017 Elsevier Inc. All rights reserved.
Variance approximations for assessments of classification accuracy
R. L. Czaplewski
1994-01-01
Variance approximations are derived for the weighted and unweighted kappa statistics, the conditional kappa statistic, and conditional probabilities. These statistics are useful to assess classification accuracy, such as accuracy of remotely sensed classifications in thematic maps when compared to a sample of reference classifications made in the field. Published...
Estimation of the Age and Amount of Brown Rice Plant Hoppers Based on Bionic Electronic Nose Use
Xu, Sai; Zhou, Zhiyan; Lu, Huazhong; Luo, Xiwen; Lan, Yubin; Zhang, Yang; Li, Yanfang
2014-01-01
The brown rice plant hopper (BRPH), Nilaparvata lugens (Stal), is one of the most important insect pests affecting rice and causes serious damage to the yield and quality of rice plants in Asia. This study used bionic electronic nose technology to sample BRPH volatiles, which vary in age and amount. Principal component analysis (PCA), linear discrimination analysis (LDA), probabilistic neural network (PNN), BP neural network (BPNN) and loading analysis (Loadings) techniques were used to analyze the sampling data. The results indicate that the PCA and LDA classification ability is poor, but the LDA classification displays superior performance relative to PCA. When a PNN was used to evaluate the BRPH age and amount, the classification rates of the training set were 100% and 96.67%, respectively, and the classification rates of the test set were 90.67% and 64.67%, respectively. When BPNN was used for the evaluation of the BRPH age and amount, the classification accuracies of the training set were 100% and 48.93%, respectively, and the classification accuracies of the test set were 96.67% and 47.33%, respectively. Loadings for BRPH volatiles indicate that the main elements of BRPHs' volatiles are sulfur-containing organics, aromatics, sulfur- and chlorine-containing organics and nitrogen oxides, which provide a reference for sensors chosen when exploited in specialized BRPH identification devices. This research proves the feasibility and broad application prospects of bionic electronic noses for BRPH recognition. PMID:25268913
Laufer, Shlomi; D'Angelo, Anne-Lise D; Kwan, Calvin; Ray, Rebbeca D; Yudkowsky, Rachel; Boulet, John R; McGaghie, William C; Pugh, Carla M
2017-12-01
Develop new performance evaluation standards for the clinical breast examination (CBE). There are several technical aspects of a proper CBE. Our recent work discovered a significant, linear relationship between palpation force and CBE accuracy. This article investigates the relationship between other technical aspects of the CBE and accuracy. This performance assessment study involved data collection from physicians (n = 553) attending 3 different clinical meetings between 2013 and 2014: American Society of Breast Surgeons, American Academy of Family Physicians, and American College of Obstetricians and Gynecologists. Four previously validated, sensor-enabled breast models were used for clinical skills assessment. Models A and B had solitary, superficial, 2 cm and 1 cm soft masses, respectively. Models C and D had solitary, deep, 2 cm hard and moderately firm masses, respectively. Finger movements (search technique) from 1137 CBE video recordings were independently classified by 2 observers. Final classifications were compared with CBE accuracy. Accuracy rates were model A = 99.6%, model B = 89.7%, model C = 75%, and model D = 60%. Final classification categories for search technique included rubbing movement, vertical movement, piano fingers, and other. Interrater reliability was k = 0.79. Rubbing movement was 4 times more likely to yield an accurate assessment (odds ratio 3.81, P < 0.001) compared with vertical movement and piano fingers. Piano fingers had the highest failure rate (36.5%). Regression analysis of search pattern, search technique, palpation force, examination time, and 6 demographic variables revealed that search technique independently and significantly affected CBE accuracy (P < 0.001). Our results support measurement and classification of CBE techniques and provide the foundation for a new paradigm in teaching and assessing hands-on clinical skills.
The newly described piano fingers palpation technique was noted to have unusually high failure rates. Medical educators should be aware of the potential differences in effectiveness for various CBE techniques.
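An interrater reliability figure for two observers assigning categorical labels is often computed as Cohen's kappa. The sketch below assumes that the reported k is Cohen's kappa, which the abstract does not state, and uses made-up labels rather than data from the study.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label]
                   for label in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Illustrative search-technique labels from two hypothetical observers.
rater_1 = ["rub", "rub", "vertical", "piano", "rub", "vertical"]
rater_2 = ["rub", "rub", "vertical", "rub", "rub", "piano"]
kappa = cohen_kappa(rater_1, rater_2)
```

Kappa corrects raw agreement for the agreement expected by chance: here the observers agree on 4 of 6 recordings, yet kappa is only about 0.43.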
Ramsey, Elijah W.; Nelson, Gene A.; Sapkota, Sijan
1998-01-01
A progressive classification of a marsh and forest system using Landsat Thematic Mapper (TM), color infrared (CIR) photograph, and ERS-1 synthetic aperture radar (SAR) data improved classification accuracy when compared to classification using solely TM reflective band data. The classification resulted in a detailed identification of differences within a nearly monotypic black needlerush marsh. Accuracy percentages of these classes were surprisingly high given the complexities of classification. The detailed classification resulted in a more accurate portrayal of the marsh transgressive sequence than was obtainable with TM data alone. Individual sensor contribution to the improved classification was compared to that using only the six reflective TM bands. Individually, the green reflective CIR and SAR data identified broad categories of water, marsh, and forest. In combination with TM, SAR and the green CIR band each improved overall accuracy by about 3% and 15% respectively. The SAR data improved the TM classification accuracy mostly in the marsh classes. The green CIR data also improved the marsh classification accuracy and accuracies in some water classes. The final combination of all sensor data improved almost all class accuracies from 2% to 70% with an overall improvement of about 20% over TM data alone. Not only was the identification of vegetation types improved, but the spatial detail of the classification approached 10 m in some areas.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
ERIC Educational Resources Information Center
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang
2015-01-01
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
NASA Astrophysics Data System (ADS)
Mahvash Mohammadi, Neda; Hezarkhani, Ardeshir
2018-07-01
Classification of mineralised zones is an important factor for the analysis of economic deposits. In this paper, the support vector machine (SVM), a supervised learning algorithm, based on subsurface data is proposed for classification of mineralised zones in the Takht-e-Gonbad porphyry Cu-deposit (SE Iran). The effects of the input features are evaluated by calculating the accuracy rates of the SVM. Ultimately, the SVM model is developed based on the input features lithology, alteration, mineralisation and the level, with the radial basis function (RBF) as the kernel function. Moreover, the optimal values of the parameters λ and C, calculated using the n-fold cross-validation method, are 0.001 and 0.01 respectively. The accuracy of this model is 0.931 for classification of mineralised zones in the Takht-e-Gonbad porphyry deposit. The results of the study confirm the efficiency of the SVM method for classification of the mineralised zones.
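As a rough illustration of the two ingredients named in this abstract, the sketch below shows an RBF kernel and an n-fold split in plain Python. The names `rbf_kernel`, `k_fold_indices` and `gamma` are hypothetical; whether `gamma` corresponds to the abstract's λ is an assumption, and this is not the authors' implementation.

```python
import math

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) pairs for n-fold cross-validation.

    Samples are dealt to folds in round-robin order; each fold serves
    exactly once as the held-out test set.
    """
    folds = [list(range(i, n_samples, n_folds)) for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx
```

In a parameter search, each candidate pair of kernel width and penalty C would be scored by its mean accuracy over the held-out folds, and the best-scoring pair retained.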
Sevel, Landrew S; Boissoneault, Jeff; Letzen, Janelle E; Robinson, Michael E; Staud, Roland
2018-05-30
Chronic fatigue syndrome (CFS) is a disorder associated with fatigue, pain, and structural/functional abnormalities seen during magnetic resonance brain imaging (MRI). Therefore, we evaluated the performance of structural MRI (sMRI) abnormalities in the classification of CFS patients versus healthy controls and compared it to machine learning (ML) classification based upon self-report (SR). Participants included 18 CFS patients and 15 healthy controls (HC). All subjects underwent T1-weighted sMRI and provided visual analogue-scale ratings of fatigue, pain intensity, anxiety, depression, anger, and sleep quality. sMRI data were segmented using FreeSurfer and 61 regions based on functional and structural abnormalities previously reported in patients with CFS. Classification was performed in RapidMiner using a linear support vector machine and bootstrap optimism correction. We compared ML classifiers based on (1) 61 a priori sMRI regional estimates and (2) SR ratings. The sMRI model achieved 79.58% classification accuracy. The SR (accuracy = 95.95%) outperformed both sMRI models. Estimates from multiple brain areas related to cognition, emotion, and memory contributed strongly to group classification. This is the first ML-based group classification of CFS. Our findings suggest that sMRI abnormalities are useful for discriminating CFS patients from HC, but SR ratings remain most effective in classification tasks.
A Novel Energy-Efficient Approach for Human Activity Recognition.
Zheng, Lingxiang; Wu, Dihong; Ruan, Xiaoyang; Weng, Shaolin; Peng, Ao; Tang, Biyu; Lu, Hai; Shi, Haibin; Zheng, Huiru
2017-09-08
In this paper, we propose a novel energy-efficient approach for mobile activity recognition system (ARS) to detect human activities. The proposed energy-efficient ARS, using low sampling rates, can achieve high recognition accuracy and low energy consumption. A novel classifier that integrates hierarchical support vector machine and context-based classification (HSVMCC) is presented to achieve a high accuracy of activity recognition when the sampling rate is less than the activity frequency, i.e., the Nyquist sampling theorem is not satisfied. We tested the proposed energy-efficient approach with the data collected from 20 volunteers (14 males and six females) and the average recognition accuracy of around 96.0% was achieved. Results show that using a low sampling rate of 1Hz can save 17.3% and 59.6% of energy compared with the sampling rates of 5 Hz and 50 Hz. The proposed low sampling rate approach can greatly reduce the power consumption while maintaining high activity recognition accuracy. The composition of power consumption in online ARS is also investigated in this paper.
Automated sleep scoring and sleep apnea detection in children
NASA Astrophysics Data System (ADS)
Baraglia, David P.; Berryman, Matthew J.; Coussens, Scott W.; Pamula, Yvonne; Kennedy, Declan; Martin, A. James; Abbott, Derek
2005-12-01
This paper investigates the automated detection of a patient's breathing rate and heart rate from their skin conductivity as well as sleep stage scoring and breathing event detection from their EEG. The software developed for these tasks is tested on data sets obtained from the sleep disorders unit at the Adelaide Women's and Children's Hospital. The sleep scoring and breathing event detection tasks used neural networks to achieve signal classification. The Fourier transform and the Higuchi fractal dimension were used to extract features for input to the neural network. The filtered skin conductivity appeared visually to bear a similarity to the breathing and heart rate signal, but a more detailed evaluation showed the relation was not consistent. Sleep stage classification was achieved with an accuracy of around 65%, with some stages being accurately scored and others poorly scored. The two breathing events hypopnea and apnea were scored with varying degrees of accuracy, with the highest scores being around 75% and 30%.
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan
2018-01-01
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level. PMID:29346328
Hrabok, Marianne; Brooks, Brian L; Fay-McClymont, Taryn B; Sherman, Elisabeth M S
2014-01-01
The purpose of this article was to investigate the accuracy of the WISC-IV short forms in estimating Full Scale Intelligence Quotient (FSIQ) and General Ability Index (GAI) in pediatric epilepsy. One hundred and four children with epilepsy completed the WISC-IV as part of a neuropsychological assessment at a tertiary-level children's hospital. The clinical accuracy of eight short forms was assessed in two ways: (a) accuracy within +/- 5 index points of FSIQ and (b) the clinical classification rate according to Wechsler conventions. The sample was further subdivided into low FSIQ (≤ 80) and high FSIQ (> 80). All short forms were significantly correlated with FSIQ. Seven-subtest (Crawford et al. [2010] FSIQ) and 5-subtest (BdSiCdVcLn) short forms yielded the highest clinical accuracy rates (77%-89%). Overall, a 2-subtest (VcMr) short form yielded the lowest clinical classification rates for FSIQ (35%-63%). The short form yielding the most accurate estimate of GAI was VcSiMrBd (73%-84%). Short forms show promise as useful estimates. The 7-subtest (Crawford et al., 2010) and 5-subtest (BdSiVcLnCd) short forms yielded the most accurate estimates of FSIQ. VcSiMrBd yielded the most accurate estimate of GAI. Clinical recommendations are provided for use of short forms in pediatric epilepsy.
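The first accuracy criterion above (estimates falling within +/- 5 index points of FSIQ) can be sketched as a simple hit rate. The function name and the example scores are hypothetical and are not taken from the study.

```python
def within_tolerance_rate(estimates, references, tolerance=5):
    """Fraction of estimates that lie within +/- tolerance of the reference score."""
    hits = sum(1 for e, r in zip(estimates, references) if abs(e - r) <= tolerance)
    return hits / len(references)

# Hypothetical short-form estimates and full-scale scores for four children.
short_form = [100, 92, 80, 110]
full_scale = [98, 99, 85, 104]
rate = within_tolerance_rate(short_form, full_scale)
```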
[Accuracy improvement of spectral classification of crop using microwave backscatter data].
Jia, Kun; Li, Qiang-Zi; Tian, Yi-Chen; Wu, Bing-Fang; Zhang, Fei-Fei; Meng, Ji-Hua
2011-02-01
In the present study, the use of VV polarization microwave backscatter data for improving the accuracy of spectral classification of crops is investigated. The classification accuracies of different classifiers based on the fusion of HJ satellite multi-spectral data and Envisat ASAR VV backscatter data are compared. The results indicate that the fusion data can take full advantage of the spectral information of the HJ multi-spectral data and the structure sensitivity of the ASAR VV polarization data. The fusion data enlarge the spectral difference among different classes and improve crop classification accuracy. The classification accuracy using fusion data can be increased by 5 percent compared to the HJ data alone. Furthermore, ASAR VV polarization data are sensitive to non-agrarian areas of planted fields, and including VV polarization data in the classification can effectively distinguish field borders. Using VV polarization data together with multi-spectral data in crop classification broadens the application of satellite data and has potential for wider use in agriculture.
Thorne, John C; Coggins, Truman E; Carmichael Olson, Heather; Astley, Susan J
2007-04-01
To evaluate classification accuracy and clinical feasibility of a narrative analysis tool for identifying children with a fetal alcohol spectrum disorder (FASD). Picture-elicited narratives generated by 16 age-matched pairs of school-aged children (FASD vs. typical development [TD]) were coded for semantic elaboration and reference strategy by judges who were unaware of age, gender, and group membership of the participants. Receiver operating characteristic (ROC) curves were used to examine the classification accuracy of the resulting set of narrative measures for making 2 classifications: (a) for the 16 children diagnosed with FASD, low performance (n = 7) versus average performance (n = 9) on a standardized expressive language task and (b) FASD (n = 16) versus TD (n = 16). Combining the rates of semantic elaboration and pragmatically inappropriate reference perfectly matched a classification based on performance on the standardized language task. More importantly, the rate of ambiguous nominal reference was highly accurate in classifying children with an FASD regardless of their performance on the standardized language task (area under the ROC curve = .863, confidence interval = .736-.991). Results support further study of the diagnostic utility of narrative analysis using discourse level measures of elaboration and children's strategic use of reference.
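For readers unfamiliar with the area under the ROC curve reported above, a minimal sketch follows. It uses the rank-based (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half. The function name and scores are hypothetical, and this is not the authors' analysis code.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical measure values for a positive group and a comparison group.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.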
Goshvarpour, Ateke; Goshvarpour, Atefeh
2018-04-30
Heart rate variability (HRV) analysis has become a widely used tool for monitoring pathological and psychological states in medical applications. In a typical classification problem, information fusion is a process whereby the effective combination of the data can achieve a more accurate system. The purpose of this article was to provide an accurate algorithm for classifying HRV signals in various psychological states. Therefore, a novel feature level fusion approach was proposed. First, using information theory, two similarity indicators of the signal were extracted: correntropy and Cauchy-Schwarz divergence. Applying a probabilistic neural network (PNN) and k-nearest neighbor (kNN), the performance of each index in the classification of the HRV signals of meditators and non-meditators was appraised. Then, three fusion rules, namely the division, product, and weighted sum rules, were used to combine the information of both similarity measures. For the first time, we propose an algorithm to define the weights of each feature based on the statistical p-values. The performance of HRV classification using combined features was compared with that using the non-combined features. Overall, an accuracy of 100% was obtained for discriminating all states. The results showed the strong ability and proficiency of the division and weighted sum rules in improving classifier accuracy.
Bhaduri, Aritra; Banerjee, Amitava; Roy, Subhrajit; Kar, Sougata; Basu, Arindam
2018-03-01
We present a neuromorphic current mode implementation of a spiking neural classifier with lumped square law dendritic nonlinearity. It has been shown previously in software simulations that such a system with binary synapses can be trained with structural plasticity algorithms to achieve comparable classification accuracy with fewer synaptic resources than conventional algorithms. We show that even in real analog systems with manufacturing imperfections (CV of 23.5% and 14.4% for dendritic branch gains and leaks respectively), this network is able to produce comparable results with fewer synaptic resources. The chip fabricated in [Formula: see text]m complementary metal oxide semiconductor has eight dendrites per cell and uses two opposing cells per class to cancel common-mode inputs. The chip can operate down to a [Formula: see text] V and dissipates 19 nW of static power per neuronal cell and [Formula: see text] 125 pJ/spike. For two-class classification problems of high-dimensional rate encoded binary patterns, the hardware achieves performance comparable to a software implementation of the same, with only about a 0.5% reduction in accuracy. On two UCI data sets, the integrated circuit (IC) has classification accuracy comparable to standard machine learners like support vector machines and extreme learning machines while using two to five times fewer binary synapses. We also show that the system can operate on mean rate encoded spike patterns, as well as short bursts of spikes. To the best of our knowledge, this is the first attempt in hardware to perform classification exploiting dendritic properties and binary synapses.
NASA Astrophysics Data System (ADS)
Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.
2018-01-01
In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We were processing results in three different ways - separately from LIBS measurements, from Raman measurements, and we also merged data from both mentioned methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. As for LIBS, the classification accuracy ranged from 45% to 100%, as for Raman Spectroscopy from 50% to 100% and in case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. 
The reason is the complementarity in obtained chemical information while using these two methods.
NASA Technical Reports Server (NTRS)
Justice, C.; Townshend, J. (Principal Investigator)
1981-01-01
Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. By one procedure the entire area is clustered and by the other a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation is provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data and the monocluster block approach gave higher accuracies than clustering the entire area. The monocluster block approach was additionally the most economical in terms of computing time.
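The abstract does not say which 3 x 3 contextual filter was used. One common choice is a majority (modal) filter, sketched below under that assumption; the function name and the tie-handling rule are illustrative choices, not the report's procedure.

```python
from collections import Counter

def majority_filter_3x3(labels):
    """Relabel each pixel with the most frequent class in its 3 x 3 neighbourhood.

    Assumption: a majority filter stands in for the report's unspecified
    contextual filter. Windows are clipped at the image edges, and a pixel
    keeps its own label unless another class is strictly more frequent.
    """
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]  # copy so the input image is not modified
    for r in range(rows):
        for c in range(cols):
            window = [labels[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            counts = Counter(window)
            best_label, best_count = counts.most_common(1)[0]
            if counts[labels[r][c]] < best_count:
                out[r][c] = best_label
    return out
```

A filter of this kind removes isolated misclassified pixels, which is one plausible reason a contextual filter can raise map accuracy.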
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively.
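The overall accuracy and kappa coefficient quoted above are both derived from a confusion matrix. A minimal sketch of that calculation follows; the function name and the example matrix are hypothetical and do not reproduce the study's data.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that the classifier assigned to class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class example.
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa discounts the agreement expected by chance, which is why it is typically lower than the overall accuracy for the same map.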
Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?
NASA Astrophysics Data System (ADS)
Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof
2016-10-01
It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempts to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability definition of segmentation evaluation metrics as indicators of land cover classification accuracy was based on the linear correlation between the two. All unsupervised metrics that are not based on number of segments have a very strong correlation with all classification measures and are therefore reliable as indicators of land cover classification accuracy. On the other hand, correlation at supervised metrics is dependent on so many factors that it cannot be trusted as a reliable classification quality indicator. Algorithms for land cover classification studied in this paper are widely used; therefore, presented results are applicable to a wider area.
Seligman, D A; Pullinger, A G
2006-11-01
To determine whether patients with temporomandibular joint disease or masticatory muscle pain can be usefully differentiated from asymptomatic controls using multifactorial classification tree models of attrition severity and/or rates. Measures of attrition severity and rates in patients diagnosed with disc displacement (n = 52), osteoarthrosis (n = 74), or masticatory muscle pain only (n = 43) were compared against those in asymptomatic controls (n = 132). Cross-validated classification tree models were tested for fit with sensitivity, specificity, accuracy and log likelihood accountability. The model for identifying asymptomatic controls only required the three measures of attrition severity (anterior, mediotrusive and laterotrusive posterior) to be differentiated from the patients with a 74.2 ± 3.8% cross-validation accuracy. This compared with cross-validation accuracies of 69.7 ± 3.7% for differentiating disc displacement using anterior and laterotrusive attrition severity, 68.7 ± 3.9% for differentiating disc displacement using anterior and laterotrusive attrition rates, 70.9 ± 3.3% for differentiating osteoarthrosis using anterior attrition severity and rates, 94.6 ± 2.1% for differentiating myofascial pain using mediotrusive and laterotrusive attrition severity, and 92.0 ± 2.1% for differentiating myofascial pain using mediotrusive and anterior attrition rates. The myofascial pain models exceeded the ≥75% sensitivity and ≥90% specificity thresholds recommended for diagnostic tests, and the asymptomatic control model approached these thresholds. Multifactorial models using attrition severity and rates may differentiate masticatory muscle pain patients from asymptomatic controls, and have some predictive value for differentiating intracapsular temporomandibular disorder patients as well.
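The sensitivity and specificity thresholds mentioned above can be checked from the four cells of a two-by-two diagnostic table. The sketch below assumes hypothetical counts and function names; only the 75% and 90% thresholds come from the abstract.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from diagnostic-table counts."""
    sensitivity = tp / (tp + fn)   # proportion of true cases detected
    specificity = tn / (tn + fp)   # proportion of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

def meets_thresholds(sensitivity, specificity, min_sens=0.75, min_spec=0.90):
    """Check the sensitivity and specificity thresholds cited in the abstract."""
    return sensitivity >= min_sens and specificity >= min_spec

# Hypothetical counts, not taken from the study.
sens, spec, acc = diagnostic_metrics(tp=80, fn=20, tn=95, fp=5)
```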
NASA Astrophysics Data System (ADS)
Porto, C. D. N.; Costa Filho, C. F. F.; Macedo, M. M. G.; Gutierrez, M. A.; Costa, M. G. F.
2017-03-01
Studies in intravascular optical coherence tomography (IV-OCT) have demonstrated the importance of coronary bifurcation regions in intravascular medical imaging analysis, as plaques are more likely to accumulate in this region leading to coronary disease. A typical IV-OCT pullback acquires hundreds of frames, thus developing an automated tool to classify the OCT frames as bifurcation or non-bifurcation can be an important step to speed up OCT pullbacks analysis and assist automated methods for atherosclerotic plaque quantification. In this work, we evaluate the performance of two state-of-the-art classifiers, SVM and Neural Networks in the bifurcation classification task. The study included IV-OCT frames from 9 patients. In order to improve classification performance, we trained and tested the SVM with different parameters by means of a grid search and different stop criteria were applied to the Neural Network classifier: mean square error, early stop and regularization. Different sets of features were tested, using feature selection techniques: PCA, LDA and scalar feature selection with correlation. Training and test were performed in sets with a maximum of 1460 OCT frames. We quantified our results in terms of false positive rate, true positive rate, accuracy, specificity, precision, false alarm, f-measure and area under ROC curve. Neural networks obtained the best classification accuracy, 98.83%, overcoming the results found in literature. Our methods appear to offer a robust and reliable automated classification of OCT frames that might assist physicians indicating potential frames to analyze. Methods for improving neural networks generalization have increased the classification performance.
Comparison of seven protocols to identify fecal contamination sources using Escherichia coli
Stoeckel, D.M.; Mathes, M.V.; Hyer, K.E.; Hagedorn, C.; Kator, H.; Lukasik, J.; O'Brien, T. L.; Fenger, T.W.; Samadpour, M.; Strickler, K.M.; Wiggins, B.A.
2004-01-01
Microbial source tracking (MST) uses various approaches to classify fecal-indicator microorganisms to source hosts. Reproducibility, accuracy, and robustness of seven phenotypic and genotypic MST protocols were evaluated by use of Escherichia coli from an eight-host library of known-source isolates and a separate, blinded challenge library. In reproducibility tests, measuring each protocol's ability to reclassify blinded replicates, only one (pulsed-field gel electrophoresis; PFGE) correctly classified all test replicates to host species; three protocols classified 48-62% correctly, and the remaining three classified fewer than 25% correctly. In accuracy tests, measuring each protocol's ability to correctly classify new isolates, ribotyping with EcoRI and PvuII approached 100% correct classification but only 6% of isolates were classified; four of the other six protocols (antibiotic resistance analysis, PFGE, and two repetitive-element PCR protocols) achieved better than random accuracy rates when 30-100% of challenge isolates were classified. In robustness tests, measuring each protocol's ability to recognize isolates from nonlibrary hosts, three protocols correctly classified 33-100% of isolates as "unknown origin," whereas four protocols classified all isolates to a source category. A relevance test, summarizing interpretations for a hypothetical water sample containing 30 challenge isolates, indicated that false-positive classifications would hinder interpretations for most protocols. Study results indicate that more representation in known-source libraries and better classification accuracy would be needed before field application. Thorough reliability assessment of classification results is crucial before and during application of MST protocols.
Tse, Samson; Davidson, Larry; Chung, Ka-Fai; Yu, Chong Ho; Ng, King Lam; Tsoi, Emily
2015-02-01
More mental health services are adopting the recovery paradigm. This study adds to prior research by (a) using measures of stages of recovery and elements of recovery that were designed and validated in a non-Western, Chinese culture and (b) testing which demographic factors predict advanced recovery and whether placing importance on certain elements predicts advanced recovery. We examined recovery and factors associated with recovery among 75 Hong Kong adults who were diagnosed with schizophrenia and assessed to be in clinical remission. Data were collected on socio-demographic factors, recovery stages and elements associated with recovery. Logistic regression analysis was used to identify variables that could best predict stages of recovery. Receiver operating characteristic curves were used to detect the classification accuracy of the model (i.e. rates of correct classification of stages of recovery). Logistic regression results indicated that stages of recovery could be distinguished with reasonable accuracy for Stage 3 ('living with disability', classification accuracy = 75.45%) and Stage 4 ('living beyond disability', classification accuracy = 75.50%). However, there was no sufficient information to predict Combined Stages 1 and 2 ('overwhelmed by disability' and 'struggling with disability'). It was found that having a meaningful role and age were the most important differentiators of recovery stage. Preliminary findings suggest that adopting salient life roles personally is important to recovery and that this component should be incorporated into mental health services. © The Author(s) 2014.
Integrating Human and Machine Intelligence in Galaxy Morphology Classification Tasks
NASA Astrophysics Data System (ADS)
Beck, Melanie Renee
The large flood of data flowing from observatories presents significant challenges to astronomy and cosmology--challenges that will only be magnified by projects currently under development. Growth in both volume and velocity of astrophysics data is accelerating: whereas the Sloan Digital Sky Survey (SDSS) has produced 60 terabytes of data in the last decade, the upcoming Large Synoptic Survey Telescope (LSST) plans to register 30 terabytes per night starting in the year 2020. Additionally, the Euclid Mission will acquire imaging for 5 x 10^7 resolvable galaxies. The field of galaxy evolution faces a particularly challenging future as complete understanding often cannot be reached without analysis of detailed morphological galaxy features. Historically, morphological analysis has relied on visual classification by astronomers, accessing the human brain's capacity for advanced pattern recognition. However, this accurate but inefficient method falters when confronted with many thousands (or millions) of images. In the SDSS era, efforts to automate morphological classifications of galaxies (e.g., Conselice et al., 2000; Lotz et al., 2004) are reasonably successful and can distinguish between elliptical and disk-dominated galaxies with accuracies of 80%. While this is statistically very useful, a key problem with these methods is that they often cannot say which 80% of their samples are accurate. Furthermore, when confronted with the more complex task of identifying key substructure within galaxies, automated classification algorithms begin to fail. The Galaxy Zoo project uses a highly innovative approach to solving the scalability problem of visual classification. Displaying images of SDSS galaxies to volunteers via a simple and engaging web interface, www.galaxyzoo.org asks people to classify images by eye. Within the first year hundreds of thousands of members of the general public had classified each of the 1 million SDSS galaxies an average of 40 times.
Galaxy Zoo thus solved both the visual classification problem of time efficiency and improved accuracy by producing a distribution of independent classifications for each galaxy. While crowd-sourced galaxy classifications have proven their worth, challenges remain before establishing this method as a critical and standard component of the data processing pipelines for the next generation of surveys. In particular, though innovative, crowd-sourcing techniques do not have the capacity to handle the data volume and rates expected in the next generation of surveys. These algorithms will be delegated to handle the majority of the classification tasks, freeing citizen scientists to contribute their efforts on subtler and more complex assignments. This thesis presents a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme we increase the classification rate nearly 5-fold classifying 226,124 galaxies in 92 days of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7% accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides a factor of 11.4 increase in the classification rate, classifying 210,803 galaxies in just 32 days of GZ2 project time with 93.1% accuracy. 
As the Random Forest algorithm requires a minimal amount of computational cost, this result has important implications for galaxy morphology identification tasks in the era of Euclid and other large-scale surveys.
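The Bayesian aggregation step described in this abstract can be illustrated with a minimal sketch. The abstract does not give SWAP's equations, so the code below only assumes a generic binary Bayes update: each vote shifts the probability that a galaxy belongs to the positive class, weighted by how reliable the volunteer is. The function names, the per-volunteer reliabilities `p_true_pos` and `p_true_neg`, and the retirement thresholds are hypothetical and chosen for illustration.

```python
def update_posterior(prior, says_positive, p_true_pos, p_true_neg):
    """Apply one Bayes update to P(positive) after a single volunteer vote.

    p_true_pos: assumed P(vote = positive | subject is positive)
    p_true_neg: assumed P(vote = negative | subject is negative)
    """
    if says_positive:
        like_pos = p_true_pos
        like_neg = 1.0 - p_true_neg
    else:
        like_pos = 1.0 - p_true_pos
        like_neg = p_true_neg
    numerator = like_pos * prior
    return numerator / (numerator + like_neg * (1.0 - prior))


def aggregate(votes, prior=0.5, accept=0.99, reject=0.01):
    """Fold votes into a posterior, stopping once a threshold is crossed.

    votes: iterable of (says_positive, p_true_pos, p_true_neg) tuples.
    The thresholds are illustrative values, not taken from the abstract.
    """
    posterior = prior
    for says_positive, p_true_pos, p_true_neg in votes:
        posterior = update_posterior(posterior, says_positive,
                                     p_true_pos, p_true_neg)
        if posterior >= accept or posterior <= reject:
            break  # subject can be retired early
    return posterior
```

Stopping early once the posterior is decisive is one plausible reason such a scheme needs fewer votes per galaxy than a fixed number of classifications, although the abstract itself does not spell out the mechanism.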
A new classification scheme of plastic wastes based upon recycling labels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Özkan, Kemal, E-mail: kozkan@ogu.edu.tr; Ergin, Semih, E-mail: sergin@ogu.edu.tr; Işık, Şahin, E-mail: sahini@ogu.edu.tr
Highlights: • PET, HPDE or PP types of plastics are considered. • An automated classification of plastic bottles based on the feature extraction and classification methods is performed. • The decision mechanism consists of PCA, Kernel PCA, FLDA, SVD and Laplacian Eigenmaps methods. • SVM is selected to achieve the classification task and majority voting technique is used. - Abstract: Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, the photographs of the plastic bottles have been taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, the morphological image operations are implemented. These operations are edge detection, noise removal, hole removing, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of the combinations of erosion and dilation. The effect of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastics are considered due to higher existence ratio of them than the other plastic types in the world.
The decision mechanism consists of five different feature extraction methods, namely Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher’s Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP), and uses a simple experimental setup with a camera and homogeneous backlighting. Because it gives a global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides a high accuracy rate and is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP.
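The equally weighted majority vote described above can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the tie-breaking rule (first label to reach the top count, in input order) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label that most classifiers agree on.

    predictions: list of class labels, one per feature-extraction pipeline
    (e.g. one SVM decision each for PCA, KPCA, FLDA, SVD and LEMAP features).
    Every vote has equal weight. Ties go to the label seen first (assumption).
    Raises ValueError on an empty list.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Five hypothetical per-pipeline decisions for one bottle image.
decision = majority_vote(["PET", "PP", "PET", "HPDE", "PET"])
```

With five voters and three classes a tie such as 2-2-1 is possible, which is why some explicit tie-breaking rule is needed in practice.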
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-01-01
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303
NASA Astrophysics Data System (ADS)
Schudlo, Larissa C.; Chau, Tom
2015-12-01
Objective. The majority of near-infrared spectroscopy (NIRS) brain-computer interface (BCI) studies have investigated binary classification problems. Limited work has considered differentiation of more than two mental states, or multi-class differentiation of higher-level cognitive tasks using measurements outside of the anterior prefrontal cortex. Improvements in accuracies are needed to deliver effective communication with a multi-class NIRS system. We investigated the feasibility of a ternary NIRS-BCI that supports mental states corresponding to verbal fluency task (VFT) performance, Stroop task performance, and unconstrained rest using prefrontal and parietal measurements. Approach. Prefrontal and parietal NIRS signals were acquired from 11 able-bodied adults during rest and performance of the VFT or Stroop task. Classification was performed offline using bagging with a linear discriminant base classifier trained on a 10 dimensional feature set. Main results. VFT, Stroop task and rest were classified at an average accuracy of 71.7% ± 7.9%. The ternary classification system provided a statistically significant improvement in information transfer rate relative to a binary system controlled by either mental task (0.87 ± 0.35 bits/min versus 0.73 ± 0.24 bits/min). Significance. These results suggest that effective communication can be achieved with a ternary NIRS-BCI that supports VFT, Stroop task and rest via measurements from the frontal and parietal cortices. Further development of such a system is warranted. Accurate ternary classification can enhance communication rates offered by NIRS-BCIs, improving the practicality of this technology.
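To make the information transfer rate comparison more concrete, the sketch below uses Wolpaw's definition of bits per selection, which is a common choice in the BCI literature. The abstract does not state which definition or trial duration the authors used, so both the formula choice and the `trials_per_minute` parameter are assumptions.

```python
import math


def wolpaw_bits_per_trial(n_classes, accuracy):
    """Wolpaw information transfer per selection, in bits.

    B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P) / (N - 1))
    Terms with a zero probability are skipped (limit of p*log2(p) is 0).
    """
    if n_classes < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_classes)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def bits_per_minute(n_classes, accuracy, trials_per_minute):
    """Scale bits per selection by a (hypothetical) selection rate."""
    return wolpaw_bits_per_trial(n_classes, accuracy) * trials_per_minute
```

Under this definition a ternary system can carry more information per selection than a binary one even at a lower accuracy, because the `log2(N)` term grows with the number of classes.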
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo
2015-05-01
An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
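The first step of the method, building a Shannon entropy feature vector from wavelet packet coefficients at all levels, can be sketched as follows. The abstract does not name the wavelet or the exact entropy definition, so this example assumes a Haar wavelet packet and the entropy of normalized coefficient energies. Background subtraction, feature selection and the RBF network are omitted, and all function names are hypothetical.

```python
import math


def haar_split(x):
    """Split one node into Haar approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def shannon_entropy(coeffs):
    """Entropy (in nats) of the normalized energy distribution of a node."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)


def entropy_features(signal, levels):
    """Entropies of every wavelet packet node at levels 1..levels.

    The signal length is assumed to be a multiple of 2**levels.
    Returns 2 + 4 + ... + 2**levels values, ordered level by level.
    """
    nodes = [list(signal)]
    features = []
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_split(node))
        nodes = next_nodes
        features.extend(shannon_entropy(node) for node in nodes)
    return features
```

Unlike a plain wavelet transform, a wavelet packet decomposition also splits the detail branches, so the feature vector covers the whole frequency range at every scale.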
Porter, Teresita M.; Golding, G. Brian
2012-01-01
Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and to an increasing extent also amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, make primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and, 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and, NBC runs significantly faster than the other tested methods. All methods performed poorly with the shortest 50–100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys. PMID:22558215
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery.
Li, Guiying; Lu, Dengsheng; Moran, Emilio; Hetrick, Scott
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms - maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes. PMID:22368311
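Overall accuracy and the kappa coefficient, the two figures reported above, are both computed from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation; the function name and the example matrix are hypothetical and are not data from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's kappa from a confusion matrix.

    matrix[i][j] = number of samples of reference class i assigned to class j.
    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from the row and column totals.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example: 80 of 100 samples on the diagonal.
accuracy, kappa = overall_accuracy_and_kappa([[40, 10], [10, 40]])
```

Because kappa discounts chance agreement, it is lower than overall accuracy for the same matrix, which is why studies of this kind usually report both.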
Classification of urine sediment based on convolution neural network
NASA Astrophysics Data System (ADS)
Pan, Jingjing; Jiang, Cunbo; Zhu, Tiantian
2018-04-01
By designing a new convolutional neural network framework, this paper removes two constraints of the original convolutional neural network framework: the need for large training sets and the need for samples of the same size. The input images are shifted and cropped to generate sub-images of the same size. Dropout is then applied to the generated sub-images, increasing the diversity of samples and preventing overfitting. Proper subsets of the sub-image set are selected at random such that every subset has the same number of elements and no two subsets are identical. These proper subsets are used as inputs to the convolutional neural network. After the convolution, pooling, fully connected and output layers, the classification loss rates on the test set and training set are obtained. In a classification experiment on red blood cells, white blood cells and calcium oxalate crystals, the classification accuracy rate was 97% or more.
Kilavuz, Ahmet Erdem; Songu, Murat; İmre, Abdulkadir; Arslanoğlu, Secil; Özkul, Yilmaz; Pinar, Ercan; Ateş, Düzgün
2018-05-01
The accuracy of fine-needle aspiration biopsy (FNAB) is controversial in parotid tumors. We aimed to compare FNAB results with the final histopathological diagnosis and to apply the "Sal classification" to our data and discuss its results and its place in parotid gland cytology. The FNAB cytological findings and final histological diagnosis were assessed retrospectively in 2 different scenarios based on the distribution of nondefinitive cytology, and we applied the Sal classification and determined malignancy rate, sensitivity, and specificity for each category. In 2 different scenarios FNAB sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were found to be 81%, 87%, 54.7%, and 96.1%; and 65.3%, 100%, 100%, and 96.1%, respectively. The malignancy rates and sensitivity and specificity were also calculated and discussed for each Sal category. We believe that the Sal classification has a great potential to be a useful tool in classification of parotid gland cytology. © 2018 Wiley Periodicals, Inc.
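The four diagnostic figures reported in this abstract follow directly from the counts of true and false positives and negatives. A minimal sketch is given below. The function name and the example counts are hypothetical and do not come from the study, and the code does not guard against empty categories (a zero denominator raises `ZeroDivisionError`).

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp: malignant on cytology and on histology
    fp: malignant on cytology, benign on histology
    tn: benign on both
    fn: benign on cytology, malignant on histology
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of malignant cases detected
        "specificity": tn / (tn + fp),  # share of benign cases cleared
        "ppv": tp / (tp + fp),          # P(malignant | positive cytology)
        "npv": tn / (tn + fn),          # P(benign | negative cytology)
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=8, fp=5, tn=85, fn=2)
```

How nondefinitive cytology results are assigned to these four cells changes every figure, which is consistent with the abstract reporting two scenarios with different values.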
ERIC Educational Resources Information Center
Greve, Kevin W.; Springer, Steven; Bianchini, Kevin J.; Black, F. William; Heinly, Matthew T.; Love, Jeffrey M.; Swift, Douglas A.; Ciota, Megan A.
2007-01-01
This study examined the sensitivity and false-positive error rate of reliable digit span (RDS) and the WAIS-III Digit Span (DS) scaled score in persons alleging toxic exposure and determined whether error rates differed from published rates in traumatic brain injury (TBI) and chronic pain (CP). Data were obtained from the files of 123 persons…
Classification of right-hand grasp movement based on EMOTIV Epoc+
NASA Astrophysics Data System (ADS)
Tobing, T. A. M. L.; Prawito, Wijaya, S. K.
2017-07-01
Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for the best classification accuracy of right-hand grasp movement based on an EEG headset, EMOTIV Epoc+. There are three movement classes: grasping hand, relax, and opening hand. These classifications take advantage of the Event-Related Desynchronization (ERD) phenomenon, which makes it possible to distinguish the relaxation, imagery, and movement states from each other. The combined elements are Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequencies as features, and the classifiers Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average values of classification accuracy are ± 83% for training and ± 57% for testing. To give a better understanding of the signal quality recorded by EMOTIV Epoc+, the classification accuracy for left or right-hand grasping movement EEG signals (provided by Physionet) is also given, i.e., ± 85% for training and ± 70% for testing. The accuracy values for each combination, experiment condition, and the external EEG data are compared in order to analyze the classification accuracy.
Classification of breast cancer cytological specimen using convolutional neural network
NASA Astrophysics Data System (ADS)
Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman
2017-01-01
The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. Experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed in Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of images of cytological specimen (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification usually is based on morphometric features of nuclei. Therefore, training and validation patches were selected using Support Vector Machine (SVM) so that suitable amount of cell material was depicted. Neural classifiers were tuned using GPU accelerated implementation of gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by GoogLeNet model. We observed that more misclassified patches belong to malignant cases.
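Dividing a very large slide image into fixed-size patches, as described above, amounts to enumerating patch origins on a grid. The sketch below assumes non-overlapping patches and discards partial patches at the right and bottom edges. The abstract does not say how the authors handled overlap or edges, and the function name is hypothetical.

```python
def patch_origins(width, height, size=256):
    """Top-left (x, y) corners of non-overlapping size x size patches.

    Patches that would extend past the image border are skipped
    (an assumption, not taken from the abstract).
    """
    return [
        (x, y)
        for y in range(0, height - size + 1, size)
        for x in range(0, width - size + 1, size)
    ]


# An image of the average size quoted in the abstract.
n_patches = len(patch_origins(200000, 100000))
```

In the described pipeline only patches containing a suitable amount of cell material would then be kept, a selection the authors performed with an SVM.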
Moncada-Torres, A; Leuenberger, K; Gonzenbach, R; Luft, A; Gassert, R
2014-07-01
Miniature, wearable sensor modules are a promising technology to monitor activities of daily living (ADL) over extended periods of time. To assure both user compliance and meaningful results, the selection and placement site of sensors requires careful consideration. We investigated these aspects for the classification of 16 ADL in 6 healthy subjects under laboratory conditions using ReSense, our custom-made inertial measurement unit enhanced with a barometric pressure sensor used to capture activity-related altitude changes. Subjects wore a module on each wrist and ankle, and one on the trunk. Activities comprised whole body movements as well as gross and dextrous upper-limb activities. Wrist-module data outperformed the other locations for the three activity groups. Specifically, overall classification accuracy rates of almost 93% and more than 95% were achieved for the repeated holdout and user-specific validation methods, respectively, for all 16 activities. Including the altitude profile resulted in a considerable improvement of up to 20% in the classification accuracy for stair ascent and descent. The gyroscopes provided no useful information for activity classification under this scheme. The proposed sensor setting could allow for robust long-term activity monitoring with high compliance in different patient populations.
Jane, Nancy Yesudhas; Nehemiah, Khanna Harichandran; Arputharaj, Kannan
2016-01-01
Clinical time-series data acquired from electronic health records (EHR) are liable to temporal complexities such as irregular observations, missing values and time constrained attributes that make the knowledge discovery process challenging. This paper presents a temporal rough set induced neuro-fuzzy (TRiNF) mining framework that handles these complexities and builds an effective clinical decision-making system. TRiNF provides two functionalities, namely temporal data acquisition (TDA) and temporal classification. In TDA, a time-series forecasting model is constructed by adopting an improved double exponential smoothing method. The forecasting model is used in missing value imputation and temporal pattern extraction. The relevant attributes are selected using a temporal pattern based rough set approach. In temporal classification, a classification model is built with the selected attributes using a temporal pattern induced neuro-fuzzy classifier. For experimentation, this work uses two clinical time-series datasets of hepatitis and thrombosis patients. The experimental results show that the proposed TRiNF framework significantly reduces the error rate, giving an average classification accuracy of 92.59% for the hepatitis dataset and 91.69% for the thrombosis dataset. The obtained classification results demonstrate the efficiency of the proposed framework in terms of its improved classification accuracy.
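The forecasting model used for missing value imputation builds on double exponential smoothing. The abstract does not describe the authors' improved variant, so the sketch below shows only the standard Holt method as a baseline. The function name, the initialization and the default smoothing constants are assumptions.

```python
def holt_forecast(series, alpha=0.5, beta=0.5, horizon=1):
    """Forecast with standard double (Holt) exponential smoothing.

    level_t = alpha * x_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
    trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
    forecast = level_T + horizon * trend_T
    Initialization (assumption): level = x_0, trend = x_1 - x_0.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1.0 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return level + horizon * trend


# Impute a hypothetical missing fifth reading from the four before it.
imputed = holt_forecast([1.0, 2.0, 3.0, 4.0])
```

The trend term is what distinguishes this from single exponential smoothing: a steadily rising or falling measurement is extrapolated rather than flattened towards its recent average.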
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex. PMID:27500640
Youn, Su Hyun; Sim, Taeyong; Choi, Ahnryul; Song, Jinsung; Shin, Ki Young; Lee, Il Kwon; Heo, Hyun Mu; Lee, Daeweon; Mun, Joung Hwan
2015-06-01
Ultrasonic surgical units (USUs) have the advantage of minimizing tissue damage during surgeries that require tissue dissection by reducing problems such as coagulation and unwanted carbonization, but the disadvantage of requiring manual adjustment of power output according to the target tissue. In order to overcome this limitation, it is necessary to determine the properties of in vivo tissues automatically. We propose a multi-classifier that can accurately classify tissues based on the unique impedance of each tissue. For this purpose, a multi-classifier was built based on single classifiers with high classification rates, and the classification accuracy of the proposed model was compared with that of single classifiers for various electrode types (Type-I: 6 mm invasive; Type-II: 3 mm invasive; Type-III: surface). The sensitivity and positive predictive value (PPV) of the multi-classifier by cross checks were determined. According to the 10-fold cross validation results, the classification accuracy of the proposed model was significantly higher (p<0.05 or <0.01) than that of existing single classifiers for all electrode types. In particular, the classification accuracy of the proposed model was highest when the 3mm invasive electrode (Type-II) was used (sensitivity=97.33-100.00%; PPV=96.71-100.00%). The results of this study are an important contribution to achieving automatic optimal output power adjustment of USUs according to the properties of individual tissues. Copyright © 2015 Elsevier Ltd. All rights reserved.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problems in speech analysis today. It involves tracing the gender of a speaker from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. The location and gender of a person have become very crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
A new classification scheme of plastic wastes based upon recycling labels.
Özkan, Kemal; Ergin, Semih; Işık, Şahin; Işıklı, Idil
2015-01-01
Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, the photographs of the plastic bottles have been taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, the morphological image operations are implemented. These operations are edge detection, noise removal, hole removing, image enhancement, and image segmentation. These morphological operations can be generally defined in terms of the combinations of erosion and dilation. The effect of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastics are considered due to higher existence ratio of them than the other plastic types in the world. The decision mechanism consists of five different feature extraction methods, namely Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher's Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP), and uses a simple experimental setup with a camera and homogeneous backlighting. Because it gives a global solution for a classification problem, Support Vector Machine (SVM) is selected to achieve the classification task and majority voting technique is used as the decision mechanism.
This technique equally weights each classification result and assigns the given plastic object to the class that the most classification results agree on. The proposed classification scheme provides high accuracy rate, and also it is able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET or non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HPDE or PP. Copyright © 2014 Elsevier Ltd. All rights reserved.
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
Marciano, Michael A; Adelman, Jonathan D
2017-03-01
The deconvolution of DNA mixtures remains one of the most critical challenges in the field of forensic DNA analysis. In addition, of all the data features required to perform such deconvolution, the number of contributors in the sample is widely considered the most important, and, if incorrectly chosen, the most likely to negatively influence the mixture interpretation of a DNA profile. Unfortunately, most current approaches to mixture deconvolution require the assumption that the number of contributors is known by the analyst, an assumption that can prove to be especially faulty when faced with increasingly complex mixtures of 3 or more contributors. In this study, we propose a probabilistic approach for estimating the number of contributors in a DNA mixture that leverages the strengths of machine learning. To assess this approach, we compare classification performances of six machine learning algorithms and evaluate the model from the top-performing algorithm against the current state of the art in the field of contributor number classification. Overall results show over 98% accuracy in identifying the number of contributors in a DNA mixture of up to 4 contributors. Comparative results showed 3-person mixtures had a classification accuracy improvement of over 6% compared to the current best-in-field methodology, and that 4-person mixtures had a classification accuracy improvement of over 20%. The Probabilistic Assessment for Contributor Estimation (PACE) also accomplishes classification of mixtures of up to 4 contributors in less than 1s using a standard laptop or desktop computer. Considering the high classification accuracy rates, as well as the significant time commitment required by the current state of the art model versus seconds required by a machine learning-derived model, the approach described herein provides a promising means of estimating the number of contributors and, subsequently, will lead to improved DNA mixture interpretation. 
Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
A Novel Energy-Efficient Approach for Human Activity Recognition
Zheng, Lingxiang; Wu, Dihong; Ruan, Xiaoyang; Weng, Shaolin; Tang, Biyu; Lu, Hai; Shi, Haibin
2017-01-01
In this paper, we propose a novel energy-efficient approach for a mobile activity recognition system (ARS) to detect human activities. The proposed energy-efficient ARS, using low sampling rates, can achieve high recognition accuracy and low energy consumption. A novel classifier that integrates hierarchical support vector machine and context-based classification (HSVMCC) is presented to achieve a high accuracy of activity recognition when the sampling rate is less than the activity frequency, i.e., the Nyquist sampling theorem is not satisfied. We tested the proposed energy-efficient approach with the data collected from 20 volunteers (14 males and six females) and the average recognition accuracy of around 96.0% was achieved. Results show that using a low sampling rate of 1 Hz can save 17.3% and 59.6% of energy compared with the sampling rates of 5 Hz and 50 Hz, respectively. The proposed low sampling rate approach can greatly reduce the power consumption while maintaining high activity recognition accuracy. The composition of power consumption in online ARS is also investigated in this paper. PMID:28885560
Chai, Rifai; Naik, Ganesh R; Ling, Sai Ho; Nguyen, Hung T
2017-01-07
One of the key challenges of the biomedical cyber-physical system is to combine cognitive neuroscience with the integration of physical systems to assist people with disabilities. Electroencephalography (EEG) has been explored as a non-invasive method of providing assistive technology by using brain electrical signals. This paper presents a unique prototype of a hybrid brain computer interface (BCI) which senses a combination classification of mental task, steady state visual evoked potential (SSVEP) and eyes closed detection using only two EEG channels. In addition, a microcontroller based head-mounted battery-operated wireless EEG sensor combined with a separate embedded system is used to enhance portability, convenience and cost effectiveness. This experiment has been conducted with five healthy participants and five patients with tetraplegia. Generally, the results show comparable classification accuracies between healthy subjects and tetraplegia patients. For the offline artificial neural network classification for the target group of patients with tetraplegia, the hybrid BCI system combines three mental tasks, three SSVEP frequencies and eyes closed, with average classification accuracy at 74% and average information transfer rate (ITR) of the system of 27 bits/min. For the real-time testing of the intentional signal on patients with tetraplegia, the average success rate of detection is 70% and the speed of detection varies from 2 to 4 s.
The effect of call libraries and acoustic filters on the identification of bat echolocation.
Clement, Matthew J; Murray, Kevin L; Solick, Donald I; Gruver, Jeffrey C
2014-09-01
Quantitative methods for species identification are commonly used in acoustic surveys for animals. While various identification models have been studied extensively, there has been little study of methods for selecting calls prior to modeling or methods for validating results after modeling. We obtained two call libraries with a combined 1556 pulse sequences from 11 North American bat species. We used four acoustic filters to automatically select and quantify bat calls from the combined library. For each filter, we trained a species identification model (a quadratic discriminant function analysis) and compared the classification ability of the models. In a separate analysis, we trained a classification model using just one call library. We then compared a conventional model assessment that used the training library against an alternative approach that used the second library. We found that filters differed in the share of known pulse sequences that were selected (68 to 96%), the share of non-bat noises that were excluded (37 to 100%), their measurement of various pulse parameters, and their overall correct classification rate (41% to 85%). Although the top two filters did not differ significantly in overall correct classification rate (85% and 83%), rates differed significantly for some bat species. In our assessment of call libraries, overall correct classification rates were significantly lower (15% to 23% lower) when tested on the second call library instead of the training library. Well-designed filters obviated the need for subjective and time-consuming manual selection of pulses. Accordingly, researchers should carefully design and test filters and include adequate descriptions in publications. Our results also indicate that it may not be possible to extend inferences about model accuracy beyond the training library. 
If so, the accuracy of acoustic-only surveys may be lower than commonly reported, which could affect ecological understanding or management decisions based on acoustic surveys.
The effect of call libraries and acoustic filters on the identification of bat echolocation
Clement, Matthew J; Murray, Kevin L; Solick, Donald I; Gruver, Jeffrey C
2014-01-01
Quantitative methods for species identification are commonly used in acoustic surveys for animals. While various identification models have been studied extensively, there has been little study of methods for selecting calls prior to modeling or methods for validating results after modeling. We obtained two call libraries with a combined 1556 pulse sequences from 11 North American bat species. We used four acoustic filters to automatically select and quantify bat calls from the combined library. For each filter, we trained a species identification model (a quadratic discriminant function analysis) and compared the classification ability of the models. In a separate analysis, we trained a classification model using just one call library. We then compared a conventional model assessment that used the training library against an alternative approach that used the second library. We found that filters differed in the share of known pulse sequences that were selected (68 to 96%), the share of non-bat noises that were excluded (37 to 100%), their measurement of various pulse parameters, and their overall correct classification rate (41% to 85%). Although the top two filters did not differ significantly in overall correct classification rate (85% and 83%), rates differed significantly for some bat species. In our assessment of call libraries, overall correct classification rates were significantly lower (15% to 23% lower) when tested on the second call library instead of the training library. Well-designed filters obviated the need for subjective and time-consuming manual selection of pulses. Accordingly, researchers should carefully design and test filters and include adequate descriptions in publications. Our results also indicate that it may not be possible to extend inferences about model accuracy beyond the training library. 
If so, the accuracy of acoustic-only surveys may be lower than commonly reported, which could affect ecological understanding or management decisions based on acoustic surveys. PMID:25535563
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment
ERIC Educational Resources Information Center
Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua
2012-01-01
This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
The effect of finite field size on classification and atmospheric correction
NASA Technical Reports Server (NTRS)
Kaufman, Y. J.; Fraser, R. S.
1981-01-01
The atmospheric effect on the upward radiance of sunlight scattered from the Earth-atmosphere system is strongly influenced by the contrasts between fields and their sizes. For a given atmospheric turbidity, the atmospheric effect on classification of surface features is much stronger for nonuniform surfaces than for uniform surfaces. Therefore, the classification accuracy of agricultural fields and urban areas is dependent not only on the optical characteristics of the atmosphere, but also on the size of the surface do not account for the nonuniformity of the surface have only a slight effect on the classification accuracy; in other cases the classification accuracy decreases. The radiances above finite fields were computed to simulate radiances measured by a satellite. A simulation case including 11 agricultural fields and four natural fields (water, soil, savanna, and forest) was used to test the effect of the size of the background reflectance and the optical thickness of the atmosphere on classification accuracy. It is concluded that new atmospheric correction methods, which take into account the finite size of the fields, have to be developed to improve significantly the classification accuracy.
Yang, Xiaoyan; Chen, Longgao; Li, Yingkui; Xi, Wenjia; Chen, Longqian
2015-07-01
Land use/land cover (LULC) inventory provides an important dataset in regional planning and environmental assessment. To efficiently obtain the LULC inventory, we compared the LULC classifications based on single satellite imagery with a rule-based classification based on multi-seasonal imagery in Lianyungang City, a coastal city in China, using CBERS-02 (the second China-Brazil Earth Resources Satellite) images. The overall accuracies of the classification based on single imagery are 78.9, 82.8, and 82.0% in winter, early summer, and autumn, respectively. The rule-based classification improves the accuracy to 87.9% (kappa 0.85), suggesting that combining multi-seasonal images can considerably improve the classification accuracy over any single image-based classification. This method could also be used to analyze seasonal changes of LULC types, especially for those associated with tidal changes in coastal areas. The distribution and inventory of LULC types with an overall accuracy of 87.9% and a spatial resolution of 19.5 m can assist regional planning and environmental assessment efficiently in Lianyungang City. This rule-based classification provides guidance for improving accuracy in coastal areas with distinct LULC temporal spectral features.
Learning optimal embedded cascades.
Saberian, Mohammad Javad; Vasconcelos, Nuno
2012-10-01
The problem of automatic and optimal design of embedded object detector cascades is considered. Two main challenges are identified: optimization of the cascade configuration and optimization of individual cascade stages, so as to achieve the best tradeoff between classification accuracy and speed, under a detection rate constraint. Two novel boosting algorithms are proposed to address these problems. The first, RCBoost, formulates boosting as a constrained optimization problem which is solved with a barrier penalty method. The constraint is the target detection rate, which is met at all iterations of the boosting process. This enables the design of embedded cascades of known configuration without extensive cross validation or heuristics. The second, ECBoost, searches over cascade configurations to achieve the optimal tradeoff between classification risk and speed. The two algorithms are combined into an overall boosting procedure, RCECBoost, which optimizes both the cascade configuration and its stages under a detection rate constraint, in a fully automated manner. Extensive experiments in face, car, pedestrian, and panda detection show that the resulting detectors achieve an accuracy versus speed tradeoff superior to those of previous methods.
A novel application of deep learning for single-lead ECG classification.
Mathews, Sherin M; Kambhamettu, Chandra; Barner, Kenneth E
2018-06-04
Detecting and classifying cardiac arrhythmias is critical to the diagnosis of patients with cardiac abnormalities. In this paper, a novel approach based on deep learning methodology is proposed for the classification of single-lead electrocardiogram (ECG) signals. We demonstrate the application of the Restricted Boltzmann Machine (RBM) and deep belief networks (DBN) for ECG classification following detection of ventricular and supraventricular heartbeats using single-lead ECG. The effectiveness of this proposed algorithm is illustrated using real ECG signals from the widely-used MIT-BIH database. Simulation results demonstrate that with a suitable choice of parameters, RBM and DBN can achieve high average recognition accuracies of ventricular ectopic beats (93.63%) and of supraventricular ectopic beats (95.57%) at a low sampling rate of 114 Hz. Experimental results indicate that classifiers built into this deep learning-based framework achieved state-of-the-art performance at lower sampling rates and with simpler features when compared to traditional methods. Further, features extracted at a sampling rate of 114 Hz, when combined with deep learning, provided enough discriminatory power for the classification task. This performance is comparable to that of traditional methods and uses a much lower sampling rate and simpler features. Thus, our proposed deep neural network algorithm demonstrates that deep learning-based methods offer accurate ECG classification and could potentially be extended to other physiological signal classifications, such as those in arterial blood pressure (ABP), nerve conduction (EMG), and heart rate variability (HRV) studies. Copyright © 2018. Published by Elsevier Ltd.
Correlation-based pattern recognition for implantable defibrillators.
Wilkins, J.
1996-01-01
An estimated 300,000 Americans die each year from cardiac arrhythmias. Historically, drug therapy or surgery were the only treatment options available for patients suffering from arrhythmias. Recently, implantable arrhythmia management devices have been developed. These devices allow abnormal cardiac rhythms to be sensed and corrected in vivo. Proper arrhythmia classification is critical to selecting the appropriate therapeutic intervention. The classification problem is made more challenging by the power/computation constraints imposed by the short battery life of implantable devices. Current devices utilize heart rate-based classification algorithms. Although easy to implement, rate-based approaches have unacceptably high error rates in distinguishing supraventricular tachycardia (SVT) from ventricular tachycardia (VT). Conventional morphology assessment techniques used in ECG analysis often require too much computation to be practical for implantable devices. In this paper, a computationally efficient arrhythmia classification architecture using correlation-based morphology assessment is presented. The architecture classifies individual heartbeats by assessing similarity between an incoming cardiac signal vector and a series of prestored class templates. A series of these beat classifications is used to make an overall rhythm assessment. The system makes use of several new results in the field of pattern recognition. The resulting system achieved excellent accuracy in discriminating SVT and VT. PMID:8947674
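The template-matching idea in this abstract (compare each incoming beat with prestored class templates, then combine beat labels into a rhythm decision) can be sketched as below. The abstract does not specify the similarity measure or the combination rule, so the Pearson correlation coefficient, the majority rule, the function names, and the toy template values are all assumptions made for illustration.

```python
import math
from collections import Counter


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator if denominator else 0.0


def classify_beat(beat, templates):
    """Return the name of the template most correlated with the beat."""
    return max(templates, key=lambda name: correlation(beat, templates[name]))


def classify_rhythm(beats, templates):
    """Label a rhythm by the most common beat label (assumed combination rule)."""
    labels = [classify_beat(beat, templates) for beat in beats]
    return Counter(labels).most_common(1)[0][0]


# Toy templates; real templates would be stored beat morphologies.
templates = {
    "SVT": [0.0, 1.0, 4.0, 1.0, 0.0, -1.0, 0.0],
    "VT": [0.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0],
}
beat = [0.0, 1.1, 3.8, 0.9, 0.1, -0.9, 0.0]
print(classify_beat(beat, templates))
```

Correlation is insensitive to amplitude scaling and offset, which is one reason it suits morphology comparison; each comparison costs only one pass over the samples.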
NASA Technical Reports Server (NTRS)
Cibula, William G.; Nyquist, Maurice O.
1987-01-01
An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classifications useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. In an attempt to refine the level of classification, a geographic information system (GIS) approach was employed. Topographic data and watershed boundaries (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, and these met the project objective. This classification could now become inputs into a GIS system to help provide answers to park management coupled with other ancillary data programs such as fire management.
Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin
2015-01-01
Machine learning algorithms play an important role in computer science research. Recent advances in sensor data collection in the clinical sciences have led to complex, heterogeneous data processing and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, the development of knowledge-based systems to support clinicians in decision-making is important. However, experimental work is needed to compare the performance of different machine learning methods and so help select an appropriate method for the specific characteristics of a data set. This paper compares the classification performance of three popular machine learning methods, i.e., case-based reasoning, neural networks, and support vector machines, for diagnosing stress in vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms the other two methods in terms of classification accuracy. Case-based reasoning achieved 80% and 86% accuracy in classifying stress using finger temperature and heart rate variability, respectively. In contrast, both the neural network and the support vector machine achieved less than 80% accuracy using both physiological signals.
NASA Astrophysics Data System (ADS)
Gao, Yan; Marpu, Prashanth; Morales Manila, Luis M.
2014-11-01
This paper assesses the suitability of 8-band Worldview-2 (WV2) satellite data and an object-based random forest algorithm for the classification of avocado growth stages in Mexico. We tested both pixel-based classification with the minimum distance (MD) and maximum likelihood (MLC) algorithms and object-based classification with the Random Forest (RF) algorithm for this task. Training samples and verification data were selected by visually interpreting the WV2 images for seven thematic classes: fully grown, middle stage, and early stage of avocado crops, bare land, two types of natural forests, and water body. To examine the contribution of the four new spectral bands of the WV2 sensor, all the tested classifications were carried out with and without the four new spectral bands. Classification accuracy assessment results show that object-based classification with the RF algorithm obtained higher overall accuracy (93.06%) than the pixel-based MD (69.37%) and MLC (64.03%) methods. For both pixel-based and object-based methods, the classifications with the four new spectral bands obtained higher overall accuracy than those without (object-based RF: 93.06% vs 83.59%; pixel-based MD: 69.37% vs 67.2%; pixel-based MLC: 64.03% vs 36.05%), suggesting that the four new spectral bands in the WV2 sensor contributed to the increase in classification accuracy.
Automated aural classification used for inter-species discrimination of cetaceans.
Binder, Carolyn M; Hines, Paul C
2014-04-01
Passive acoustic methods are in widespread use to detect and classify cetacean species; however, passive acoustic systems often suffer from large false detection rates resulting from numerous transient sources. To reduce the acoustic analyst workload, automatic recognition methods may be implemented in a two-stage process. First, a general automatic detector is implemented that produces many detections to ensure cetacean presence is noted. Then an automatic classifier is used to significantly reduce the number of false detections and classify the cetacean species. This process requires development of a robust classifier capable of performing inter-species classification. Because human analysts can aurally discriminate species, an automated aural classifier that uses perceptual signal features was tested on a cetacean data set. The classifier successfully discriminated between four species of cetaceans (bowhead, humpback, North Atlantic right, and sperm whales) with 85% accuracy. It also performed well (100% accuracy) for discriminating sperm whale clicks from right whale gunshots. An accuracy of 92% and area under the receiver operating characteristic curve of 0.97 were obtained for the relatively challenging bowhead and humpback recognition case. These results demonstrated that the perceptual features employed by the aural classifier provided powerful discrimination cues for inter-species classification of cetaceans.
Leveraging Long-term Seismic Catalogs for Automated Real-time Event Classification
NASA Astrophysics Data System (ADS)
Linville, L.; Draelos, T.; Pankow, K. L.; Young, C. J.; Alvarez, S.
2017-12-01
We investigate the use of labeled event types available through reviewed seismic catalogs to produce automated event labels on new incoming data from the crustal region spanned by the cataloged events. Using events cataloged by the University of Utah Seismograph Stations between October 2012 and June 2017, we calculate the spectrogram for a time window that spans the duration of each event as seen on individual stations, resulting in 110k event spectrograms (50% local earthquake examples, 50% quarry blast examples). Using 80% of the randomized example events (~90k), a classifier is trained to distinguish between local earthquakes and quarry blasts. We explore variations of deep learning classifiers, incorporating elements of convolutional and recurrent neural networks. Using a single-layer Long Short-Term Memory recurrent neural network, we achieve 92% accuracy on the classification task on the remaining 20k test examples. Leveraging the decisions from a group of stations that detected the same event, by using the median of all classifications in the group, increases the model accuracy to 96%. Additional data with equivalent processing from 500 more recently cataloged events (July 2017) achieve the same accuracy as our test data on both single-station examples and multi-station medians, suggesting that the model can maintain accurate and stable classification rates on real-time automated events local to the University of Utah Seismograph Stations, with potentially minimal levels of re-training through time.
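The multi-station step (taking the median of all single-station classifications for one event) can be sketched as below. The abstract does not state the form of the per-station output, so this sketch assumes each station yields a score in [0, 1] for the quarry blast class; the function name, the 0.5 threshold, and the example scores are hypothetical.

```python
from statistics import median


def event_label(station_scores, threshold=0.5):
    """Combine per-station scores for one event by taking their median.

    station_scores: assumed probabilities (0 to 1) that the event is a quarry
    blast, one per station that detected it. The median is robust to a few
    stations with noisy or misleading spectrograms.
    """
    if not station_scores:
        raise ValueError("need at least one station score")
    return "quarry blast" if median(station_scores) >= threshold else "earthquake"


# Hypothetical scores from three stations that saw the same event.
print(event_label([0.9, 0.2, 0.8]))
```

With three stations, one outlying score cannot change the event label, which is consistent with the accuracy gain the abstract reports for the multi-station median.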
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of the local lung patterns: normal lung and five regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. 
The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with a SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large number of images, as is the case in a multicenter trial with different scanners.
Corn and soybean Landsat MSS classification performance as a function of scene characteristics
NASA Technical Reports Server (NTRS)
Batista, G. T.; Hixson, M. M.; Bauer, M. E.
1982-01-01
In order to fully utilize remote sensing to inventory crop production, it is important to identify the factors that affect the accuracy of Landsat classifications. The objective of this study was to investigate the effect of scene characteristics involving crop, soil, and weather variables on the accuracy of Landsat classifications of corn and soybeans. Segments sampling the U.S. Corn Belt were classified using a Gaussian maximum likelihood classifier on multitemporally registered data from two key acquisition periods. Field size had a strong effect on classification accuracy with small fields tending to have low accuracies even when the effect of mixed pixels was eliminated. Other scene characteristics accounting for variability in classification accuracy included proportions of corn and soybeans, crop diversity index, proportion of all field crops, soil drainage, slope, soil order, long-term average soybean yield, maximum yield, relative position of the segment in the Corn Belt, weather, and crop development stage.
A Classification of Remote Sensing Image Based on Improved Compound Kernels of SVM
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
The accuracy of remote sensing (RS) classification based on the support vector machine (SVM), which is developed from statistical learning theory, is high even with a small number of training samples, which makes SVM methods well suited to RS classification. The traditional RS classification method combines visual interpretation with computer classification. The SVM-based method improves classification accuracy considerably while saving much of the labor and time otherwise spent interpreting images and collecting training samples. Kernel functions play an important part in the SVM algorithm. The proposed method uses an improved compound kernel function and therefore achieves higher classification accuracy on RS images. Moreover, the compound kernel improves the generalization and learning ability of the kernel.
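The abstract does not give the form of its compound kernel. One common construction, shown here purely as an assumption, is a weighted sum of a local kernel (RBF) and a global kernel (polynomial); any non-negative weighted sum of valid kernels is itself a valid kernel. The function names, the weight, and the kernel parameters are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: a local kernel with good learning ability."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: a global kernel with good generalization."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def compound_kernel(x, y, weight=0.7):
    """Convex combination of the RBF and polynomial kernels.

    `weight` must lie in [0, 1] so that both coefficients are non-negative,
    which keeps the combination a valid (positive semi-definite) kernel.
    """
    return weight * rbf_kernel(x, y) + (1.0 - weight) * poly_kernel(x, y)


# Two hypothetical pixel feature vectors (e.g. scaled band values).
pixel_a = [1.0, 0.0]
pixel_b = [0.5, 0.5]
print(compound_kernel(pixel_a, pixel_b))
```

A function like this could be evaluated over all sample pairs to build a precomputed Gram matrix for an SVM solver; the weight would normally be tuned by cross-validation.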
Nationwide forestry applications program. Analysis of forest classification accuracy
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Mead, R. A.; Oderwald, R. G.; Heinen, J. (Principal Investigator)
1981-01-01
The development of LANDSAT classification accuracy assessment techniques, and of a computerized system for assessing wildlife habitat from land cover maps are considered. A literature review on accuracy assessment techniques and an explanation for the techniques development under both projects are included along with listings of the computer programs. The presentations and discussions at the National Working Conference on LANDSAT Classification Accuracy are summarized. Two symposium papers which were published on the results of this project are appended.
Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Jamaluddin; Siringoringo, Rimbun
2017-12-01
Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification issues. The main drawback of FkNN is that its parameters are difficult to determine. These parameters are the number of neighbors (k) and the fuzzy strength (m). Both parameters are very sensitive, and no theory or guide indicates how 'm' and 'k' should be chosen, which makes FkNN difficult to control. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best values of 'k' and 'm'. MPSO is based on the Constriction Factor Method, an improvement of PSO intended to avoid local optima. The model proposed in this study was tested on the German Credit Dataset. The test data are standardized data from the UCI Machine Learning Repository, which are widely applied to classification problems. The application of MPSO to the determination of FkNN parameters is expected to increase classification performance. The experiments indicate that the proposed model achieves better classification performance than the FkNN model alone. The proposed model has an accuracy rate of 81%, while the FkNN model has an accuracy of 70%. Finally, the proposed model is compared with two other classification models, Naive Bayes and Decision Tree. The proposed model performs better: Naive Bayes has an accuracy of 75%, and the decision tree model has 70%.
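To make the roles of 'k' and 'm' concrete, the following is a minimal sketch of a fuzzy k-NN membership computation. It assumes the commonly used inverse-distance weighting with exponent 2/(m-1) and crisp training labels; the abstract does not state which formulation the authors used, and the function name `fuzzy_knn_memberships` is hypothetical.

```python
import math

def fuzzy_knn_memberships(train, query, k=3, m=2.0):
    """Return a dict mapping each class to its fuzzy membership for `query`.

    train: list of (feature_tuple, label) pairs with crisp labels (assumption).
    k: number of neighbours; m: fuzzy strength, must be greater than 1.
    """
    # Keep the k training samples closest to the query (Euclidean distance).
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    exponent = 2.0 / (m - 1.0)
    weights = {}
    total = 0.0
    for features, label in neighbours:
        distance = max(math.dist(features, query), 1e-12)  # avoid division by zero
        weight = 1.0 / distance ** exponent
        weights[label] = weights.get(label, 0.0) + weight
        total += weight
    # Normalise so that the memberships sum to one.
    return {label: weight / total for label, weight in weights.items()}

train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((5.0, 5.0), "b"), ((5.0, 6.0), "b")]
memberships = fuzzy_knn_memberships(train, (0.0, 0.5), k=3, m=2.0)
```

A larger 'm' flattens the weights so that distant neighbours count almost as much as near ones, which is one reason the two parameters interact and are awkward to tune by hand.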
Devos, Olivier; Downey, Gerard; Duponchel, Ludovic
2014-04-01
Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
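As a rough illustration of how two classifiers can be compared with McNemar's test, the sketch below uses the chi-squared approximation with a continuity correction. Whether the authors used this variant or an exact version is not stated in the abstract, so the correction and the function name `mcnemar` are assumptions.

```python
import math

def mcnemar(b, c, continuity=True):
    """McNemar statistic and p-value from the two discordant counts.

    b: test samples classified correctly by model A only.
    c: test samples classified correctly by model B only.
    """
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree
    diff = abs(b - c) - (1 if continuity else 0)
    statistic = max(diff, 0) ** 2 / (b + c)
    # Survival function of a chi-squared variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

statistic, p_value = mcnemar(10, 2)
```

Only the samples on which the two models disagree enter the statistic, so two models with similar overall accuracy can still differ significantly.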
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
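The invariance claim can be checked numerically. The sketch below is a univariate simplification with made-up training values and hypothetical class names: it fits one Gaussian per class, classifies by maximum likelihood, and repeats the procedure after applying x -> a*x + b to all data. The transformation shifts every class log-likelihood by the same constant, so the decisions should not change.

```python
import math

def fit_gaussian(values):
    """Return the maximum-likelihood mean and variance of a list of values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance

def ml_classify(x, params):
    """Pick the class whose Gaussian gives x the highest log-likelihood."""
    def log_likelihood(mean, variance):
        return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)
    return max(params, key=lambda label: log_likelihood(*params[label]))

def predictions(train, test, a=1.0, b=0.0):
    """Train and classify after applying x -> a*x + b to all data (a != 0)."""
    params = {label: fit_gaussian([a * v + b for v in values])
              for label, values in train.items()}
    return [ml_classify(a * x + b, params) for x in test]

# Illustrative values only; they are not taken from the paper.
train = {"corn": [0.28, 0.30, 0.33, 0.27, 0.31],
         "soy": [0.40, 0.47, 0.52, 0.44, 0.38]}
test = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.60]
original = predictions(train, test)
transformed = predictions(train, test, a=-3.7, b=12.0)
```

The same argument carries over to the multivariate case, where the constant shift is the log of the absolute determinant of the transformation matrix.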
Hotz, Christine S; Templeton, Steven J; Christopher, Mary M
2005-03-01
A rule-based expert system using CLIPS programming language was created to classify body cavity effusions as transudates, modified transudates, exudates, chylous, and hemorrhagic effusions. The diagnostic accuracy of the rule-based system was compared with that produced by 2 machine-learning methods: Rosetta, a rough sets algorithm and RIPPER, a rule-induction method. Results of 508 body cavity fluid analyses (canine, feline, equine) obtained from the University of California-Davis Veterinary Medical Teaching Hospital computerized patient database were used to test CLIPS and to test and train RIPPER and Rosetta. The CLIPS system, using 17 rules, achieved an accuracy of 93.5% compared with pathologist consensus diagnoses. Rosetta accurately classified 91% of effusions by using 5,479 rules. RIPPER achieved the greatest accuracy (95.5%) using only 10 rules. When the original rules of the CLIPS application were replaced with those of RIPPER, the accuracy rates were identical. These results suggest that both rule-based expert systems and machine-learning methods hold promise for the preliminary classification of body fluids in the clinical laboratory.
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much effort has been devoted to building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.
Goo, Yeung-Ja James; Chi, Der-Jang; Shen, Zong-De
2016-01-01
The purpose of this study is to establish rigorous and reliable going concern doubt (GCD) prediction models. This study first uses the least absolute shrinkage and selection operator (LASSO) to select variables and then applies data mining techniques to establish prediction models, such as neural network (NN), classification and regression tree (CART), and support vector machine (SVM). The samples of this study include 48 GCD listed companies and 124 NGCD (non-GCD) listed companies from 2002 to 2013 in the TEJ database. We conduct fivefold cross validation in order to identify the prediction accuracy. According to the empirical results, the prediction accuracy of the LASSO-NN model is 88.96 % (Type I error rate is 12.22 %; Type II error rate is 7.50 %), the prediction accuracy of the LASSO-CART model is 88.75 % (Type I error rate is 13.61 %; Type II error rate is 14.17 %), and the prediction accuracy of the LASSO-SVM model is 89.79 % (Type I error rate is 10.00 %; Type II error rate is 15.83 %).
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, the support vector machine (SVM) has shown excellent performance in classification and prediction and is widely used in disease diagnosis and medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, the Taguchi method was combined with the SVM classifier to optimize parameters C and γ and so increase accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for the Dermatology and Zoo databases. PMID:25295306
Classification of urban features using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Ganesh Babu, Bharath
Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) - were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. 
This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance of incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
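The overall accuracy and Kappa values quoted above are both derived from a confusion matrix of reference classes against mapped classes. A minimal sketch follows, using made-up counts rather than the study's data; the function name `overall_accuracy_and_kappa` is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """matrix[i][j] = number of samples of reference class i mapped to class j."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Illustrative two-class confusion matrix (not from the study).
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa discounts the agreement that would occur by chance, which is why it is lower than the overall accuracy in each pair reported in the abstract.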
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of χ2 feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. χ2 mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors offer accuracy improvements only on colorful datasets.
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistic was used as a feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers, as well as user's and producer's accuracy.
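The abstract does not spell out the combination rules, so the following is only a sketch of one plausible rule under stated assumptions: for each pixel, compute the Shannon entropy of each classifier's posterior class probabilities and keep the prediction of the less uncertain classifier. The function names `entropy` and `combine` are hypothetical.

```python
import math

def entropy(probabilities):
    """Shannon entropy (in bits) of a posterior class probability vector."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def combine(rf_probabilities, svm_probabilities):
    """Return (class index, entropy) from the classifier with lower entropy.

    This selection rule is an assumption for illustration, not the paper's rule.
    """
    if entropy(rf_probabilities) <= entropy(svm_probabilities):
        chosen = rf_probabilities
    else:
        chosen = svm_probabilities
    label = max(range(len(chosen)), key=chosen.__getitem__)
    return label, entropy(chosen)

label, uncertainty = combine([0.4, 0.6], [0.9, 0.1])
```

The entropy returned for each pixel can be written to a separate raster, which is one way to obtain the kind of spatial uncertainty information the abstract mentions.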
Disregarding population specificity: its influence on the sex assessment methods from the tibia.
Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav
2017-01-01
Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5 %, sex bias ranged from -1.3 to -5.4 %). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from recent Czech population. To estimate the classification accuracy and error (misclassification) rates ignoring population specificity, we selected published classification functions of tibia for the Portuguese, south European, and the North American populations. These functions were applied on the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7 %), sex bias -29.4-100 %, which is most probably caused by secular trend and the generally high variability of body size. Results indicate that the discriminant functions, developed for skeletal series representing geographically and chronologically diverse populations, are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction from remote sensing data is important for policy and decision makers because the extracted information provides base layers for many real-world applications. Classification of remotely sensed data is one of the most common methods of extracting information; however, it remains challenging because several factors affect classification accuracy. The resolution of the imagery, the number and homogeneity of land cover classes, the purity of the training data, and the characteristics of the adopted classifiers are just some of these factors. Object based image classification has some advantages over pixel based classification for high resolution images since it uses geometry and structure information in addition to spectral information. Vegetation indices are also commonly used in the classification process since they provide additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in the Aegean region of Turkey. Results demonstrated that incorporating NDRE increased the overall classification accuracy from 79.96% to 86.80%, whereas NDVI decreased it from 79.96% to 78.90%. Moreover, object based classification with RapidEye data was shown to give promising results for crop type mapping and analysis.
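Both indices are normalized differences of two band reflectances: NDVI uses the near-infrared and red bands, NDRE the near-infrared and red-edge bands. A minimal per-pixel sketch with illustrative reflectance values follows; the function names and the zero-denominator handling are assumptions.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    denominator = band_a + band_b
    return (band_a - band_b) / denominator if denominator else 0.0

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)

def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index."""
    return normalized_difference(nir, red_edge)

# Illustrative reflectances for a single vegetated pixel.
pixel_ndvi = ndvi(nir=0.5, red=0.1)
pixel_ndre = ndre(nir=0.5, red_edge=0.3)
```

In a classification workflow each index would be computed for every pixel or object and appended to the spectral bands as an extra feature.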
Kruskal-Wallis-based computationally efficient feature selection for face recognition.
Ali Khan, Sajid; Hussain, Ayyaz; Basit, Abdul; Akram, Sheeraz
2014-01-01
Face recognition and its applications have attained great importance in today's technological world. Most existing work uses frontal face images for classification; however, these techniques fail when applied to real-world face images. The proposed technique effectively extracts the prominent facial features. Most of the features are redundant and do not contribute to representing the face. To eliminate those redundant features, a computationally efficient algorithm is used to select the more discriminative face features. Extracted features are then passed to the classification step, in which different classifiers are combined into an ensemble to enhance the recognition accuracy rate, as a single classifier is unable to achieve high accuracy. Experiments are performed on standard face database images and results are compared with existing techniques.
Bolin, Jocelyn Holden; Finch, W Holmes
2014-01-01
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions.
Lewicke, Aaron; Sazonov, Edward; Corwin, Michael J; Neuman, Michael; Schuckers, Stephanie
2008-01-01
Reliability of classification performance is important for many biomedical applications. A classification model which considers reliability in the development of the model such that unreliable segments are rejected would be useful, particularly, in large biomedical data sets. This approach is demonstrated in the development of a technique to reliably determine sleep and wake using only the electrocardiogram (ECG) of infants. Typically, sleep state scoring is a time consuming task in which sleep states are manually derived from many physiological signals. The method was tested with simultaneous 8-h ECG and polysomnogram (PSG) determined sleep scores from 190 infants enrolled in the collaborative home infant monitoring evaluation (CHIME) study. Learning vector quantization (LVQ) neural network, multilayer perceptron (MLP) neural network, and support vector machines (SVMs) are tested as the classifiers. After systematic rejection of difficult to classify segments, the models can achieve 85%-87% correct classification while rejecting only 30% of the data. This corresponds to a Kappa statistic of 0.65-0.68. With rejection, accuracy improves by about 8% over a model without rejection. Additionally, the impact of the PSG scored indeterminate state epochs is analyzed. The advantages of a reliable sleep/wake classifier based only on ECG include high accuracy, simplicity of use, and low intrusiveness. Reliability of the classification can be built directly in the model, such that unreliable segments are rejected.
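The trade-off between accuracy and the share of rejected data can be illustrated with a simple confidence threshold. The sketch below assumes each segment comes with posterior class probabilities and rejects it when the highest probability falls below a threshold; the paper's actual rejection criterion is not given in the abstract, and the function name and threshold are hypothetical.

```python
def evaluate_with_rejection(probabilities, labels, threshold=0.7):
    """Return (accuracy on accepted segments, fraction of segments rejected)."""
    accepted = 0
    correct = 0
    for probs, label in zip(probabilities, labels):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] < threshold:
            continue  # segment rejected as unreliable
        accepted += 1
        correct += int(best == label)
    rejection_rate = 1 - accepted / len(labels)
    accuracy = correct / accepted if accepted else float("nan")
    return accuracy, rejection_rate

# Illustrative posteriors for four segments (0 = wake, 1 = sleep).
probabilities = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
labels = [0, 1, 1, 0]
accuracy, rejection_rate = evaluate_with_rejection(probabilities, labels)
```

Raising the threshold generally raises accuracy on the accepted segments while rejecting more data, so the two numbers have to be reported together.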
Chai, Rifai; Naik, Ganesh R; Nguyen, Tuan Nghia; Ling, Sai Ho; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-05-01
This paper presents a two-class electroencephalography-based classification of driver fatigue (fatigue state versus alert state) from 43 healthy participants. The system uses independent component analysis by entropy rate bound minimization (ERBM-ICA) for source separation, autoregressive (AR) modeling for feature extraction, and a Bayesian neural network as the classification algorithm. The classification results demonstrate a sensitivity of 89.7%, a specificity of 86.8%, and an accuracy of 88.2%. The combination of ERBM-ICA (source separator), AR (feature extractor), and Bayesian neural network (classifier) provides the best outcome with a p-value < 0.05 with the highest value of area under the receiver operating curve (AUC-ROC = 0.93) against other methods such as power spectral density as feature extractor (AUC-ROC = 0.81). The results of this study suggest the method could be utilized effectively for a countermeasure device for driver fatigue identification and other adverse event applications.
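Sensitivity, specificity and accuracy all follow from the four counts of a binary confusion matrix. The sketch below treats the fatigue state as the positive class, which is an assumption, and uses illustrative counts rather than the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: positive (fatigue) cases detected / missed.
    tn/fp: negative (alert) cases correctly passed / falsely flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

sensitivity, specificity, accuracy = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

Accuracy is a weighted average of sensitivity and specificity, with weights given by the share of positive and negative cases, so it always lies between the two.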
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
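For orientation, the simplest form of a minimum distance classifier assigns a measurement vector to the class with the nearest mean. The sketch below uses Euclidean distance and made-up two-band values with hypothetical class names; the report also considers distances between whole samples and other distance measures, which this sketch does not cover.

```python
import math

def class_means(samples):
    """samples: dict mapping label -> list of feature tuples. Returns label -> mean."""
    return {label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
            for label, vectors in samples.items()}

def classify(x, means):
    """Assign x to the class whose mean vector is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(x, means[label]))

# Illustrative two-band training data (not from the report).
training = {"wheat": [(0.20, 0.60), (0.25, 0.55)],
            "soil": [(0.40, 0.30), (0.45, 0.35)]}
means = class_means(training)
```

A call such as `classify((0.22, 0.58), means)` then returns the label of the nearest class mean.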
Shen, Jing; Hu, FangKe; Zhang, LiHai; Tang, PeiFu; Bi, ZhengGang
2013-04-01
The accuracy of intertrochanteric fracture classification is important; indeed, patient outcomes depend on the classification. The aim of this study was to use the AO classification system to evaluate the variation in classification between X-ray and computed tomography (CT)/3D CT images. Then, differences in the length of surgery were evaluated based on the two examinations. Intertrochanteric fractures were reviewed and surgeons were interviewed. The rates of correct discrimination and the probabilities of misclassification (overestimates and underestimates) were determined. The impact of misclassification on length of surgery was also evaluated. In total, 370 patients and four surgeons were included in the study. All patients had X-ray images and 210 patients had CT/3D CT images. Of them, 214 and 156 patients were treated by intramedullary and extramedullary fixation systems, respectively. The mean length of surgery was 62.1 ± 17.7 min. The overall rate of correct discrimination was 83.8 %, and the rates for classes A1, A2 and A3 were 80.0, 85.7 and 82.4 %, respectively. The rate of misclassification showed no significant difference between stable and unstable fractures (21.3 vs 13.1 %, P = 0.173). The overall rates of overestimates and underestimates were significantly different (5 vs 11.25 %, P = 0.041). Subtracting the rate of overestimates from underestimates had a positive correlation with prolonged surgery and showed a significant difference with intramedullary fixation (P < 0.001). Classification based on the AO system was good in terms of consistency. CT/3D CT examination was more reliable and more helpful for preoperative assessment, especially for performance of an intramedullary fixation.
Pianta, R C; Longmaid, K; Ferguson, J E
1999-06-01
Investigated an attachment-based theoretical framework and classification system, introduced by Kaplan and Main (1986), for interpreting children's family drawings. This study concentrated on the psychometric properties of the system and the relation between drawings classified using this system and teacher ratings of classroom social-emotional and behavioral functioning, controlling for child age, ethnic status, intelligence, and fine motor skills. This nonclinical sample consisted of 200 kindergarten children of diverse racial and socioeconomic status (SES). Limited support for reliability of this classification system was obtained. Kappas for overall classifications of drawings (e.g., secure) exceeded .80 and mean kappa for discrete drawing features (e.g., figures with smiles) was .82. Coders' endorsement of the presence of certain discrete drawing features predicted their overall classification at 82.5% accuracy. Drawing classification was related to teacher ratings of classroom functioning independent of child age, sex, race, SES, intelligence, and fine motor skills (with p values for the multivariate effects ranging from .043-.001). Results are discussed in terms of the psychometric properties of this system for classifying children's representations of family and the limitations of family drawing techniques for young children.
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
Accuracy of automated classification of major depressive disorder as a function of symptom severity.
Ramasubbu, Rajamannar; Brown, Matthew R G; Cortese, Filmeno; Gaxiola, Ismael; Goodyear, Bradley; Greenshaw, Andrew J; Dursun, Serdar M; Greiner, Russell
2016-01-01
Growing evidence documents the potential of machine learning for developing brain based diagnostic methods for major depressive disorder (MDD). As symptom severity may influence brain activity, we investigated whether the severity of MDD affected the accuracies of machine learned MDD-vs-Control diagnostic classifiers. Forty-five medication-free patients with DSM-IV defined MDD and 19 healthy controls participated in the study. Based on depression severity as determined by the Hamilton Rating Scale for Depression (HRSD), MDD patients were sorted into three groups: mild to moderate depression (HRSD 14-19), severe depression (HRSD 20-23), and very severe depression (HRSD ≥ 24). We collected functional magnetic resonance imaging (fMRI) data during both resting-state and an emotional-face matching task. Patients in each of the three severity groups were compared against controls in separate analyses, using either the resting-state or task-based fMRI data. We use each of these six datasets with linear support vector machine (SVM) binary classifiers for identifying individuals as patients or controls. The resting-state fMRI data showed statistically significant classification accuracy only for the very severe depression group (accuracy 66%, p = 0.012 corrected), while mild to moderate (accuracy 58%, p = 1.0 corrected) and severe depression (accuracy 52%, p = 1.0 corrected) were only at chance. With task-based fMRI data, the automated classifier performed at chance in all three severity groups. Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.
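The severity grouping described above can be written as a small lookup. This is only a sketch of the HRSD cut-offs quoted in the abstract (14-19, 20-23, and 24 or higher); the function name is hypothetical, and returning `None` for scores below 14 is an assumption, since the abstract does not say how such scores were handled.

```python
def hrsd_severity_group(score):
    """Map an HRSD score to the severity groups quoted in the abstract."""
    if score >= 24:
        return "very severe"
    if score >= 20:
        return "severe"
    if score >= 14:
        return "mild to moderate"
    return None  # below the quoted ranges; handling here is an assumption

groups = [hrsd_severity_group(s) for s in (14, 19, 20, 23, 24, 30)]
```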
Dai, Shengfa; Wei, Qingguo
2017-01-01
The common spatial pattern algorithm is widely used to estimate spatial filters in motor imagery based brain-computer interfaces. However, using a large number of channels makes common spatial pattern prone to over-fitting and makes the classification of electroencephalographic signals time-consuming. To overcome these problems, it is necessary to choose an optimal subset of all channels to save computational time and improve the classification accuracy. In this paper, a novel method named backtracking search optimization algorithm is proposed to automatically select the optimal channel set for common spatial pattern. Each individual in the population is an N-dimensional vector, with each component representing one channel. A population of binary codes is generated randomly at the beginning, and then channels are selected according to the evolution of these codes. The number and positions of 1's in the code denote the number and positions of chosen channels. The objective function of the backtracking search optimization algorithm is defined as the combination of classification error rate and relative number of channels. Experimental results suggest that higher classification accuracy can be achieved with far fewer channels than standard common spatial pattern using all channels.
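The binary channel code and its objective can be sketched as follows. The abstract says only that the objective combines classification error rate and relative number of channels; the weighted sum and the weight `alpha` below are assumptions, and both function names are hypothetical.

```python
def selected_channels(mask):
    """Return the indices of channels whose bit is 1 in the binary code."""
    return [i for i, bit in enumerate(mask) if bit == 1]

def channel_fitness(mask, error_rate, alpha=0.9):
    """Lower is better: weighted sum of error rate and fraction of channels used.

    The weighted-sum form and the default alpha are assumptions for illustration.
    """
    if not any(mask):
        raise ValueError("at least one channel must be selected")
    relative_channels = sum(mask) / len(mask)
    return alpha * error_rate + (1 - alpha) * relative_channels

# Hypothetical 4-channel code selecting channels 0 and 2.
mask = [1, 0, 1, 0]
fitness = channel_fitness(mask, error_rate=0.2, alpha=0.5)
```

In an evolutionary search, the error rate passed in would come from training and validating the classifier on the selected channels only.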
Agile convolutional neural network for pulmonary nodule classification using CT images.
Zhao, Xinzhuo; Liu, Liyao; Qi, Shouliang; Teng, Yueyang; Li, Jianhua; Qian, Wei
2018-04-01
To distinguish benign from malignant pulmonary nodules using CT images is critical for their precise diagnosis and treatment. A new Agile convolutional neural network (CNN) framework is proposed to conquer the challenges of a small-scale medical image database and the small size of the nodules, and it improves the performance of pulmonary nodule classification using CT images. A hybrid CNN of LeNet and AlexNet is constructed through combining the layer settings of LeNet and the parameter settings of AlexNet. A dataset with 743 CT image nodule samples is built up based on the 1018 CT scans of LIDC to train and evaluate the Agile CNN model. Through adjusting the parameters of the kernel size, learning rate, and other factors, the effect of these parameters on the performance of the CNN model is investigated, and an optimized setting of the CNN is obtained finally. After finely optimizing the settings of the CNN, the estimation accuracy and the area under the curve can reach 0.822 and 0.877, respectively. The accuracy of the CNN is significantly dependent on the kernel size, learning rate, training batch size, dropout, and weight initializations. The best performance is achieved when the kernel size is set to [Formula: see text], the learning rate is 0.005, the batch size is 32, and dropout and Gaussian initialization are used. This competitive performance demonstrates that our proposed CNN framework and the optimization strategy of the CNN parameters are suitable for pulmonary nodule classification characterized by small medical datasets and small targets. The classification model might help diagnose and treat pulmonary nodules effectively.
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, were compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was in the average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not outperform significantly the conventional feature ranking methods. 
Some conventional methods performed slightly better than others, although the choice of a suitable technique depends on the computational complexity and accuracy requirements of the user.
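For readers unfamiliar with rank aggregation, a Borda count can be sketched as below. This is the textbook form of the method, not necessarily the exact variant evaluated in the study; it assumes every ranking lists the same features, and ties are broken alphabetically as an arbitrary choice.

```python
def borda_aggregate(rankings):
    """Combine several best-first feature rankings with a Borda count.

    A feature at position p in a ranking of n features earns n - 1 - p points.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - 1 - position)
    # Highest total first; ties broken by name (an arbitrary assumption).
    return sorted(scores, key=lambda f: (-scores[f], f))

# Hypothetical rankings from three feature selection methods.
combined = borda_aggregate([["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]])
```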
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension
ERIC Educational Resources Information Center
Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.
2017-01-01
This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…
NASA Astrophysics Data System (ADS)
Bangs, Corey F.; Kruse, Fred A.; Olsen, Chris R.
2013-05-01
Hyperspectral data were assessed to determine the effect of integrating spectral data and extracted texture feature data on classification accuracy. Four separate spectral ranges (hundreds of spectral bands total) were used from the Visible and Near Infrared (VNIR) and Shortwave Infrared (SWIR) portions of the electromagnetic spectrum. Haralick texture features (contrast, entropy, and correlation) were extracted from the average gray-level image for each of the four spectral ranges studied. A maximum likelihood classifier was trained using a set of ground truth regions of interest (ROIs) and applied separately to the spectral data, texture data, and a fused dataset containing both. Classification accuracy was measured by comparison of results to a separate verification set of test ROIs. Analysis indicates that the spectral range (source of the gray-level image) used to extract the texture feature data has a significant effect on the classification accuracy. This result applies to texture-only classifications as well as the classification of integrated spectral data and texture feature data sets. Overall classification improvement for the integrated data sets was near 1%. Individual improvement for integrated spectral and texture classification of the "Urban" class showed approximately 9% accuracy increase over spectral-only classification. Texture-only classification accuracy was highest for the "Dirt Path" class at approximately 92% for the spectral range from 947 to 1343nm. This research demonstrates the effectiveness of texture feature data for more accurate analysis of hyperspectral data and the importance of selecting the correct spectral range to be used for the gray-level image source to extract these features.
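The three Haralick features named above can be computed from a gray-level co-occurrence matrix (GLCM). The sketch below uses standard textbook definitions with a one-pixel horizontal offset and base-2 entropy; the offset, the log base, and the function names are assumptions, since the abstract does not give the settings used.

```python
import math

def glcm(image, levels):
    """Normalized co-occurrence matrix for horizontally adjacent pixel pairs."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            total += 1
    return [[c / total for c in r] for r in counts]

def haralick_features(p):
    """Return (contrast, entropy, correlation) for a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    # Correlation is undefined for a constant image; 0.0 is returned by convention here.
    correlation = cov / math.sqrt(var_i * var_j) if var_i > 0 and var_j > 0 else 0.0
    return contrast, entropy, correlation

# Hypothetical 2-level image with a vertical edge.
contrast, entropy, correlation = haralick_features(glcm([[0, 0, 1, 1], [0, 0, 1, 1]], 2))
```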
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction is often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
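The green vegetation filter mentioned above relies on the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red). The sketch below shows the index and a simple per-pixel mask; the threshold value and the function names are hypothetical, as the abstract does not report the cut-off used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    if denom == 0:
        return 0.0  # avoid division by zero for dark pixels
    return (nir - red) / denom

def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Flag pixels whose NDVI exceeds a threshold (threshold is an assumption)."""
    return [ndvi(n, r) > threshold for n, r in zip(nir_band, red_band)]

# Hypothetical reflectances for two pixels.
mask = vegetation_mask([0.5, 0.2], [0.1, 0.2])
```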
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification methods. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how the scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10 % for both trees. The average speed-up ratio of single-scale classifiers over the multi-scale classifier for each tree is higher than 30.
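Balanced accuracy, the metric quoted above, is the mean of the per-class recalls, which keeps the usually more numerous leaf points from dominating the score. A minimal sketch with hypothetical labels:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical point labels: 4 leaf points and 2 wood points.
y_true = ["leaf", "leaf", "leaf", "leaf", "wood", "wood"]
y_pred = ["leaf", "leaf", "leaf", "wood", "wood", "leaf"]
score = balanced_accuracy(y_true, y_pred)
```

Here leaf recall is 3/4 and wood recall is 1/2, so the balanced accuracy is 0.625, whereas plain accuracy would be 4/6.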
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. 
Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
NASA Astrophysics Data System (ADS)
Tu, Shu-Ju; Wang, Chih-Wei; Pan, Kuang-Tse; Wu, Yi-Cheng; Wu, Chen-Te
2018-03-01
Lung cancer screening aims to detect small pulmonary nodules and decrease the mortality rate of those affected. However, studies from large-scale clinical trials of lung cancer screening have shown that the false-positive rate is high and positive predictive value is low. To address these problems, a technical approach is greatly needed for accurate malignancy differentiation among these early-detected nodules. We studied the clinical feasibility of an additional protocol of localized thin-section CT for further assessment on recalled patients from lung cancer screening tests. Our approach of localized thin-section CT was integrated with radiomics features extraction and machine learning classification which was supervised by pathological diagnosis. Localized thin-section CT images of 122 nodules were retrospectively reviewed and 374 radiomics features were extracted. In this study, 48 nodules were benign and 74 malignant. There were nine patients with multiple nodules and four with synchronous multiple malignant nodules. Different machine learning classifiers with a stratified ten-fold cross-validation were used and repeated 100 times to evaluate classification accuracy. Of the image features extracted from the thin-section CT images, 238 (64%) were useful in differentiating between benign and malignant nodules. These useful features include CT density (p = 0.002 518), sigma (p = 0.002 781), uniformity (p = 0.032 41), and entropy (p = 0.006 685). The highest classification accuracy was 79% by the logistic classifier. The performance metrics of this logistic classification model was 0.80 for the positive predictive value, 0.36 for the false-positive rate, and 0.80 for the area under the receiver operating characteristic curve. 
Our approach of direct risk classification supervised by the pathological diagnosis with localized thin-section CT and radiomics feature extraction may support clinical physicians in determining truly malignant nodules and therefore reduce problems in lung cancer screening.
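The positive predictive value and false-positive rate reported above follow directly from confusion-matrix counts. The sketch below uses hypothetical counts, not the study's data, and treats malignant as the positive class.

```python
def positive_predictive_value(tp, fp):
    """Fraction of nodules predicted malignant that are truly malignant."""
    return tp / (tp + fp)

def false_positive_rate(fp, tn):
    """Fraction of benign nodules that are wrongly predicted malignant."""
    return fp / (fp + tn)

# Hypothetical counts for illustration only.
tp, fp, tn, fn = 8, 2, 6, 4
ppv = positive_predictive_value(tp, fp)
fpr = false_positive_rate(fp, tn)
```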
Real-Time Fault Classification for Plasma Processes
Yang, Ryan; Chen, Rongshun
2011-01-01
Plasma process tools, which usually cost several millions of US dollars, are often used in the semiconductor fabrication etching process. If the plasma process is halted due to some process fault, the productivity will be reduced and the cost will increase. In order to maximize the product/wafer yield and tool productivity, timely and effective fault process detection is required in a plasma reactor. The classification of fault events can help the users to quickly identify fault processes, and thus can save downtime of the plasma tool. In this work, optical emission spectroscopy (OES) is employed as the metrology sensor for in-situ process monitoring. Split into twelve match rates by spectrum band, the matching rate indicator from our previous work (Yang, R.; Chen, R.S. Sensors 2010, 10, 5703–5723) is used to detect the fault process. Based on the match data, a real-time classification of plasma faults is achieved by a novel method developed in this study. Experiments were conducted to validate the novel fault classification. From the experimental results, we may conclude that the proposed method is feasible inasmuch as the overall accuracy rate of the classification for fault event shifts is 27 out of 28, or about 96.4%. PMID:22164001
Improving crop classification through attention to the timing of airborne radar acquisitions
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Protz, R.
1984-01-01
Radar remote sensors may provide valuable input to crop classification procedures because of (1) their independence of weather conditions and solar illumination, and (2) their ability to respond to differences in crop type. Manual classification of multidate synthetic aperture radar (SAR) imagery resulted in an overall accuracy of 83 percent for corn, forest, grain, and 'other' cover types. Forests and corn fields were identified with accuracies approaching or exceeding 90 percent. Grain fields and 'other' fields were often confused with each other, resulting in classification accuracies of 51 and 66 percent, respectively. The 83 percent correct classification represents a 10 percent improvement when compared to similar SAR data for the same area collected at alternate time periods in 1978. These results demonstrate that improvements in crop classification accuracy can be achieved with SAR data by synchronizing data collection times with crop growth stages in order to maximize differences in the geometric and dielectric properties of the cover types of interest.
IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY
Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
Currently, most thematic accuracy assessments of classified remotely sensed images only account for errors between the various classes employed, at particular pixels of interest, thu...
Zmiri, Dror; Shahar, Yuval; Taieb-Maimon, Meirav
2012-04-01
The aim was to test the feasibility of classifying emergency department patients into severity grades using data mining methods. Emergency department records of 402 patients were classified into five severity grades by two expert physicians. The Naïve Bayes and C4.5 algorithms were applied to produce classifiers mapping patient data to severity grades. The classifiers' results over several subsets of the data were compared with the physicians' assessments, with a random classifier, and with a classifier that selects the maximal-prevalence class. Outcome measures were positive predictive value, multiple-class extensions of sensitivity and specificity combinations, and entropy change. The mean accuracy of the data mining classifiers was 52.94 ± 5.89%, significantly better (P < 0.05) than the mean accuracy of a random classifier (34.60 ± 2.40%). The entropy of the input data sets was reduced through classification by a mean of 10.1%. Allowing for classification deviations of one severity grade led to mean accuracy of 85.42 ± 1.42%. The classifiers' accuracy in that case was similar to the physicians' consensus rate. Learning from consensus records led to better performance. Reducing the number of severity grades improved results in certain cases. The performance of the Naïve Bayes and C4.5 algorithms was similar; in unbalanced data sets, Naïve Bayes performed better. It is possible to produce a computerized classification model for the severity grade of triage patients, using data mining methods. Learning from patient records on which several physicians agree is preferable to learning from each physician's patients. Either Naïve Bayes or C4.5 can be used; Naïve Bayes is preferable for unbalanced data sets. An ambiguity in the intermediate severity grades seems to hamper both the physicians' agreement and the classifiers' accuracy. © 2010 Blackwell Publishing Ltd.
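The gap between exact accuracy and accuracy "allowing for classification deviations of one severity grade" can be made concrete with a tolerance parameter. This is a minimal sketch with hypothetical grades on a 1 to 5 scale, not the study's evaluation code.

```python
def accuracy_within(y_true, y_pred, tolerance=0):
    """Fraction of predictions within `tolerance` grades of the reference grade."""
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Hypothetical physician grades and classifier outputs.
reference = [1, 2, 3, 4, 5]
predicted = [1, 3, 5, 4, 4]
exact = accuracy_within(reference, predicted)
relaxed = accuracy_within(reference, predicted, tolerance=1)
```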
Strategic Interviewing to Detect Deception: Cues to Deception across Repeated Interviews
Masip, Jaume; Blandón-Gitlin, Iris; Martínez, Carmen; Herrero, Carmen; Ibabe, Izaskun
2016-01-01
Previous deception research on repeated interviews found that liars are not less consistent than truth tellers, presumably because liars use a “repeat strategy” to be consistent across interviews. The goal of this study was to design an interview procedure to overcome this strategy. Innocent participants (truth tellers) and guilty participants (liars) had to convince an interviewer that they had performed several innocent activities rather than committing a mock crime. The interview focused on the innocent activities (alibi), contained specific central and peripheral questions, and was repeated after 1 week without forewarning. Cognitive load was increased by asking participants to reply quickly. The liars’ answers in replying to both central and peripheral questions were significantly less accurate, less consistent, and more evasive than the truth tellers’ answers. Logistic regression analyses yielded classification rates ranging from around 70% (with consistency as the predictor variable), 85% (with evasive answers as the predictor variable), to over 90% (with an improved measure of consistency that incorporated evasive answers as the predictor variable, as well as with response accuracy as the predictor variable). These classification rates were higher than the interviewers’ accuracy rate (54%). PMID:27847493
An analysis of USSPACECOM's space surveillance network sensor tasking methodology
NASA Astrophysics Data System (ADS)
Berger, Jeff M.; Moles, Joseph B.; Wilsey, David G.
1992-12-01
This study provides the basis for the development of a cost/benefit assessment model to determine the effects of alterations to the Space Surveillance Network (SSN) on orbital element (OE) set accuracy. It provides a review of current methods used by NORAD and the SSN to gather and process observations, an alternative to the current Gabbard classification method, and the development of a model to determine the effects of observation rate and correction interval on OE set accuracy. The proposed classification scheme is based on satellite J2 perturbations. Specifically, classes were established based on mean motion, eccentricity, and inclination since J2 perturbation effects are functions of only these elements. Model development began by creating representative sensor observations using a highly accurate orbital propagation model. These observations were compared to predicted observations generated using the NORAD Simplified General Perturbation (SGP4) model and differentially corrected using a Bayes, sequential estimation, algorithm. A 10-run Monte Carlo analysis was performed using this model on 12 satellites using 16 different observation rate/correction interval combinations. An ANOVA and confidence interval analysis of the results show that this model does demonstrate the differences in steady state position error based on varying observation rate and correction interval.
Hussain, Shaista; Basu, Arindam
2016-01-01
The development of power-efficient neuromorphic devices presents the challenge of designing spike pattern classification algorithms which can be implemented on low-precision hardware and can also achieve state-of-the-art performance. In our pursuit of meeting this challenge, we present a pattern classification model which uses a sparse connection matrix and exploits the mechanism of nonlinear dendritic processing to achieve high classification accuracy. A rate-based structural learning rule for multiclass classification is proposed which modifies a connectivity matrix of binary synaptic connections by choosing the best “k” out of “d” inputs to make connections on every dendritic branch (k << d). Because learning only modifies connectivity, the model is well suited for implementation in neuromorphic systems using address-event representation (AER). We develop an ensemble method which combines several dendritic classifiers to achieve enhanced generalization over individual classifiers. We have two major findings: (1) Our results demonstrate that an ensemble created with classifiers comprising a moderate number of dendrites performs better than both ensembles of perceptrons and of complex dendritic trees. (2) In order to determine the moderate number of dendrites required for a specific classification problem, a two-step solution is proposed. First, an adaptive approach is proposed which scales the relative size of the dendritic trees of neurons for each class. It works by progressively adding dendrites with a fixed number of synapses to the network, thereby allocating synaptic resources as per the complexity of the given problem. As a second step, theoretical capacity calculations are used to convert each neuronal dendritic tree to its optimal topology where dendrites of each class are assigned a different number of synapses.
The performance of the model is evaluated on classification of handwritten digits from the benchmark MNIST dataset and compared with other spike classifiers. We show that our system can achieve classification accuracy within 1-2% of other reported spike-based classifiers while using far fewer synaptic resources (only 7%) than other methods. Further, an ensemble classifier created with adaptively learned sizes can attain an accuracy of 96.4%, which is on par with the best reported performance of spike-based classifiers. Moreover, the proposed method achieves this by using about 20% of the synapses used by other spike algorithms. We also present results of applying our algorithm to classify the MNIST-DVS dataset collected from a real spike-based image sensor and show results comparable to the best reported ones (88.1% accuracy). For VLSI implementations, we show that the reduced synaptic memory can save up to 4X area compared to conventional crossbar topologies. Finally, we also present a biologically realistic spike-based version for calculating the correlations required by the structural learning rule and demonstrate the correspondence between the rate-based and spike-based methods of learning. PMID:27065782
Tahmasian, Masoud; Jamalabadi, Hamidreza; Abedini, Mina; Ghadami, Mohammad R; Sepehry, Amir A; Knight, David C; Khazaie, Habibolah
2017-05-22
Sleep disturbance is common in chronic post-traumatic stress disorder (PTSD). However, prior work has demonstrated that there are inconsistencies between subjective and objective assessments of sleep disturbance in PTSD. Therefore, we investigated whether subjective or objective sleep assessment has greater clinical utility to differentiate PTSD patients from healthy subjects. Further, we evaluated whether the combination of subjective and objective methods improves the accuracy of classification into patient versus healthy groups, which has important diagnostic implications. We recruited 32 chronic war-induced PTSD patients and 32 age- and gender-matched healthy subjects to participate in this study. Subjective (i.e. from three self-reported sleep questionnaires) and objective sleep-related data (i.e. from actigraphy scores) were collected from each participant. Subjective, objective, and combined (subjective and objective) sleep data were then analyzed using support vector machine classification. The classification accuracy, sensitivity, and specificity for subjective variables were 89.2%, 89.3%, and 89%, respectively. The classification accuracy, sensitivity, and specificity for objective variables were 65%, 62.3%, and 67.8%, respectively. The classification accuracy, sensitivity, and specificity for the aggregate variables (combination of subjective and objective variables) were 91.6%, 93.0%, and 90.3%, respectively. Our findings indicate that classification accuracy using subjective measurements is superior to objective measurements and the combination of both assessments appears to improve the classification accuracy for differentiating PTSD patients from healthy individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
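The accuracy, sensitivity, and specificity figures reported above follow from the counts of a two-class confusion matrix. A minimal sketch of those definitions is given here; the function name `binary_metrics` and the toy labels are illustrative assumptions, not the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for a two-class problem.

    Sensitivity is the fraction of true positives (e.g. patients) detected;
    specificity is the fraction of true negatives (e.g. healthy subjects)
    correctly rejected. Both classes must be present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Toy labels (1 = patient, 0 = healthy); purely illustrative.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
print(binary_metrics(truth, predicted))
```

With balanced groups, as in a 32 versus 32 design, accuracy is the average of sensitivity and specificity when every subject is scored exactly once.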
Li, Ke; Liu, Yi; Wang, Quanxin; Wu, Yalei; Song, Shimin; Sun, Yi; Liu, Tengchong; Wang, Jun; Li, Yang; Du, Shaoyi
2015-01-01
This paper proposes a novel multi-label classification method for spacecraft electrical characteristics problems, which involve processing large amounts of unlabeled test data, high-dimensional features, long computing times, and a slow identification rate. First, fuzzy c-means (FCM) offline clustering and principal component feature extraction are applied for feature selection. Second, the approximate weighted proximal support vector machine (WPSVM) online classification algorithm is used to reduce the feature dimension and further improve the recognition rate for spacecraft electrical characteristics. Finally, a threshold-based data capture contribution method is proposed to guarantee the validity and consistency of the data selection. The experimental results indicate that the proposed method obtains better data features of the spacecraft electrical characteristics, improves identification accuracy, and shortens computing time effectively. PMID:26544549
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; ...
2017-04-03
Here, the feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang
Here, the feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species.
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J.; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species. PMID:28369059
Kwon, Yea-Hoon; Shin, Sae-Byuk; Kim, Shin-Dug
2018-04-30
The purpose of this study is to improve human emotional classification accuracy using a convolutional neural network (CNN) model and to suggest an overall method to classify emotion based on multimodal data. We improved classification performance by combining electroencephalogram (EEG) and galvanic skin response (GSR) signals. GSR signals are preprocessed using the zero-crossing rate. Sufficient EEG feature extraction can be obtained through CNN. Therefore, we propose a suitable CNN model for feature extraction by tuning hyperparameters in convolution filters. The EEG signal is preprocessed prior to convolution by a wavelet transform while considering time and frequency simultaneously. We use the open Database for Emotion Analysis using Physiological Signals (DEAP) dataset to verify the proposed process, achieving 73.4% accuracy, showing significant performance improvement over the current best practice models.
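The zero-crossing rate mentioned for the GSR signals can be sketched as follows. The abstract does not say exactly how the rate is applied, so this only shows the usual definition; the mean-centering step and both function names are assumptions made here because a raw skin-conductance trace is typically all positive.

```python
def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)

def centered_zero_crossing_rate(signal):
    """Zero-crossing rate after subtracting the mean (assumed preprocessing)."""
    mean = sum(signal) / len(signal)
    return zero_crossing_rate([x - mean for x in signal])

# Hypothetical short trace; values are illustrative only.
trace = [2.1, 2.4, 2.0, 1.8, 2.3, 2.6, 2.2]
print(centered_zero_crossing_rate(trace))
```

In practice the rate would be computed per window so that it yields one feature per time segment.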
Classifying bent radio galaxies from a mixture of point-like/extended images with Machine Learning.
NASA Astrophysics Data System (ADS)
Bastien, David; Oozeer, Nadeem; Somanah, Radhakrishna
2017-05-01
The hypothesis that bent radio sources are supposed to be found in rich, massive galaxy clusters and the availability of huge amounts of data from radio surveys have fueled our motivation to use Machine Learning (ML) to identify bent radio sources and as such use them as tracers for galaxy clusters. The shapelet analysis allowed us to decompose radio images into 256 features that could be fed into the ML algorithm. Additionally, ideas from the field of neuro-psychology helped us to consider training the machine to identify bent galaxies at different orientations. From our analysis, we found that the Random Forest algorithm was the most effective with an accuracy rate of 92% for a classification of point and extended sources as well as an accuracy of 80% for bent and unbent classification.
Postcraniometric sex and ancestry estimation in South Africa: a validation study.
Liebenberg, Leandi; Krüger, Gabriele C; L'Abbé, Ericka N; Stull, Kyra E
2018-05-24
With the acceptance of the Daubert criteria as the standards for best practice in forensic anthropological research, more emphasis is being placed on the validation of published methods. Methods, both traditional and novel, need to be validated, adjusted, and refined for optimal performance within forensic anthropological analyses. Recently, a custom postcranial database of modern South Africans was created for use in Fordisc 3.1. Classification accuracies of up to 85% for ancestry estimation and 98% for sex estimation were achieved using a multivariate approach. To measure the external validity and report more realistic performance statistics, an independent sample was tested. The postcrania from 180 black, white, and colored South Africans were measured and classified using the custom postcranial database. A decrease in accuracy was observed for both ancestry estimation (79%) and sex estimation (95%) of the validation sample. When incorporating both sex and ancestry simultaneously, the method achieved 70% accuracy, and 79% accuracy when sex-specific ancestry analyses were run. Classification matrices revealed that postcrania were more likely to misclassify as a result of ancestry rather than sex. While both sex and ancestry influence the size of an individual, sex differences are more marked in the postcranial skeleton and are therefore easier to identify. The external validity of the postcranial database was verified and therefore shown to be a useful tool for forensic casework in South Africa. While the classification rates were slightly lower than the original method, this is expected when a method is generalized.
Global Optimization Ensemble Model for Classification Methods
Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab
2014-01-01
Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382
Porras-Alfaro, Andrea; Liu, Kuan-Liang; Kuske, Cheryl R; Xie, Gary
2014-02-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5' section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets.
Liu, Kuan-Liang; Kuske, Cheryl R.
2014-01-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5′ section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets. PMID:24242255
Classification of ictal and seizure-free HRV signals with focus on lateralization of epilepsy.
Behbahani, Soroor; Dabanloo, Nader Jafarnia; Nasrabadi, Ali Motie; Dourado, Antonio
2016-01-01
Epileptic onsets often affect the autonomic function of the body during a seizure, whether it is in ictal, interictal or post-ictal periods. The different effects of localization and lateralization of seizures on heart rate variability (HRV) emphasize the importance of autonomic function changes in epileptic patients. On the other hand, the detection of seizures is of primary interest in evaluating epileptic patients. In the current paper, we analyzed the HRV signal to develop a reliable offline seizure-detection algorithm to focus on the effects of lateralization on HRV. We assessed the HRV during 5-min segments of continuous electrocardiogram (ECG) recording with a total of 170 seizures occurring in 16 patients, composed of 86 left-sided and 84 right-sided focus seizures. Relatively high and low-frequency components of the HRV were computed using spectral analysis. Poincaré parameters of each heart rate time series were considered as non-linear features. We fed these features to the Support Vector Machines (SVMs) to find a robust classification method to classify epileptic and non-epileptic signals. Leave One Out Cross-Validation (LOOCV) approach was used to demonstrate the consistency of the classification results. Our obtained classification accuracy confirms that the proposed scheme has a potential in classifying HRV signals into epileptic and non-epileptic classes. The accuracy rates for right-sided and left-sided focus seizures were obtained as 86.74% and 79.41%, respectively. The main finding of our study is that the patients with right-sided focus epilepsy showed more reduction in parasympathetic activity and more increase in sympathetic activity. It can be a marker of impaired vagal activity associated with increased cardiovascular risk and arrhythmias. Our results suggest that lateralization of the seizure onset zone could exert different influences on heart rate changes.
A right-sided seizure would cause an ictal tachycardia whereas a left-sided seizure would result in an ictal bradycardia.
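The Poincaré parameters used as non-linear features are commonly summarized by the two descriptors SD1 and SD2 of the plot of each RR interval against the next. The sketch below uses the widely quoted variance-based expressions; it is an assumption that the study used these particular descriptors, and the function name `poincare_sd` and the sample intervals are hypothetical.

```python
import math
import statistics

def poincare_sd(rr_intervals):
    """Return (SD1, SD2) of the Poincare plot of an RR-interval series.

    SD1 reflects short-term (beat-to-beat) variability, SD2 longer-term
    variability. At least three intervals are required.
    """
    diffs = [b - a for a, b in zip(rr_intervals, rr_intervals[1:])]
    var_rr = statistics.variance(rr_intervals)
    var_diff = statistics.variance(diffs)
    sd1 = math.sqrt(0.5 * var_diff)
    # Clamp at zero to guard against tiny negative values on short series.
    sd2 = math.sqrt(max(2.0 * var_rr - 0.5 * var_diff, 0.0))
    return sd1, sd2

# Hypothetical RR intervals in milliseconds.
rr = [812.0, 798.0, 805.0, 821.0, 790.0, 808.0, 815.0]
sd1, sd2 = poincare_sd(rr)
print(f"SD1 = {sd1:.1f} ms, SD2 = {sd2:.1f} ms")
```

SD1, SD2, and their ratio could then be appended to the spectral features before classification.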
Van Cott, Andrew; Hastings, Charles E; Landsiedel, Robert; Kolle, Susanne; Stinchcombe, Stefan
2018-02-01
In vivo acute systemic testing is a regulatory requirement for agrochemical formulations. GHS specifies an alternative computational approach (GHS additivity formula) for calculating the acute toxicity of mixtures. We collected acute systemic toxicity data from formulations that contained one of several acutely-toxic active ingredients. The resulting acute data set includes 210 formulations tested for oral toxicity, 128 formulations tested for inhalation toxicity and 31 formulations tested for dermal toxicity. The GHS additivity formula was applied to each of these formulations and compared with the experimental in vivo result. In the acute oral assay, the GHS additivity formula misclassified 110 formulations using the GHS classification criteria (48% accuracy) and 119 formulations using the USEPA classification criteria (43% accuracy). With acute inhalation, the GHS additivity formula misclassified 50 formulations using the GHS classification criteria (61% accuracy) and 34 formulations using the USEPA classification criteria (73% accuracy). For acute dermal toxicity, the GHS additivity formula misclassified 16 formulations using the GHS classification criteria (48% accuracy) and 20 formulations using the USEPA classification criteria (36% accuracy). This data indicates the acute systemic toxicity of many formulations is not the sum of the ingredients' toxicity (additivity); but rather, ingredients in a formulation can interact to result in lower or higher toxicity than predicted by the GHS additivity formula. Copyright © 2018 Elsevier Inc. All rights reserved.
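The GHS additivity formula referred to above estimates a mixture's acute toxicity estimate (ATE) from 100 / ATE_mix = sum(C_i / ATE_i), where C_i is the concentration of ingredient i in percent. The sketch below applies that formula together with the GHS acute oral category cut-offs (5, 50, 300, 2000, 5000 mg/kg) and reproduces the accuracy arithmetic quoted in the abstract. The function names and the example formulation are hypothetical; only the misclassification counts come from the text.

```python
def ate_mix(ingredients):
    """ATE of a mixture from (concentration_percent, ate_mg_per_kg) pairs.

    Only ingredients with a known acute toxicity estimate are included.
    """
    return 100.0 / sum(conc / ate for conc, ate in ingredients)

def ghs_oral_category(ate_mg_per_kg):
    """GHS acute oral toxicity category, or None if above 5000 mg/kg."""
    for category, upper in ((1, 5), (2, 50), (3, 300), (4, 2000), (5, 5000)):
        if ate_mg_per_kg <= upper:
            return category
    return None

def accuracy_percent(total, misclassified):
    """Share of formulations classified correctly, in percent."""
    return 100.0 * (total - misclassified) / total

# Hypothetical formulation: 10% of an active with ATE 50 mg/kg
# plus 5% of a co-formulant with ATE 500 mg/kg.
estimate = ate_mix([(10.0, 50.0), (5.0, 500.0)])
print(round(estimate, 1), ghs_oral_category(estimate))

# Oral data set from the abstract: 210 formulations, 110 misclassified (GHS).
print(round(accuracy_percent(210, 110)))
```

The formula assumes purely additive behaviour, which is the assumption the reported misclassification rates call into question.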
NASA Astrophysics Data System (ADS)
Geelen, Christopher D.; Wijnhoven, Rob G. J.; Dubbelman, Gijs; de With, Peter H. N.
2015-03-01
This research considers gender classification in surveillance environments, typically involving low-resolution images and a large amount of viewpoint variations and occlusions. Gender classification is inherently difficult due to the large intra-class variation and interclass correlation. We have developed a gender classification system, which is successfully evaluated on two novel datasets, which realistically consider the above conditions, typical for surveillance. The system reaches a mean accuracy of up to 90% and approaches our human baseline of 92.6%, proving a high-quality gender classification system. We also present an in-depth discussion of the fundamental differences between SVM and RF classifiers. We conclude that balancing the degree of randomization in any classifier is required for the highest classification accuracy. For our problem, an RF-SVM hybrid classifier exploiting the combination of HSV and LBP features results in the highest classification accuracy of 89.9 ± 0.2%, while classification computation time is negligible compared to the detection time of pedestrians.
Castro, Eduardo; Martínez-Ramón, Manel; Pearlson, Godfrey; Sui, Jing; Calhoun, Vince D.
2011-01-01
Pattern classification of brain imaging data can enable the automatic detection of differences in cognitive processes of specific groups of interest. Furthermore, it can also give neuroanatomical information related to the regions of the brain that are most relevant to detect these differences by means of feature selection procedures, which are also well-suited to deal with the high dimensionality of brain imaging data. This work proposes the application of recursive feature elimination using a machine learning algorithm based on composite kernels to the classification of healthy controls and patients with schizophrenia. This framework, which evaluates nonlinear relationships between voxels, analyzes whole-brain fMRI data from an auditory task experiment that is segmented into anatomical regions and recursively eliminates the uninformative ones based on their relevance estimates, thus yielding the set of most discriminative brain areas for group classification. The collected data was processed using two analysis methods: the general linear model (GLM) and independent component analysis (ICA). GLM spatial maps as well as ICA temporal lobe and default mode component maps were then input to the classifier. A mean classification accuracy of up to 95% estimated with a leave-two-out cross-validation procedure was achieved by doing multi-source data classification. In addition, it is shown that the classification accuracy rate obtained by using multi-source data surpasses that reached by using single-source data, hence showing that this algorithm takes advantage of the complementary nature of GLM and ICA. PMID:21723948
NASA Astrophysics Data System (ADS)
Tamimi, E.; Ebadi, H.; Kiani, A.
2017-09-01
Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Due to the limited number of spectral bands in HSR images, using additional features can improve accuracy. However, adding these features increases the probability that dependent features are present, which reduces accuracy. In addition, some parameters must be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. An optimization algorithm is an efficient way to solve this problem. On the other hand, pixel-based classification faces several challenges, such as producing salt-and-pepper results and high computational time on high-dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying a continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are a relatively high automation level, independence of image scene and type, reduced post-processing for building edge reconstruction, and accuracy improvement. The proposed method was evaluated against pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, the Kappa coefficient of the proposed method was 6% higher than that of RF classification. The processing time of the proposed method was relatively low because the unit of image analysis is the image object. These results showed the superiority of the proposed method in terms of time and accuracy.
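The overall accuracy and Kappa coefficient used to compare the classifiers can both be computed from a confusion matrix. This is a minimal sketch of the standard definitions (Cohen's kappa); the function name and the example matrix are hypothetical and do not come from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[k][k] for k in range(size)) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[r][c] for r in range(size)) for c in range(size)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(row_sums[k] * col_sums[k] for k in range(size)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical building / non-building confusion matrix.
matrix = [[40, 10],
          [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(matrix)
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts agreement expected by chance, so it is usually lower than overall accuracy for the same matrix.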
Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating
Wang, Bingkun; Huang, Yongfeng; Li, Xing
2016-01-01
E-commerce develops rapidly. Learning and taking good advantage of the myriad reviews from online customers has become crucial to the success in this game, which calls for increasingly more accuracy in sentiment classification of these reviews. Therefore the finer-grained review rating prediction is preferred over the rough binary sentiment classification. There are mainly two types of method in current review rating prediction. One includes methods based on review text content which focus almost exclusively on textual content and seldom relate to those reviewers and items remarked in other relevant reviews. The other one contains methods based on collaborative filtering which extract information from previous records in the reviewer-item rating matrix, however, ignoring review textual content. Here we proposed a framework for review rating prediction which shows the effective combination of the two. Then we further proposed three specific methods under this framework. Experiments on two movie review datasets demonstrate that our review rating prediction framework has better performance than those previous methods. PMID:26880879
Spencer, Bruce D
2012-06-01
Latent class models are increasingly used to assess the accuracy of medical diagnostic tests and other classifications when no gold standard is available and the true state is unknown. When the latent class is treated as the true class, the latent class models provide measures of components of accuracy including specificity and sensitivity and their complements, type I and type II error rates. The error rates according to the latent class model differ from the true error rates, however, and empirical comparisons with a gold standard suggest the true error rates often are larger. We investigate conditions under which the true type I and type II error rates are larger than those provided by the latent class models. Results from Uebersax (1988, Psychological Bulletin 104, 405-416) are extended to accommodate random effects and covariates affecting the responses. The results are important for interpreting the results of latent class analyses. An error decomposition is presented that incorporates an error component from invalidity of the latent class model. © 2011, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Musa Abbagoni, Baba; Yeung, Hoi
2016-08-01
The identification of flow patterns is a key issue in multiphase flow, which is encountered in the petrochemical industry. It is difficult to identify gas-liquid flow regimes objectively in two-phase flow. This paper presents the feasibility of a clamp-on instrument for objective flow regime classification of two-phase flow using an ultrasonic Doppler sensor and an artificial neural network, which records and processes the ultrasonic signals reflected from the two-phase flow. Experimental data were obtained on a horizontal test rig with a total pipe length of 21 m and 5.08 cm internal diameter carrying air-water two-phase flow under slug, elongated bubble, stratified-wavy, and stratified flow regimes. Multilayer perceptron neural networks (MLPNNs) are used to develop the classification model. The classifier requires input features that are representative of the signals. Ultrasound signal features are extracted by applying both power spectral density (PSD) and discrete wavelet transform (DWT) methods to the flow signals. A '1-of-C' coding scheme was adopted to classify the extracted features into one of four flow regime categories. To improve the performance of the flow regime classifier network, a second-level neural network was incorporated that uses the output of the first-level network as an input feature. Combining the two network models provided a neural network model that achieved higher accuracy than the single neural network models. Classification accuracies are evaluated for both the PSD and DWT features. The success rates of the two models are: (1) using PSD features, the classifier missed 3 of the 24 test datasets and scored 87.5% accuracy; (2) with the DWT features, the network misclassified only one data point and classified the flow patterns with up to 95.8% accuracy.
This approach has demonstrated the success of a clamp-on ultrasound sensor for flow regime classification that would be possible in industry practice. It is considerably more promising than other techniques as it uses a non-invasive and non-radioactive sensor.
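As a small illustration of the '1-of-C' coding and of the success rates quoted above, the following sketch encodes the four flow regimes as one-hot target vectors and recomputes the accuracies from the reported counts (21 of 24 and 23 of 24 correct). The function names and the ordering of the regimes are assumptions made for illustration; this is not the authors' network.

```python
# Regime order is an arbitrary choice for this sketch.
REGIMES = ["slug", "elongated bubble", "stratified-wavy", "stratified"]

def one_of_c(label, classes=REGIMES):
    """Return a 1-of-C (one-hot) target vector for a class label."""
    return [1 if c == label else 0 for c in classes]

def decode(outputs, classes=REGIMES):
    """Map network outputs back to a label by taking the largest output."""
    return classes[max(range(len(outputs)), key=lambda i: outputs[i])]

def accuracy_percent(n_correct, n_total):
    """Classification accuracy as a percentage."""
    return 100.0 * n_correct / n_total

print(one_of_c("stratified-wavy"))
print(decode([0.10, 0.70, 0.15, 0.05]))
print(round(accuracy_percent(21, 24), 1))  # PSD features: 3 of 24 missed
print(round(accuracy_percent(23, 24), 1))  # DWT features: 1 of 24 missed
```

With 24 test datasets, each misclassification costs about 4.2 percentage points, which is why the two feature sets differ by 8.3 points on only two datasets.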
Kim, Junghoe; Calhoun, Vince D.; Shim, Eunsoo; Lee, Jong-Hwan
2015-01-01
Functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging data are commonly employed to study neuropsychiatric conditions by using pattern classifiers such as the support vector machine (SVM). Meanwhile, a deep neural network (DNN) with multiple hidden layers has shown its ability to systematically extract lower-to-higher level information of image and speech data from lower-to-higher hidden layers, markedly enhancing classification accuracy. The objective of this study was to adopt the DNN for whole-brain resting-state FC pattern classification of schizophrenia (SZ) patients vs. healthy controls (HCs) and identification of aberrant FC patterns associated with SZ. We hypothesized that the lower-to-higher level features learned via the DNN would significantly enhance the classification accuracy, and proposed an adaptive learning algorithm to explicitly control the weight sparsity in each hidden layer via L1-norm regularization. Furthermore, the weights were initialized via stacked autoencoder based pre-training to further improve the classification performance. Classification accuracy was systematically evaluated as a function of (1) the number of hidden layers/nodes, (2) the use of L1-norm regularization, (3) the use of the pre-training, (4) the use of framewise displacement (FD) removal, and (5) the use of anatomical/functional parcellation. Using FC patterns from anatomically parcellated regions without FD removal, an error rate of 14.2% was achieved by employing three hidden layers and 50 hidden nodes with both L1-norm regularization and pre-training, which was substantially lower than the error rate from the SVM (22.3%). Moreover, the trained DNN weights (i.e., the learned features) were found to represent the hierarchical organization of aberrant FC patterns in SZ compared with HC. 
Specifically, pairs of nodes extracted from the lower hidden layer represented sparse FC patterns implicated in SZ, which was quantified by using kurtosis/modularity measures and features from the higher hidden layer showed holistic/global FC patterns differentiating SZ from HC. Our proposed schemes and reported findings attained by using the DNN classifier and whole-brain FC data suggest that such approaches show improved ability to learn hidden patterns in brain imaging data, which may be useful for developing diagnostic tools for SZ and other neuropsychiatric disorders and identifying associated aberrant FC patterns. PMID:25987366
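To make the idea of L1-norm regularization and weight sparsity concrete, here is a minimal sketch of soft-thresholding, a standard proximal step for an L1 penalty. It is not the adaptive algorithm proposed in the abstract, whose details are not given here; the weight values and the threshold `lam` are hypothetical.

```python
def soft_threshold(weights, lam):
    """Proximal step for an L1 penalty: shrink each weight toward zero by lam."""
    shrunk = []
    for w in weights:
        if w > lam:
            shrunk.append(w - lam)
        elif w < -lam:
            shrunk.append(w + lam)
        else:
            shrunk.append(0.0)  # small weights are set exactly to zero
    return shrunk

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

weights = [0.50, -0.20, 0.05, -0.03, 0.00, 0.12]  # hypothetical layer weights
shrunk = soft_threshold(weights, lam=0.1)
print(shrunk)
print(sparsity(weights), sparsity(shrunk))
```

A larger `lam` zeroes more weights, so a scheme that controls sparsity per layer could, in principle, adjust this value until a target fraction of zero weights is reached.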
Uddin, M B; Chow, C M; Su, S W
2018-03-26
Sleep apnea (SA), a common sleep disorder, can significantly decrease the quality of life, and is closely associated with major health risks such as cardiovascular disease, sudden death, depression, and hypertension. The normal diagnostic process of SA using polysomnography is costly and time consuming. In addition, the accuracy of different classification methods to detect SA varies with the use of different physiological signals. If an effective, reliable, and accurate classification method is developed, then the diagnosis of SA and its associated treatment will be time-efficient and economical. This study aims to systematically review the literature and present an overview of classification methods to detect SA using respiratory and oximetry signals and address the automated detection approach. Sixty-two included studies revealed the application of single and multiple signals (respiratory and oximetry) for the diagnosis of SA. Both airflow and oxygen saturation signals alone were effective in detecting SA in the case of binary decision-making, whereas multiple signals were good for multi-class detection. In addition, some machine learning methods were superior to the other classification methods for SA detection using respiratory and oximetry signals. To deal with the respiratory and oximetry signals, a good choice of classification method as well as the consideration of associated factors would result in high accuracy in the detection of SA. An accurate classification method should provide a high detection rate with an automated (independent of human action) analysis of respiratory and oximetry signals. Future high-quality automated studies using large samples of data from multiple patient groups or record batches are recommended.
2014-08-22
uncomplicated and incidental to the ability being measured. Process-based measures like CS that do not rely on learned content have contributed to military...operations specialist, and dental technician) ratings. These ratings are clearly different from mechanical ratings where AO is, on the face of it, more...performance criteria in studies conducted by the Army (Anderson et al., 2011; Russell, Le, & Putka, 2007), Marine Corps (Carey, 1994), and Navy (Held, Fedak
Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok
2016-12-05
High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
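The masking idea can be sketched as a small binary-encoded genetic algorithm in which each bit of a mask switches one feature on or off. This is a toy sketch under stated assumptions: the fitness function below is a hypothetical stand-in for classifier accuracy (it simply rewards a known set of informative feature indices and penalizes mask size), and the selection, crossover, and mutation operators are generic choices, not necessarily those used in the paper.

```python
import random

def masked(row, mask):
    """Keep only the features whose mask bit is 1."""
    return [x for x, bit in zip(row, mask) if bit]

def fitness(mask, informative, penalty=0.01):
    """Toy stand-in for classifier accuracy (assumption): reward informative
    features and lightly penalize every selected feature."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in informative)
    return hits / len(informative) - penalty * sum(mask)

def evolve(n_features, informative, pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, informative), reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection, elitist
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_features)     # single bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, informative))

best = evolve(n_features=12, informative={1, 4, 7})
print(best, masked(list("abcdefghijkl"), best))
```

In a real setting the fitness call would train and validate a classifier on the masked features, which is where nearly all of the computational cost lies.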
Sobol-Shikler, Tal; Robinson, Peter
2010-07-01
We present a classification algorithm for inferring affective states (emotions, mental states, attitudes, and the like) from their nonverbal expressions in speech. It is based on the observations that affective states can occur simultaneously and different sets of vocal features, such as intonation and speech rate, distinguish between nonverbal expressions of different affective states. The input to the inference system was a large set of vocal features and metrics that were extracted from each utterance. The classification algorithm conducted independent pairwise comparisons between nine affective-state groups. The classifier used various subsets of metrics of the vocal features and various classification algorithms for different pairs of affective-state groups. Average classification accuracy of the 36 pairwise machines was 75 percent, using 10-fold cross validation. The comparison results were consolidated into a single ranked list of the nine affective-state groups. This list was the output of the system and represented the inferred combination of co-occurring affective states for the analyzed utterance. The inference accuracy of the combined machine was 83 percent. The system automatically characterized over 500 affective state concepts from the Mind Reading database. The inference of co-occurring affective states was validated by comparing the inferred combinations to the lexical definitions of the labels of the analyzed sentences. The distinguishing capabilities of the system were comparable to human performance.
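The count of 36 pairwise machines follows from choosing 2 of the 9 affective-state groups. The sketch below enumerates those pairs and consolidates pairwise outcomes into a ranked list by counting wins. The win-count rule, the group names, and the `pair_winner` callback are assumptions for illustration; the abstract does not say how the comparison results were consolidated.

```python
from itertools import combinations

def rank_by_pairwise_wins(groups, pair_winner):
    """Rank groups by how many pairwise comparisons each one wins."""
    wins = {g: 0 for g in groups}
    pairs = list(combinations(groups, 2))
    for a, b in pairs:
        wins[pair_winner(a, b)] += 1
    ranking = sorted(groups, key=lambda g: wins[g], reverse=True)
    return ranking, len(pairs)

groups = [f"group_{i}" for i in range(1, 10)]  # nine placeholder group names
# Hypothetical pairwise decision: the lower-numbered group always wins.
ranking, n_machines = rank_by_pairwise_wins(groups, lambda a, b: min(a, b))
print(n_machines)
print(ranking[:3])
```

In practice each call to `pair_winner` would be the decision of a trained binary classifier for that pair, applied to the features of one utterance.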
Seeberg, Trine M.; Tjønnås, Johannes; Haugnes, Pål; Sandbakk, Øyvind
2017-01-01
The automatic classification of sub-techniques in classical cross-country skiing provides unique possibilities for analyzing the biomechanical aspects of outdoor skiing. This is currently possible due to the miniaturization and flexibility of wearable inertial measurement units (IMUs) that allow researchers to bring the laboratory to the field. In this study, we aimed to optimize the accuracy of the automatic classification of classical cross-country skiing sub-techniques by using two IMUs attached to the skier’s arm and chest together with a machine learning algorithm. The novelty of our approach is the reliable detection of individual cycles using a gyroscope on the skier’s arm, while a neural network machine learning algorithm robustly classifies each cycle to a sub-technique using sensor data from an accelerometer on the chest. In this study, 24 datasets from 10 different participants were separated into the categories training-, validation- and test-data. Overall, we achieved a classification accuracy of 93.9% on the test-data. Furthermore, we illustrate how an accurate classification of sub-techniques can be combined with data from standard sports equipment including position, altitude, speed and heart rate measuring systems. Combining this information has the potential to provide novel insight into physiological and biomechanical aspects valuable to coaches, athletes and researchers. PMID:29283421
A Generic Deep-Learning-Based Approach for Automated Surface Inspection.
Ren, Ruoxu; Hung, Terence; Tan, Kay Chen
2018-03-01
Automated surface inspection (ASI) is a challenging task in industry, as collecting a training dataset is usually costly and related methods are highly dataset-dependent. In this paper, a generic approach that requires little training data for ASI is proposed. First, this approach builds a classifier on the features of image patches, where the features are transferred from a pretrained deep learning network. Next, pixel-wise prediction is obtained by convolving the trained classifier over the input image. An experiment on three public data sets and one industrial data set is carried out. The experiment involves two tasks: 1) image classification and 2) defect segmentation. The results of the proposed algorithm are compared against several of the best benchmarks in the literature. In the classification tasks, the proposed method improves accuracy by 0.66%-25.50%. In the segmentation tasks, the proposed method reduces error escape rates by 6.00%-19.00% in three defect types and improves accuracies by 2.29%-9.86% in all seven defect types. In addition, the proposed method achieves a 0.0% error escape rate in the segmentation task on industrial data.
Reliability of the Walker Cranial Nonmetric Method and Implications for Sex Estimation.
Lewis, Cheyenne J; Garvin, Heather M
2016-05-01
The cranial trait scoring method presented in Buikstra and Ubelaker (Standards for data collection from human skeletal remains. Fayetteville, AR: Arkansas Archeological Survey Research Series No. 44, 1994) and Walker (Am J Phys Anthropol, 136, 2008, 39) is the most common nonmetric cranial sex estimation method utilized by physical and forensic anthropologists. As such, the reliability and accuracy of the method are vital to ensure its validity in forensic applications. In this study, inter- and intra-observer error rates for the Walker scoring method were calculated using a sample of U.S. White and Black individuals (n = 135). Cohen's weighted kappas, intraclass correlation coefficients, and percentage agreements indicate good agreement between trials and observers for all traits except the mental eminence. Slight disagreement in scoring, however, was found to impact sex classifications, leading to lower accuracy rates than those published by Walker. Furthermore, experience does appear to impact trait scoring and sex classification. The use of revised population-specific equations that avoid the mental eminence is highly recommended to minimize the potential for misclassifications. © 2016 American Academy of Forensic Sciences.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems that arise when testing the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors of commission, and the remaining elements of the columns represent errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification.
The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated approach would be to analyze the entire classification error matrices using the methods of discrete multivariate analysis or of multivariate analysis of variance.
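The structure of the classification error matrix described above can be illustrated with a short sketch. It follows the convention stated in the abstract (rows are the interpretation, columns are the verification) and computes overall accuracy plus per-category commission and omission error rates. The 3 x 3 counts are hypothetical and the function name is an assumption.

```python
def error_matrix_summary(matrix):
    """Overall accuracy, commission errors (by row) and omission errors (by column).

    Rows are the interpretation (map) classes; columns are the verification
    (reference) classes; diagonal cells are correct classifications.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    commission = [(row_totals[i] - matrix[i][i]) / row_totals[i] for i in range(n)]
    omission = [(col_totals[j] - matrix[j][j]) / col_totals[j] for j in range(n)]
    return correct / total, commission, omission

# Hypothetical counts for three map categories.
matrix = [
    [45, 4, 1],
    [6, 30, 4],
    [2, 3, 25],
]
overall, commission, omission = error_matrix_summary(matrix)
print(round(overall, 3))
print([round(c, 3) for c in commission])
print([round(o, 3) for o in omission])
```

Dividing the diagonal by row totals or by column totals gives the two kinds of binomial proportion mentioned in the text, so the choice of denominator determines whether a category's figure reflects commission or omission.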
A review of supervised object-based land-cover image classification
NASA Astrophysics Data System (ADS)
Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue
2017-08-01
Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. 
In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas, and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster dataset was created that comprised four raster layers, where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification, for which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data was the highest (65% overall) of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot.
An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with a 10 percent interval each to five classes with a 20 percent interval each. When compared to the supervised classification, which has a satisfactory overall accuracy of 90%, none of the sub-pixel classifications achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
Alcohol use among university students: Considering a positive deviance approach.
Tucker, Maryanne; Harris, Gregory E
2016-09-01
Harmful alcohol consumption among university students continues to be a significant issue. This study examined whether variables identified in the positive deviance literature would predict responsible alcohol consumption among university students. Surveyed students were categorized into three groups: abstainers, responsible drinkers and binge drinkers. Multinomial logistic regression modelling was significant (χ² = 274.49, degrees of freedom = 24, p < .001), with several variables predicting group membership. While the model classification accuracy rate (i.e. 71.2%) exceeded the proportional by chance accuracy rate (i.e. 38.4%), providing further support for the model, the model itself best predicted binge drinker membership over the other two groups. © The Author(s) 2015.
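The proportional by chance accuracy rate is conventionally computed as the sum of the squared group proportions. The sketch below shows that calculation on hypothetical group sizes (the study's actual group sizes are not given here) and then applies a commonly cited rule of thumb, that a useful model should exceed chance accuracy by about 25%, to the two rates reported in the abstract. The rule of thumb is an assumption of this sketch, not a criterion stated by the authors.

```python
def proportional_chance_accuracy(group_sizes):
    """Sum of squared group proportions, as a percentage."""
    total = sum(group_sizes)
    return 100.0 * sum((n / total) ** 2 for n in group_sizes)

def exceeds_chance(model_accuracy, chance_accuracy, margin=1.25):
    """Rule-of-thumb check (assumption): model accuracy >= margin * chance."""
    return model_accuracy >= margin * chance_accuracy

# Hypothetical sizes for three groups, for illustration only.
print(round(proportional_chance_accuracy([10, 30, 60]), 1))

# Rates reported in the abstract: 71.2% model accuracy vs 38.4% chance accuracy.
print(exceeds_chance(71.2, 38.4))
```

Under that rule of thumb the threshold would be 1.25 x 38.4% = 48.0%, which the reported 71.2% clears comfortably.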
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold: first, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD; second, to compare the DSM-5 and DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four, and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
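The effect of lowering a cut score on false negative errors can be shown with a toy example. The criteria counts and true group labels below are entirely hypothetical and are not drawn from the study's datasets; the function names are likewise assumptions.

```python
def classify(criteria_counts, cut_score):
    """Flag a participant as positive when the criteria count meets the cut score."""
    return [count >= cut_score for count in criteria_counts]

def false_negatives(criteria_counts, has_disorder, cut_score):
    """Count participants with the disorder who fall below the cut score."""
    predicted = classify(criteria_counts, cut_score)
    return sum(1 for p, truth in zip(predicted, has_disorder) if truth and not p)

def false_positives(criteria_counts, has_disorder, cut_score):
    """Count participants without the disorder who meet the cut score."""
    predicted = classify(criteria_counts, cut_score)
    return sum(1 for p, truth in zip(predicted, has_disorder) if p and not truth)

# Hypothetical data: number of criteria endorsed and true group membership.
counts = [2, 4, 4, 5, 7, 1, 3, 4, 6, 0]
truth = [False, True, True, True, True, False, False, False, True, False]

for cut in (5, 4):
    print(cut, false_negatives(counts, truth, cut), false_positives(counts, truth, cut))
```

In this toy data, moving the cut score from five to four removes both false negatives at the cost of one false positive, which is the kind of trade-off a comparison of the two thresholds has to weigh.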
Bredesen, Ida Marie; Bjøro, Karen; Gunningberg, Lena; Hofoss, Dag
2016-05-01
Pressure ulcers (PUs) are a problem in health care. Staff competency is paramount to PU prevention. Education is essential to increase skills in pressure ulcer classification and risk assessment. Currently, no pressure ulcer learning programs are available in Norwegian. The aim of this study was to develop and test an e-learning program for assessment of pressure ulcer risk and pressure ulcer classification. Forty-four nurses working in acute care hospital wards or nursing homes participated and were assigned randomly to two groups: an e-learning program group (intervention) and a traditional classroom lecture group (control). Data were collected immediately before and after training, and again after three months. The study was conducted at one nursing home and two hospitals between May and December 2012. Accuracy of risk assessment (five patient cases) and pressure ulcer classification (40 photos [normal skin, pressure ulcer categories I-IV] split in two sets) were measured by comparing nurse evaluations in each of the two groups to a pre-established standard based on ratings by experts in pressure ulcer classification and risk assessment. Inter-rater reliability was measured by exact percent agreement and multi-rater Fleiss kappa. A Mann-Whitney U test was used for continuous sum score variables. The e-learning program did not improve Braden subscale scoring. For pressure ulcer classification, however, the intervention group scored significantly higher than the control group on several of the categories in the post-test immediately after training. After three months, there were no significant differences in classification skills between the groups. An e-learning program appears to have a greater effect on the accuracy of pressure ulcer classification than classroom teaching in the short term. For proficiency in Braden scoring, no significant effect of educational methods on learning results was detected. Copyright © 2016 Elsevier Ltd. All rights reserved.
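For readers unfamiliar with the multi-rater Fleiss kappa mentioned above, the sketch below implements the standard formula on a table of counts, where each row is one rated item (for example a photo) and each column is a category. The rating counts are hypothetical; the study's data are not reproduced here.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a table where table[i][j] is the number of raters
    who assigned item i to category j. Every item must have the same number
    of raters."""
    n_items = len(table)
    n_raters = sum(table[0])
    n_categories = len(table[0])
    # Per-item agreement P_i.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = sum(p_items) / n_items
    # Category proportions p_j and expected agreement P_e.
    p_cat = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)

# Hypothetical ratings: 4 items, 3 raters, 2 categories.
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
split = [[2, 1], [1, 2], [2, 1], [1, 2]]
print(fleiss_kappa(perfect))
print(round(fleiss_kappa(split), 3))
```

Exact percent agreement only counts identical ratings, whereas kappa also subtracts the agreement expected by chance, so the two measures can diverge when one category dominates.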
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT
ERIC Educational Resources Information Center
Lathrop, Quinn N.
2015-01-01
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remain decisions a researcher faces when…
Variance estimates and confidence intervals for the Kappa measure of classification accuracy
M. A. Kalkhan; R. M. Reich; R. L. Czaplewski
1997-01-01
The Kappa statistic is frequently used to characterize the results of an accuracy assessment used to evaluate land use and land cover classifications obtained by remotely sensed data. This statistic allows comparisons of alternative sampling designs, classification algorithms, photo-interpreters, and so forth. In order to make these comparisons, it is...
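For reference, the point estimate of the Kappa statistic can be computed from an error matrix as the observed agreement corrected for the agreement expected by chance. The sketch below covers only that point estimate, using hypothetical counts; it does not implement the variance estimates or confidence intervals that the paper addresses.

```python
def cohen_kappa(matrix):
    """Kappa = (p_o - p_e) / (1 - p_e) for a square error matrix of counts."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_observed = sum(matrix[i][i] for i in range(n)) / total
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / (total * total)
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical error matrix for three land-cover classes.
matrix = [
    [45, 4, 1],
    [6, 30, 4],
    [2, 3, 25],
]
print(round(cohen_kappa(matrix), 3))
```

Because kappa depends on the row and column totals as well as the diagonal, two classifications with the same overall accuracy can have different kappa values.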
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Classification of staining patterns in Human Epithelial-2 (HEp-2) cell images has been widely used to identify autoimmune diseases through the anti-nuclear antibody (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because manual testing is time consuming, subjective and labor intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are being developed. However, recently proposed methods mostly rely on manual feature extraction and have low accuracy. Besides, the available benchmark datasets are small, which makes them poorly suited to deep learning methods. This issue directly affects the accuracy of cell classification, even after data augmentation. To address these issues, this paper presents a high-accuracy automatic HEp-2 cell classification method for small datasets that utilizes very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results on two benchmark datasets demonstrate that the proposed method achieves superior accuracy compared with existing methods.
Improved fibrosis staging by elastometry and blood test in chronic hepatitis C.
Calès, Paul; Boursier, Jérôme; Ducancelle, Alexandra; Oberti, Frédéric; Hubert, Isabelle; Hunault, Gilles; de Lédinghen, Victor; Zarski, Jean-Pierre; Salmon, Dominique; Lunel, Françoise
2014-07-01
Our main objective was to improve non-invasive fibrosis staging accuracy by resolving the limits of previous methods via new test combinations. Our secondary objectives were to improve staging precision, by developing a detailed fibrosis classification, and reliability (personalized accuracy) determination. All patients (729) included in the derivation population had chronic hepatitis C, liver biopsy, 6 blood tests and Fibroscan. Validation populations included 1584 patients. The most accurate combination was provided by using most markers of FibroMeter and Fibroscan results targeted for significant fibrosis, i.e. 'E-FibroMeter'. Its classification accuracy (91.7%) and precision (assessed by F difference with Metavir: 0.62 ± 0.57) were better than those of FibroMeter (84.1%, P < 0.001; 0.72 ± 0.57, P < 0.001), Fibroscan (88.2%, P = 0.011; 0.68 ± 0.57, P = 0.020), and a previous CSF-SF classification of FibroMeter + Fibroscan (86.7%, P < 0.001; 0.65 ± 0.57, P = 0.044). The accuracy for fibrosis absence (F0) was increased, e.g. from 16.0% with Fibroscan to 75.0% with E-FibroMeter (P < 0.001). Cirrhosis sensitivity was improved, e.g. E-FibroMeter: 92.7% vs. Fibroscan: 83.3%, P = 0.004. The combination improved reliability by deleting unreliable results (accuracy <50%) observed with a single test (1.2% of patients) and increasing optimal reliability (accuracy ≥85%) from 80.4% of patients with Fibroscan (accuracy: 90.9%) to 94.2% of patients with E-FibroMeter (accuracy: 92.9%), P < 0.001. The patient rate with 100% predictive values for cirrhosis by the best combination was twice (36.2%) that of the best single test (FibroMeter: 16.2%, P < 0.001). The new test combination increased: accuracy, globally and especially in patients without fibrosis, staging precision, cirrhosis prediction, and even reliability, thus offering improved fibrosis staging. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
O'Neil, Gina L.; Goodall, Jonathan L.; Watson, Layne T.
2018-04-01
Wetlands are important ecosystems that provide many ecological benefits, and their quality and presence are protected by federal regulations. These regulations require wetland delineations, which can be costly and time-consuming to perform. Computer models can assist in this process, but lack the accuracy necessary for environmental planning-scale wetland identification. In this study, the potential for improvement of wetland identification models through modification of digital elevation model (DEM) derivatives, derived from high-resolution and increasingly available light detection and ranging (LiDAR) data, at a scale necessary for small-scale wetland delineations is evaluated. A novel approach of flow convergence modelling is presented where Topographic Wetness Index (TWI), curvature, and Cartographic Depth-to-Water index (DTW), are modified to better distinguish wetland from upland areas, combined with ancillary soil data, and used in a Random Forest classification. This approach is applied to four study sites in Virginia, implemented as an ArcGIS model. The model resulted in significant improvement in average wetland accuracy compared to the commonly used National Wetland Inventory (84.9% vs. 32.1%), at the expense of a moderately lower average non-wetland accuracy (85.6% vs. 98.0%) and average overall accuracy (85.6% vs. 92.0%). From this, we concluded that modifying TWI, curvature, and DTW provides more robust wetland and non-wetland signatures to the models by improving accuracy rates compared to classifications using the original indices. The resulting ArcGIS model is a general tool able to modify these local LiDAR DEM derivatives based on site characteristics to identify wetlands at a high resolution.
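For orientation, the unmodified Topographic Wetness Index is commonly defined as TWI = ln(a / tan β), where a is the specific catchment area and β the local slope. The sketch below shows only this standard form; the modifications described in the abstract are not reproduced, and the function name and the `min_slope` guard against flat cells are assumptions made for illustration.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_radians, min_slope=1e-3):
    """Standard TWI = ln(a / tan(beta)).

    min_slope is a hypothetical floor that avoids division by zero on flat cells.
    """
    slope = max(slope_radians, min_slope)
    return math.log(specific_catchment_area / math.tan(slope))


# Same contributing area, gentler slope: the index rises (wetter signature).
steep = topographic_wetness_index(100.0, math.radians(20.0))
gentle = topographic_wetness_index(100.0, math.radians(2.0))
```

Higher values indicate cells that collect more flow and drain more slowly, which is why the index is a candidate input for wetland versus upland classification.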
Belgiu, Mariana; Drăguţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. 
The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Computer-aided diagnosis of contrast-enhanced spectral mammography: A feasibility study.
Patel, Bhavika K; Ranjbar, Sara; Wu, Teresa; Pockaj, Barbara A; Li, Jing; Zhang, Nan; Lobbes, Mark; Zhang, Bin; Mitchell, J Ross
2018-01-01
To evaluate whether the use of a computer-aided diagnosis-contrast-enhanced spectral mammography (CAD-CESM) tool can further increase the diagnostic performance of CESM compared with that of experienced radiologists. This IRB-approved retrospective study analyzed 50 lesions described on CESM from August 2014 to December 2015. Histopathologic analyses, used as the criterion standard, revealed 24 benign and 26 malignant lesions. An expert breast radiologist manually outlined lesion boundaries on the different views. A set of morphologic and textural features were then extracted from the low-energy and recombined images. Machine-learning algorithms with feature selection were used along with statistical analysis to reduce, select, and combine features. Selected features were then used to construct a predictive model using a support vector machine (SVM) classification method in a leave-one-out-cross-validation approach. The classification performance was compared against the diagnostic predictions of 2 breast radiologists with access to the same CESM cases. Based on the SVM classification, CAD-CESM correctly identified 45 of 50 lesions in the cohort, resulting in an overall accuracy of 90%. The detection rate for the malignant group was 88% (3 false-negative cases) and 92% for the benign group (2 false-positive cases). Compared with the model, radiologist 1 had an overall accuracy of 78% and a detection rate of 92% (2 false-negative cases) for the malignant group and 62% (10 false-positive cases) for the benign group. Radiologist 2 had an overall accuracy of 86% and a detection rate of 100% for the malignant group and 71% (8 false-positive cases) for the benign group. The results of our feasibility study suggest that a CAD-CESM tool can provide complementary information to radiologists, mainly by reducing the number of false-positive findings. Copyright © 2017 Elsevier B.V. All rights reserved.
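The CAD-CESM figures quoted above follow directly from the lesion counts in the abstract (26 malignant, 24 benign, 3 false negatives, 2 false positives). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # detection rate in the malignant group
        "specificity": tn / (tn + fp),  # detection rate in the benign group
    }


malignant, benign = 26, 24          # histopathologic ground truth
false_negatives, false_positives = 3, 2
cad = diagnostic_metrics(
    tp=malignant - false_negatives,
    fn=false_negatives,
    tn=benign - false_positives,
    fp=false_positives,
)
```

This gives 45 of 50 lesions correct (90%), 23 of 26 malignant lesions detected (about 88%) and 22 of 24 benign lesions correctly identified (about 92%), matching the reported values.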
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco
2016-10-01
The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.
Comparing Features for Classification of MEG Responses to Motor Imagery
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Background Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. Methods MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. Results The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results.
Conclusions We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system. PMID:27992574
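The 5-fold cross-validation used to estimate single-trial accuracy can be sketched as follows. This is a generic illustration, not the authors' pipeline: the function names, the majority-label baseline and the trial labels are all hypothetical, and any feature extractor or classifier could be plugged in through `fit` and `predict`.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k shuffled, non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validated_accuracy(samples, labels, fit, predict, k=5, seed=0):
    """Mean test accuracy over k folds (assumes at least k samples)."""
    scores = []
    for train, test in k_fold_indices(len(samples), k, seed):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / k


# Hypothetical baseline: always predict the most common training label.
def fit_majority(train_samples, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, sample):
    return model


labels = ["left"] * 12 + ["right"] * 8   # made-up trial labels
samples = list(range(len(labels)))       # placeholders for feature vectors
baseline = cross_validated_accuracy(samples, labels, fit_majority, predict_majority)
```

Each trial is used for testing exactly once, so the averaged score is not inflated by evaluating on data the classifier was trained on.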
Multi-label spacecraft electrical signal classification method based on DBN and random forest
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
Spacecraft electrical signal characteristic data include a large amount of data with high-dimensional features, high computational complexity, and a low identification rate, which makes fault diagnosis of spacecraft electronic load systems very difficult. This paper proposes a feature extraction method based on deep belief networks (DBN) and a classification method based on the random forest (RF) algorithm. The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then classification is applied. First, wavelet denoising is used to pre-process the data. Second, the deep belief network is used to reduce the feature dimension and improve the classification rate for the electrical characteristics data. Finally, the random forest algorithm is used to classify the data, and it is compared with other algorithms. The experimental results show that, compared with other algorithms, the proposed method performs excellently in terms of accuracy, computational efficiency, and stability on spacecraft electrical signal data. PMID:28486479
Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steven, Blaire; Hesse, Cedar; Soghigian, John; ...
2017-03-31
The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, with misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.
Aktaruzzaman, M; Migliorini, M; Tenhunen, M; Himanen, S L; Bianchi, A M; Sassi, R
2015-05-01
The work considers automatic sleep stage classification, based on heart rate variability (HRV) analysis, with a focus on the distinction of wakefulness (WAKE) from sleep and rapid eye movement (REM) from non-REM (NREM) sleep. A set of 20 automatically annotated one-night polysomnographic recordings was considered, and artificial neural networks were selected for classification. For each inter-heartbeat (RR) series, besides features previously presented in the literature, we introduced a set of four parameters related to signal regularity. RR series of three different lengths were considered (corresponding to 2, 6, and 10 successive epochs, 30 s each, in the same sleep stage). Two sets of only four features captured 99 % of the data variance in each classification problem, and both of them contained one of the new regularity features proposed. The accuracy of classification for REM versus NREM (68.4 %, 2 epochs; 83.8 %, 10 epochs) was higher than when distinguishing WAKE versus SLEEP (67.6 %, 2 epochs; 71.3 %, 10 epochs). Also, the reliability parameter (Cohen's kappa) was higher (0.68 and 0.45, respectively). Sleep staging classification based on HRV was still less precise than other staging methods, employing a larger variety of signals collected during polysomnographic studies. However, cheap and unobtrusive HRV-only sleep classification proved sufficiently precise for a wide range of applications.
Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A
2013-08-01
In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
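The percentages quoted in this abstract can be checked against its own lesion counts. The sketch below only repeats that arithmetic; the variable names are illustrative.

```python
malignant_total, benign_total = 648, 436
confident_malignant, confident_benign = 262, 106   # classified with accuracy above 95 %
all_lesions = malignant_total + benign_total

shares = {
    # Share of malignant lesions classified with high certainty.
    "malignant": round(100 * confident_malignant / malignant_total, 1),
    # Share of benign lesions classified with high certainty.
    "benign": round(100 * confident_benign / benign_total, 1),
    # Share of all lesions classified with high certainty.
    "overall": round(100 * (confident_malignant + confident_benign) / all_lesions, 1),
    # Descriptors retained by the tree, out of the 17 assessed.
    "descriptors": round(100 * 5 / 17, 1),
}
```

The results are 40.4 %, 24.3 %, 33.9 % and 29.4 %, consistent with the abstract's statement that roughly one third of all lesions (34 %) reach a diagnostic certainty above 95 %.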
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement in classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states, using data collected from 43 participants. The system employs autoregressive (AR) modeling as the feature extraction algorithm and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi-supervised learning method that combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low level; this prevents the network from overfitting and allows it to learn low-level structures as well as high-level structures. For comparison, artificial neural network (ANN), Bayesian neural network (BNN), and original deep belief network (DBN) classifiers are used. The classification results show that, using the AR feature extractor and DBN classifier, performance improves to a sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating characteristic curve (AUROC) of 0.94, compared to the ANN (sensitivity of 80.8%, specificity of 77.8%, accuracy of 79.3%, AUROC of 0.83) and BNN classifiers (sensitivity of 84.3%, specificity of 83%, accuracy of 83.6%, AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further, with a sensitivity of 93.9%, a specificity of 92.3%, an accuracy of 93.1%, and an AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5 percentage points over the ANN, BNN, and DBN classifiers, respectively. PMID:28326009
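The abstract does not give the exact form of the sparsity regularization term. One common choice, assumed here purely for illustration, is a quadratic penalty on the gap between each hidden unit's mean activation and a small target value; the function name, the target of 0.05 and the weight are hypothetical.

```python
def sparsity_penalty(hidden_activations, target=0.05, weight=1.0):
    """Quadratic sparsity penalty (an assumed form, not confirmed by the source).

    hidden_activations: one list of hidden-unit activation probabilities per example.
    Returns weight * sum_j (target - mean activation of unit j) ** 2.
    """
    n_examples = len(hidden_activations)
    n_hidden = len(hidden_activations[0])
    penalty = 0.0
    for j in range(n_hidden):
        mean_activation = sum(row[j] for row in hidden_activations) / n_examples
        penalty += (target - mean_activation) ** 2
    return weight * penalty


sparse_batch = [[0.05, 0.05], [0.05, 0.05]]   # units already at the target level
dense_batch = [[0.9, 0.8], [0.7, 0.6]]        # units active most of the time
penalty_sparse = sparsity_penalty(sparse_batch)
penalty_dense = sparsity_penalty(dense_batch)
```

In training, a term of this kind would be added to the pre-training objective so that hidden units that fire too often are pushed back toward the low target level.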
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has proved effective in recent years. However, current research focuses on feature extraction and classifier design and does not consider instance selection. Earlier research by the authors showed that instance selection can improve classification accuracy. However, no attention has so far been paid to the relationship between speech samples and features. Therefore, this paper proposes a new PD diagnosis algorithm that simultaneously selects speech samples and features, based on a relevant feature weighting algorithm and a multiple kernel method, so as to find their synergy effects and thereby improve classification accuracy. Experimental results showed that the proposed algorithm clearly improved classification accuracy. It obtained a mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. In addition, the proposed algorithm detected the synergy effects of speech samples and features, which is valuable for speech marker extraction.
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
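Of the methods compared above, the nonparametric K-nearest neighbor approach is the simplest to sketch. The example below is a generic majority-vote classifier; the two-dimensional feature vectors and class labels are made up and do not represent the texture features used in the study.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training samples for two cloud classes.
train_features = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (5.1, 4.9), (4.8, 5.2), (5.3, 5.0)]
train_labels = ["cirrus", "cirrus", "cirrus", "cumulus", "cumulus", "cumulus"]
predicted = knn_predict(train_features, train_labels, query=(0.2, 0.1))
```

Because the method makes no assumption about the shape of the class distributions, it can follow nonlinear class boundaries that a linear discriminant cannot.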
Affective Computing and the Impact of Gender and Age
Rukavina, Stefanie; Gruss, Sascha; Hoffmann, Holger; Tan, Jun-Wen; Walter, Steffen; Traue, Harald C.
2016-01-01
Affective computing aims at the detection of users’ mental states, in particular, emotions and dispositions during human-computer interactions. Detection can be achieved by measuring multimodal signals, namely, speech, facial expressions and/or psychobiology. Over the past years, one major approach was to identify the best features for each signal using different classification methods. Although this is of high priority, other subject-specific variables should not be neglected. In our study, we analyzed the effect of gender, age, personality and gender roles on the extracted psychobiological features (derived from skin conductance level, facial electromyography and heart rate variability) as well as the influence on the classification results. In an experimental human-computer interaction, five different affective states with picture material from the International Affective Picture System and ULM pictures were induced. A total of 127 subjects participated in the study. Among all potentially influencing variables (gender has been reported to be influential), age was the only variable that correlated significantly with psychobiological responses. In summary, the conducted classification processes resulted in 20% classification accuracy differences according to age and gender, especially when comparing the neutral condition with four other affective states. We suggest taking age and gender specifically into account for future studies in affective computing, as these may lead to an improvement of emotion recognition accuracy. PMID:26939129
DOE Office of Scientific and Technical Information (OSTI.GOV)
Getman, Daniel J
2008-01-01
Many attempts to observe changes in terrestrial systems over time would be significantly enhanced if it were possible to improve the accuracy of classifications of low-resolution historic satellite data. In an effort to examine improving the accuracy of historic satellite image classification by combining satellite and air photo data, two experiments were undertaken in which low-resolution multispectral data and high-resolution panchromatic data were combined and then classified using the ECHO spectral-spatial image classification algorithm and the Maximum Likelihood technique. The multispectral data consisted of 6 multispectral channels (30-meter pixel resolution) from Landsat 7. These data were augmented with panchromatic data (15m pixel resolution) from Landsat 7 in the first experiment, and with a mosaic of digital aerial photography (1m pixel resolution) in the second. The addition of the Landsat 7 panchromatic data provided a significant improvement in the accuracy of classifications made using the ECHO algorithm. Although the inclusion of aerial photography provided an improvement in accuracy, this improvement was only statistically significant at a 40-60% level. These results suggest that once error levels associated with combining aerial photography and multispectral satellite data are reduced, this approach has the potential to significantly enhance the precision and accuracy of classifications made using historic remotely sensed data, as a way to extend the time range of efforts to track temporal changes in terrestrial systems.
2011-01-01
Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64).
The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing. PMID:21849043
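The gap between overall accuracy and sensitivity reported above can be illustrated with a small sketch. The counts below are hypothetical and are not taken from the study; they only show how a classifier can score well on accuracy and specificity while missing most positive cases.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # share of true positives that are detected
    specificity = tn / (tn + fp)   # share of true negatives that are detected
    return accuracy, sensitivity, specificity

# Hypothetical counts: 10 positive cases (3 detected), 30 negative cases (all correct).
acc, sens, spec = binary_metrics(tp=3, fn=7, tn=30, fp=0)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

With an imbalanced sample such as this one, accuracy alone hides the low sensitivity, which is why the three measures are reported together.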
ERIC Educational Resources Information Center
Wyse, Adam E.; Babcock, Ben
2016-01-01
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
Analysis of data mining classification by comparison of C4.5 and ID3 algorithms
NASA Astrophysics Data System (ADS)
Sudrajat, R.; Irianingsih, I.; Krisnawan, D.
2017-01-01
Information technology has developed rapidly, driven by its intensive use. Data mining, for example, is widely used in investment. Of the many techniques that can assist in investment decisions, the one used here for classification is the decision tree. Decision trees come in a variety of algorithms, such as C4.5 and ID3. The two algorithms can generate different models, with different accuracy, for similar data sets. With discrete data, the C4.5 and ID3 algorithms give accuracies of 87.16% and 99.83%, respectively, and the C4.5 algorithm with numerical data gives 89.69%. With discrete data, the C4.5 and ID3 algorithms yield 520 and 598 customers, respectively, and the C4.5 algorithm with numerical data yields 546 customers. The analysis indicates that both algorithms classify quite well, because the error rate is less than 15%.
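One reason the two algorithms can build different trees from the same data is their split criterion: ID3 ranks attributes by information gain, while C4.5 uses the gain ratio, which divides the gain by the entropy of the attribute itself. The sketch below computes both for a toy attribute; the data and the names `income` and `invests` are hypothetical and do not come from the study.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def split_stats(values, labels):
    """Return (information gain, gain ratio) for a discrete attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder      # criterion used by ID3
    split_info = entropy(values)            # entropy of the attribute values
    ratio = gain / split_info if split_info > 0 else 0.0  # criterion used by C4.5
    return gain, ratio

# Hypothetical toy data set.
income = ["high", "high", "low", "low", "mid", "mid"]
invests = ["yes", "yes", "no", "no", "yes", "no"]
gain, ratio = split_stats(income, invests)
print(f"information gain={gain:.4f} gain ratio={ratio:.4f}")
```

The gain ratio penalizes attributes with many distinct values, so the two criteria may rank candidate splits differently.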
The limb movement analysis of rehabilitation exercises using wearable inertial sensors.
Bingquan Huang; Giggins, Oonagh; Kechadi, Tahar; Caulfield, Brian
2016-08-01
Because home-based exercise programs lack the supervision of a therapist, inertial sensor based feedback systems that can accurately assess movement repetitions are urgently required. The synchronicity and the degrees of freedom both show that one movement might resemble another movement signal which is mixed in with another not precisely defined movement. Therefore, data and feature selection are important for movement analysis. This paper explores data and feature selection for the limb movement analysis of rehabilitation exercises. The results highlight that the classification accuracy is very sensitive to the mount location of the sensors. The results show that the use of 2 or 3 sensor units, the combination of acceleration and gyroscope data, and feature sets that combine the statistical feature set with another type of feature can significantly improve the classification accuracy rates. The results illustrate that acceleration data is more effective than gyroscope data for most of the movement analysis.
Yarn-dyed fabric defect classification based on convolutional neural network
NASA Astrophysics Data System (ADS)
Jing, Junfeng; Dong, Amei; Li, Pengfei; Zhang, Kaibing
2017-09-01
Considering that manual inspection of the yarn-dyed fabric can be time consuming and inefficient, we propose a yarn-dyed fabric defect classification method by using a convolutional neural network (CNN) based on a modified AlexNet. CNN shows powerful ability in performing feature extraction and fusion by simulating the learning mechanism of human brain. The local response normalization layers in AlexNet are replaced by the batch normalization layers, which can enhance both the computational efficiency and classification accuracy. In the training process of the network, the characteristics of the defect are extracted step by step and the essential features of the image can be obtained from the fusion of the edge details with several convolution operations. Then the max-pooling layers, the dropout layers, and the fully connected layers are employed in the classification model to reduce the computation cost and extract more precise features of the defective fabric. Finally, the results of the defect classification are predicted by the softmax function. The experimental results show promising performance with an acceptable average classification rate and strong robustness on yarn-dyed fabric defect classification.
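The final step described above, turning network outputs into class predictions with the softmax function, can be sketched as follows. The score values and the three-class setup are hypothetical; the network itself is not reproduced here.

```python
from math import exp

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output scores of the last fully connected layer for three defect classes.
probs = softmax([2.0, 1.0, 0.1])
predicted = probs.index(max(probs))       # index of the most probable class
print(probs, predicted)
```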
Ceylan, Murat; Ceylan, Rahime; Ozbay, Yüksel; Kara, Sadik
2008-09-01
In biomedical signal classification, compressing the waveform data is vital because of the huge amount of data. This paper presents two different structures formed using feature extraction algorithms to decrease the size of the feature set in training and test data. The proposed structures, named wavelet transform-complex-valued artificial neural network (WT-CVANN) and complex wavelet transform-complex-valued artificial neural network (CWT-CVANN), use real and complex discrete wavelet transform for feature extraction. The aim of using the wavelet transform is to compress data and to reduce the training time of the network without decreasing the accuracy rate. In this study, the presented structures were applied to the problem of classification in carotid arterial Doppler ultrasound signals. Carotid arterial Doppler ultrasound signals were acquired from left carotid arteries of 38 patients and 40 healthy volunteers. The patient group included 22 males and 16 females with an established diagnosis of the early phase of atherosclerosis through coronary or aortofemoropopliteal (lower extremity) angiographies (mean age, 59 years; range, 48-72 years). Healthy volunteers were young non-smokers who seem to not bear any risk of atherosclerosis, including 28 males and 12 females (mean age, 23 years; range, 19-27 years). Sensitivity, specificity and average detection rate were calculated for comparison after the training and test phases of all structures finished. These parameters have demonstrated that training times of CVANN and real-valued artificial neural network (RVANN) were reduced using feature extraction algorithms without decreasing the accuracy rate, in accordance with our aim.
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. © 2009 American Society for Photogrammetry and Remote Sensing.
Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B
2015-03-01
Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation.
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-12-16
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis has recently become a research hotspot. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there must be some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noise, while traditional cloud classification methods are sensitive to noise and outliers, so it is hard for traditional cloud classification methods to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in the training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experimental results on FY-2G satellite cloud images show that the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency.
Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai
2017-01-01
Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed. PMID:28737705
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Sadowski, F. E.; Sarno, J. E.
1976-01-01
The author has identified the following significant results. A supervised classification within two separate ground areas of the Sam Houston National Forest was carried out for two sq meters spatial resolution MSS data. Data were progressively coarsened to simulate five additional cases of spatial resolution ranging up to 64 sq meters. Similar processing and analysis of all spatial resolutions enabled evaluations of the effect of spatial resolution on classification accuracy for various levels of detail and the effects on area proportion estimation for very general forest features. For very coarse resolutions, a subset of spectral channels which simulated the proposed thematic mapper channels was used to study classification accuracy.
The use of Landsat data to inventory cotton and soybean acreage in North Alabama
NASA Technical Reports Server (NTRS)
Downs, S. W., Jr.; Faust, N. L.
1980-01-01
This study was performed to determine if Landsat data could be used to improve the accuracy of the estimation of cotton acreage. A linear classification algorithm and a maximum likelihood algorithm were used for computer classification of the area, and the classification was compared with ground truth. The classification accuracy for some fields was greater than 90 percent; however, the overall accuracy was 71 percent for cotton and 56 percent for soybeans. The results of this research indicate that computer analysis of Landsat data has potential for improving upon the methods presently being used to determine cotton acreage; however, additional experiments and refinements are needed before the method can be used operationally.
NASA Technical Reports Server (NTRS)
Card, Don H.; Strong, Laurence L.
1989-01-01
An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
Analog design of a new neural network for optical character recognition.
Morns, I P; Dlay, S S
1999-01-01
An electronic circuit is presented for a new type of neural network, which gives a recognition rate of over 100 kHz. The network is used to classify handwritten numerals, presented as Fourier and wavelet descriptors, and has been shown to train far quicker than the popular backpropagation network while maintaining classification accuracy.
ERIC Educational Resources Information Center
McKown, Clark; Gumbiner, Laura M.; Johnson, Jason
2011-01-01
Social rejection is associated with a wide variety of negative outcomes. Early identification of social rejection and intervention to minimize its negative impact is thus important. However, sociometric methods, which are considered high in validity for identifying socially rejected children, are frequently not used because of (a) procedural…
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-01-01
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data is first fused with the fine resolution data to generate the time series fine resolution data. Temporal features are extracted from the fused data and added with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75% from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification has also been improved. The comparison showed that the mapped paddy area was analogous to the agricultural statistics at the district level. This work also highlighted the importance of feature selection to achieve higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification. PMID:28025525
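The two summary figures quoted above, overall accuracy and the kappa coefficient, are both derived from a confusion (error) matrix. A minimal sketch follows, using a hypothetical two-class matrix rather than the study's data.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i mapped to class j.
    """
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical confusion matrix (rows: reference, columns: map): rice vs. non-rice.
cm = [[40, 10],
      [5, 45]]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"overall accuracy={oa:.2%} kappa={kappa:.2f}")
```

Kappa is lower than overall accuracy because it discounts the agreement that would be expected by chance alone.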
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
Classification accuracy for stratification with remotely sensed data
Raymond L. Czaplewski; Paul L. Patterson
2003-01-01
Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an “error matrix,” which is...
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that, by learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on the Sleep-EDF dataset under various classification settings. The overall accuracies for the Awake/Sleep and 4-class classification settings are 98.32% and 94.49%, respectively. Furthermore, the superior accuracy is achieved by performing classification on a low-dimensional feature space derived from time and frequency domains and without the need for artifact removal as a preprocessing step.
Groom, Madeleine J; Young, Zoe; Hall, Charlotte L; Gillott, Alinda; Hollis, Chris
2016-09-30
There is a clinical need for objective evidence-based measures that are sensitive and specific to ADHD when compared with other neurodevelopmental disorders. This study evaluated the incremental validity of adding an objective measure of activity and computerised cognitive assessment to clinical rating scales to differentiate adult ADHD from Autism spectrum disorders (ASD). Adults with ADHD (n=33) or ASD (n=25) performed the QbTest, comprising a Continuous Performance Test with motion-tracker to record physical activity. QbTest parameters measuring inattention, impulsivity and hyperactivity were combined to provide a summary score ('QbTotal'). Binary stepwise logistic regression measured the probability of assignment to the ADHD or ASD group based on scores on the Conners Adult ADHD Rating Scale-subscale E (CAARS-E) and Autism Quotient (AQ10) in the first step and then QbTotal added in the second step. The model fit was significant at step 1 (CAARS-E, AQ10) with good group classification accuracy. These predictors were retained and QbTotal was added, resulting in a significant improvement in model fit and group classification accuracy. All predictors were significant. ROC curves indicated superior specificity of QbTotal. The findings present preliminary evidence that adding QbTest to clinical rating scales may improve the differentiation of ADHD and ASD in adults. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Bramley, Tom
2010-01-01
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
A hybrid three-class brain-computer interface system utilizing SSSEPs and transient ERPs
NASA Astrophysics Data System (ADS)
Breitwieser, Christian; Pokorny, Christoph; Müller-Putz, Gernot R.
2016-12-01
Objective. This paper investigates the fusion of steady-state somatosensory evoked potentials (SSSEPs) and transient event-related potentials (tERPs), evoked through tactile stimulation on the left and right-hand fingertips, in a three-class EEG based hybrid brain-computer interface. It was hypothesized that fusing the input signals leads to higher classification rates than classifying tERP and SSSEP individually. Approach. Fourteen subjects participated in the studies, consisting of a screening paradigm to determine person dependent resonance-like frequencies and a subsequent online paradigm. The whole setup of the BCI system was based on open interfaces, following suggestions for a common implementation platform. During the online experiment, subjects were instructed to focus their attention on the stimulated fingertips as indicated by a visual cue. The recorded data were classified during runtime using a multi-class shrinkage LDA classifier and the outputs were fused together applying a posterior probability based fusion. Data were further analyzed offline, involving a combined classification of SSSEP and tERP features as a second fusion principle. The final results were tested for statistical significance applying a repeated measures ANOVA. Main results. A significant classification increase was achieved when fusing the results with a combined classification compared to performing an individual classification. Furthermore, the SSSEP classifier was significantly better in detecting a non-control state, whereas the tERP classifier was significantly better in detecting control states. Subjects who had a higher relative band power increase during the screening session also achieved significantly higher classification results than subjects with lower relative band power increase. Significance. 
It could be shown that utilizing SSSEP and tERP for hBCIs increases the classification accuracy and also that tERP and SSSEP are not classifying control- and non-control states with the same level of accuracy.
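A posterior probability based fusion of two classifier outputs can be sketched as below. The abstract does not state the exact fusion rule, so the product rule with renormalization used here is an assumption, and the class order and probability values are hypothetical.

```python
def fuse_posteriors(p_a, p_b):
    """Fuse two posterior vectors with the product rule (an assumed fusion rule)."""
    joint = [a * b for a, b in zip(p_a, p_b)]
    total = sum(joint)
    return [j / total for j in joint]     # renormalize so the result sums to 1

# Hypothetical posteriors over three classes: left hand, right hand, non-control.
p_sssep = [0.5, 0.3, 0.2]
p_terp = [0.4, 0.5, 0.1]
fused = fuse_posteriors(p_sssep, p_terp)
decision = fused.index(max(fused))        # class with the highest fused posterior
print(fused, decision)
```

In this toy case the two classifiers disagree on the most likely class, and the fused posterior settles the decision using the confidence of both.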
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but it is time-consuming and tedious work when performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages performed by radiologists depends only on their experience, so computer-aided technology is very important to aid diagnostic accuracy. In this study, to improve the performance of tumor detection, we investigated a comparative approach of different segmentation techniques and selected the best one by comparing their segmentation scores. Further, to improve the classification accuracy, the genetic algorithm is employed for the automatic classification of tumor stage. The decision of classification stage is supported by extracting relevant features and area calculation. The experimental results of the proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93, demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average of 93.79% dice similarity index coefficient, which indicates better overlap between the automatically extracted tumor regions and the tumor regions manually extracted by radiologists.
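The dice similarity coefficient mentioned above measures the overlap between two segmentations as twice the shared area divided by the sum of both areas. A minimal sketch, representing each mask as a set of pixel coordinates; the masks are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two masks given as collections of pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0                         # two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical masks: automated vs. manual tumor region, 4 pixels each, 2 shared.
automated = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual = {(0, 1), (1, 1), (1, 2), (2, 1)}
dice = dice_coefficient(automated, manual)
print(dice)
```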
Thematic accuracy of the National Land Cover Database (NLCD) 2001 land cover for Alaska
Selkowitz, D.J.; Stehman, S.V.
2011-01-01
The National Land Cover Database (NLCD) 2001 Alaska land cover classification is the first 30-m resolution land cover product available covering the entire state of Alaska. The accuracy assessment of the NLCD 2001 Alaska land cover classification employed a geographically stratified three-stage sampling design to select the reference sample of pixels. Reference land cover class labels were determined via fixed wing aircraft, as the high resolution imagery used for determining the reference land cover classification in the conterminous U.S. was not available for most of Alaska. Overall thematic accuracy for the Alaska NLCD was 76.2% (s.e. 2.8%) at Level II (12 classes evaluated) and 83.9% (s.e. 2.1%) at Level I (6 classes evaluated) when agreement was defined as a match between the map class and either the primary or alternate reference class label. When agreement was defined as a match between the map class and primary reference label only, overall accuracy was 59.4% at Level II and 69.3% at Level I. The majority of classification errors occurred at Level I of the classification hierarchy (i.e., misclassifications were generally to a different Level I class, not to a Level II class within the same Level I class). Classification accuracy was higher for more abundant land cover classes and for pixels located in the interior of homogeneous land cover patches. © 2011.
The study of vehicle classification equipment with solutions to improve accuracy in Oklahoma.
DOT National Transportation Integrated Search
2014-12-01
The accuracy of vehicle counting and classification data is vital for appropriate future highway and road design, including determining pavement characteristics, eliminating traffic jams, and improving safety. Organizations relying on vehicle cla...
Peatland classification of West Siberia based on Landsat imagery
NASA Astrophysics Data System (ADS)
Terentieva, I.; Glagolev, M.; Lapshina, E.; Maksyutov, S. S.
2014-12-01
Increasing interest in peatlands for prediction of environmental changes requires an understanding of their geographical distribution. The West Siberian Plain is the largest peatland area in Eurasia and is situated in the high latitudes, which are experiencing an enhanced rate of climate change. West Siberian taiga mires are important globally, accounting for about 12.5% of the global wetland area. A number of peatland maps of West Siberia were developed in the 1970s, but their accuracy is limited. Here we report on an effort to map West Siberian peatlands using 30 m resolution Landsat imagery. As a first step, a peatland classification scheme oriented toward upscaling of environmental parameters was developed. The overall workflow involves data pre-processing, training data collection, image classification on a scene-by-scene basis, regrouping of the derived classes into final peatland types, and accuracy assessment. To avoid misclassification, peatlands were distinguished from other landscapes using a threshold method: for each scene, the Green-Red Vegetation Index was used for peatland masking and the 5th channel was used for masking water bodies. Peatland image masks were made in Quantum GIS, filtered in MATLAB and then classified in Multispec (Purdue Research Foundation) using the maximum likelihood algorithm of the supervised classification method. Training sample selection was mostly based on spectral signatures because of limited ancillary and high-resolution image data. As an additional source of information, we applied our field knowledge from more than 10 years of fieldwork in West Siberia, summarized in an extensive dataset of botanical relevés, field photos, and pH and electrical conductivity data from 40 test sites. After the classification procedure, the discriminated spectral classes were generalized into 12 peatland types. Overall accuracy assessment was based on 439 randomly assigned test sites and showed a final map accuracy of 80%. Total peatland area was estimated at 73.0 Mha. 
Various ridge-hollow and ridge-hollow-pool bog complexes prevail here occupying 34.5 Mha. They are followed by lakes (11.1 Mha), fens (10.7 Mha), pine-dwarf-shrub sphagnum bogs (9.3 Mha) and palsa complexes (7.4 Mha).
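The threshold masking step described in this abstract can be sketched as follows. The green-red vegetation index is commonly defined as (green - red) / (green + red); the threshold values, their direction, the function names and the pixel values below are all assumptions for illustration, since the abstract does not state the per-scene thresholds.

```python
def grvi(green, red):
    """Green-red vegetation index: (green - red) / (green + red)."""
    total = green + red
    return 0.0 if total == 0 else (green - red) / total


def peatland_mask(pixels, grvi_min, band5_min):
    """Return one boolean per pixel: True if kept as candidate peatland.

    pixels: list of dicts with 'green', 'red' and 'band5' reflectances.
    grvi_min and band5_min are hypothetical per-scene thresholds; a low
    band 5 value is treated here as open water and masked out.
    """
    return [
        grvi(p["green"], p["red"]) >= grvi_min and p["band5"] >= band5_min
        for p in pixels
    ]


# Synthetic reflectances (not values from the study).
pixels = [
    {"green": 0.10, "red": 0.06, "band5": 0.20},  # vegetated, not water
    {"green": 0.05, "red": 0.04, "band5": 0.01},  # dark in band 5: water
    {"green": 0.08, "red": 0.12, "band5": 0.25},  # red > green: masked out
]
mask = peatland_mask(pixels, grvi_min=0.1, band5_min=0.05)
print(mask)
```

Masking before classification restricts the maximum likelihood classifier to candidate peatland pixels, which is the stated purpose of the thresholding step.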
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Byun, Wonwoo; Lee, Jung-Min; Kim, Youngwon; Brusseau, Timothy A
2018-03-26
This study examined the accuracy of the Fitbit activity tracker (FF) for quantifying sedentary behavior (SB) and varying intensities of physical activity (PA) in 3-5-year-old children. Twenty-eight healthy preschool-aged children (Girls: 46%, Mean age: 4.8 ± 1.0 years) wore the FF and were directly observed while performing a set of unstructured and structured free-living activities from sedentary to vigorous intensity. The classification accuracy of the FF for measuring SB, light PA (LPA), moderate-to-vigorous PA (MVPA), and total PA (TPA) was examined by calculating Pearson correlation coefficients (r), mean absolute percent error (MAPE), Cohen's kappa (k), sensitivity (Se), specificity (Sp), and area under the receiver operating characteristic curve (ROC-AUC). The classification accuracies of the FF (ROC-AUC) were 0.92, 0.63, 0.77 and 0.92 for SB, LPA, MVPA and TPA, respectively. Similarly, values of kappa, Se, Sp and percentage of correct classification were consistently high for SB and TPA, but low for LPA and MVPA. The FF demonstrated excellent classification accuracy for assessing SB and TPA, but lower accuracy for classifying LPA and MVPA. Our findings suggest that the FF should be considered a valid instrument for assessing time spent sedentary and overall physical activity in preschool-aged children.
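Cohen's kappa, one of the agreement measures listed in this abstract, corrects the raw agreement rate for the agreement expected by chance. The following is a minimal sketch of the standard formula; the function name and the two label sequences are assumptions for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(labels_a)
    # Observed agreement: share of epochs where both sources give the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(
        count_a[label] * count_b[label] for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


# Synthetic per-epoch intensity labels (device vs. direct observation).
device = ["SB", "SB", "LPA", "MVPA", "SB", "LPA"]
direct_obs = ["SB", "SB", "LPA", "MVPA", "LPA", "LPA"]
kappa = cohens_kappa(device, direct_obs)
print(round(kappa, 3))
```

In this toy case the raw agreement is 5/6, while kappa is lower (17/23) because some of that agreement would be expected by chance alone.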
Wang, Juan; Nishikawa, Robert M; Yang, Yongyi
2016-01-01
In computer-aided detection of microcalcifications (MCs), the detection accuracy is often compromised by frequent occurrence of false positives (FPs), which can be attributed to a number of factors, including imaging noise, inhomogeneity in tissue background, linear structures, and artifacts in mammograms. In this study, the authors investigated a unified classification approach for combating the adverse effects of these heterogeneous factors for accurate MC detection. To accommodate FPs caused by different factors in a mammogram image, the authors developed a classification model to which the input features were adapted according to the image context at a detection location. For this purpose, the input features were defined in two groups, of which one group was derived from the image intensity pattern in a local neighborhood of a detection location, and the other group was used to characterize how a MC is different from its structural background. Owing to the distinctive effect of linear structures in the detector response, the authors introduced a dummy variable into the unified classifier model, which allowed the input features to be adapted according to the image context at a detection location (i.e., presence or absence of linear structures). To suppress the effect of inhomogeneity in tissue background, the input features were extracted from different domains aimed for enhancing MCs in a mammogram image. To demonstrate the flexibility of the proposed approach, the authors implemented the unified classifier model by two widely used machine learning algorithms, namely, a support vector machine (SVM) classifier and an Adaboost classifier. In the experiment, the proposed approach was tested for two representative MC detectors in the literature [difference-of-Gaussians (DoG) detector and SVM detector]. 
The detection performance was assessed using free-response receiver operating characteristic (FROC) analysis on a set of 141 screen-film mammogram (SFM) images (66 cases) and a set of 188 full-field digital mammogram (FFDM) images (95 cases). The FROC analysis results show that the proposed unified classification approach can significantly improve the detection accuracy of two MC detectors on both SFM and FFDM images. Despite the difference in performance between the two detectors, the unified classifiers can reduce their FP rate to a similar level in the output of the two detectors. In particular, with true-positive rate at 85%, the FP rate on SFM images for the DoG detector was reduced from 1.16 to 0.33 clusters/image (unified SVM) and 0.36 clusters/image (unified Adaboost), respectively; similarly, for the SVM detector, the FP rate was reduced from 0.45 clusters/image to 0.30 clusters/image (unified SVM) and 0.25 clusters/image (unified Adaboost), respectively. Similar FP reduction results were also achieved on FFDM images for the two MC detectors. The proposed unified classification approach can be effective for discriminating MCs from FPs caused by different factors (such as MC-like noise patterns and linear structures) in MC detection. The framework is general and can be applicable for further improving the detection accuracy of existing MC detectors.
Cho, Ming-Yuan; Hoang, Thi Thom
2017-01-01
Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.
Chocolate Classification by an Electronic Nose with Pressure Controlled Generated Stimulation
Valdez, Luis F.; Gutiérrez, Juan Manuel
2016-01-01
In this work, we will analyze the response of a Metal Oxide Gas Sensor (MOGS) array to a flow controlled stimulus generated in a pressure controlled canister produced by a homemade olfactometer to build an E-nose. The built E-nose is capable of chocolate identification between the 26 analyzed chocolate bar samples and four features recognition (chocolate type, extra ingredient, sweetener and expiration date status). The data analysis tools used were Principal Components Analysis (PCA) and Artificial Neural Networks (ANNs). The chocolate identification E-nose average classification rate was of 81.3% with 0.99 accuracy (Acc), 0.86 precision (Prc), 0.84 sensitivity (Sen) and 0.99 specificity (Spe) for test. The chocolate feature recognition E-nose gives a classification rate of 85.36% with 0.96 Acc, 0.86 Prc, 0.85 Sen and 0.96 Spe. In addition, a preliminary sample aging analysis was made. The results prove the pressure controlled generated stimulus is reliable for this type of studies. PMID:27775628
Ground Truth Sampling and LANDSAT Accuracy Assessment
NASA Technical Reports Server (NTRS)
Robinson, J. W.; Gunther, F. J.; Campbell, W. J.
1982-01-01
It is noted that the key factor in any accuracy assessment of remote sensing data is the method used for determining the ground truth, independent of the remote sensing data itself. The sampling and accuracy procedures developed for nuclear power plant siting study are described. The purpose of the sampling procedure was to provide data for developing supervised classifications for two study sites and for assessing the accuracy of that and the other procedures used. The purpose of the accuracy assessment was to allow the comparison of the cost and accuracy of various classification procedures as applied to various data types.
Multi-Temporal Classification and Change Detection Using UAV Images
NASA Astrophysics Data System (ADS)
Makuti, S.; Nex, F.; Yang, M. Y.
2018-05-01
In this paper, different methodologies for the classification and change detection of UAV image blocks are explored. A UAV is not only the cheapest platform for image acquisition but also the easiest to operate in repeated data collections over a changing area such as a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features has been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification, a Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method in terms of overall accuracy: post-classification reached an accuracy of up to 62.6%, whereas pre-classification change detection reached 46.5%. These results represent a first useful indication for future work and development.
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Talai, Sahand; Boelmans, Kai; Sedlacik, Jan; Forkert, Nils D.
2017-03-01
Parkinsonian syndromes encompass a spectrum of neurodegenerative diseases, which can be classified into various subtypes. The differentiation of these subtypes is typically conducted based on clinical criteria. Due to the overlap of intra-syndrome symptoms, the accurate differential diagnosis based on clinical guidelines remains a challenge with failure rates up to 25%. The aim of this study is to present an image-based classification method of patients with Parkinson's disease (PD) and patients with progressive supranuclear palsy (PSP), an atypical variant of PD. Therefore, apparent diffusion coefficient (ADC) parameter maps were calculated based on diffusion-tensor magnetic resonance imaging (MRI) datasets. Mean ADC values were determined in 82 brain regions using an atlas-based approach. The extracted mean ADC values for each patient were then used as features for classification using a linear kernel support vector machine classifier. To increase the classification accuracy, a feature selection was performed, which resulted in the top 17 attributes to be used as the final input features. A leave-one-out cross validation based on 56 PD and 21 PSP subjects revealed that the proposed method is capable of differentiating PD and PSP patients with an accuracy of 94.8%. In conclusion, the classification of PD and PSP patients based on ADC features obtained from diffusion MRI datasets is a promising new approach for the differentiation of Parkinsonian syndromes in the broader context of decision support systems.
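The leave-one-out cross-validation scheme mentioned in this abstract can be sketched without external libraries. Note the substitution: the study used a linear-kernel support vector machine, whereas the sketch below swaps in a much simpler nearest-centroid classifier so that it stays self-contained. The function names and the two-feature toy data are assumptions for illustration.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (squared Euclidean)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)


def leave_one_out_accuracy(xs, ys):
    """Hold out each subject once, train on the rest, and count correct predictions."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


# Synthetic "mean ADC" features for two regions per subject (not study data).
features = [[0.70, 0.80], [0.72, 0.79], [0.69, 0.83],
            [0.90, 1.00], [0.88, 1.05], [0.93, 0.98]]
labels = ["PD", "PD", "PD", "PSP", "PSP", "PSP"]
loo_accuracy = leave_one_out_accuracy(features, labels)
print(loo_accuracy)
```

Leave-one-out is a natural choice for small cohorts such as the 77 subjects here, because every subject serves as a test case while the training set stays almost complete. For an unbiased estimate, any feature selection would need to be repeated inside each fold.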
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1984-01-01
An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.
NASA Astrophysics Data System (ADS)
Hariharan, Harishwaran; Aklaghi, Nima; Baker, Clayton A.; Rangwala, Huzefa; Kosecka, Jana; Sikdar, Siddhartha
2016-04-01
In spite of major advances in biomechanical design of upper extremity prosthetics, these devices continue to lack intuitive control. Conventional myoelectric control strategies typically utilize electromyography (EMG) signal amplitude sensed from forearm muscles. EMG has limited specificity in resolving deep muscle activity and poor signal-to-noise ratio. We have been investigating alternative control strategies that rely on real-time ultrasound imaging that can overcome many of the limitations of EMG. In this work, we present an ultrasound image sequence classification method that utilizes spatiotemporal features to describe muscle activity and classify motor intent. Ultrasound images of the forearm muscles were obtained from able-bodied subjects and a trans-radial amputee while they attempted different hand movements. A grid-based approach is used to test the feasibility of using spatiotemporal features by classifying hand motions performed by the subjects. Using leave-one-out cross validation on image sequences acquired from able-bodied subjects, we observe that the grid-based approach is able to discern four hand motions with 95.31% accuracy. In the case of the trans-radial amputee, we are able to discern three hand motions with 80% accuracy. In a second set of experiments, we study classification accuracy by extracting spatiotemporal sub-sequences that depict activity due to the motion of local anatomical interfaces. Short, time- and space-limited cuboidal sequences are initially extracted and assigned an optical flow behavior label, based on a response function. The image space is clustered based on the location of cuboids, and features are calculated from the cuboids in each cluster. Using sequences of known motions, we extract feature vectors that describe said motion. A K-nearest neighbor classifier is designed for classification experiments. 
Using leave-one-out cross validation on image sequences for an amputee subject, we demonstrate that the classifier is able to discern three important hand motions with an accuracy of 93.33%, 91-100% precision and 80-100% recall. We anticipate that ultrasound imaging based methods will address some limitations of conventional myoelectric sensing, while adding advantages inherent to ultrasound imaging.
Xiao, Bo; Imel, Zac E; Georgiou, Panayiotis G; Atkins, David C; Narayanan, Shrikanth S
2015-01-01
The technology for evaluating patient-provider interactions in psychotherapy, observational coding, has not changed in 70 years. It is labor-intensive, error prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings was evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.
NASA Technical Reports Server (NTRS)
Rignot, Eric; Williams, Cynthia; Way, Jobea; Viereck, Leslie
1993-01-01
A maximum a posteriori Bayesian classifier for multifrequency polarimetric SAR data is used to perform a supervised classification of forest types in the floodplains of Alaska. The image classes include white spruce, balsam poplar, black spruce, alder, non-forests, and open water. The authors investigate the effect on classification accuracy of changing environmental conditions, and of frequency and polarization of the signal. The highest classification accuracy (86 percent correctly classified forest pixels, and 91 percent overall) is obtained by combining fully polarimetric L- and C-band data on a date when the forest is just recovering from flooding. The forest map compares favorably with a vegetation map assembled from digitized aerial photos, which took five years to complete and addresses the state of the forest in 1978, ignoring subsequent fires, changes in the course of the river, clear-cutting of trees, and tree growth. HV-polarization is the most useful polarization at L- and C-band for classification. C-band VV (ERS-1 mode) and L-band HH (J-ERS-1 mode) alone or combined yield unsatisfactory classification accuracies. Additional data acquired in the winter season during thawed and frozen days yield classification accuracies respectively 20 percent and 30 percent lower due to a greater confusion between conifers and deciduous trees. Data acquired at the peak of flooding in May 1991 also yield classification accuracies 10 percent lower because of dominant trunk-ground interactions which mask out finer differences in radar backscatter between tree species. Combination of several of these dates does not improve classification accuracy. For comparison, panchromatic optical data acquired by SPOT in the summer season of 1991 are used to classify the same area. 
The classification accuracy (78 percent for the forest types and 90 percent if open water is included) is lower than that obtained with AIRSAR although conifers and deciduous trees are better separated due to the presence of leaves on the deciduous trees. Optical data do not separate black spruce and white spruce as well as SAR data, cannot separate alder from balsam poplar, and are of course limited by the frequent cloud cover in the polar regions. Yet, combining SPOT and AIRSAR offers better chances to identify vegetation types independent of ground truth information using a combination of NDVI indexes from SPOT, biomass numbers from AIRSAR, and a segmentation map from either one.
NASA Astrophysics Data System (ADS)
Karakacan Kuzucu, A.; Bektas Balcik, F.
2017-11-01
Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. However, obtaining this information remains a challenge, especially in heterogeneous landscapes covering urban and rural areas, because of spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of combining spectral vegetation indices with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO) and the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and the support vector machine (SVM) algorithm were applied to classify the SPOT 7 image. The selected study region, Catalca, is located in the northwest of Istanbul, Turkey, and has a complex landscape covering artificial surface, forest and natural area, agricultural field, quarry/mining area, pasture/scrubland and water body. Accuracy assessment of all classified images was performed through overall accuracy and the kappa coefficient. The results indicated that the incorporation of these three vegetation indices decreased the classification accuracy for both the MLC and SVM classifications. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification approach in both overall accuracy and kappa statistics.
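The three vegetation indices named in this abstract have standard definitions in terms of near-infrared (NIR) and red reflectance, and appending them to a pixel's spectral values is one way to use them as additional bands. The sketch below assumes the commonly used SAVI soil adjustment factor L = 0.5; the function names, the band order and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - R) / (NIR + R)."""
    return (nir - red) / (nir + red)


def ratio_index(nir, red):
    """Ratio Vegetation Index: NIR / R."""
    return nir / red


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index: (1 + L)(NIR - R) / (NIR + R + L)."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def add_index_bands(pixel):
    """Append NDVI, RATIO and SAVI to a pixel given as [blue, green, red, nir]."""
    red, nir = pixel[2], pixel[3]
    return pixel + [ndvi(nir, red), ratio_index(nir, red), savi(nir, red)]


# Synthetic reflectances for one vegetated pixel (not data from the study).
pixel = [0.04, 0.08, 0.10, 0.50]
stacked = add_index_bands(pixel)
print(stacked)
```

Because all three indices are functions of the same two bands, they are strongly correlated with each other and with the original bands, which is one plausible reason why adding them need not improve a classifier.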
Bozkurt, Selen; Bostanci, Asli; Turhan, Murat
2017-08-11
The goal of this study is to evaluate the results of machine learning methods for classifying the OSA severity of patients with suspected sleep-disordered breathing as normal, mild, moderate or severe based on non-polysomnographic variables: 1) clinical data, 2) symptoms and 3) physical examination. To produce classification models for OSA severity, five different machine learning methods (Bayesian network, Decision Tree, Random Forest, Neural Networks and Logistic Regression) were trained, while relevant variables and their relationships were derived empirically from observed data. Each model was trained and evaluated using 10-fold cross-validation. To evaluate the classification performance of all methods, true positive rate (TPR), false positive rate (FPR), Positive Predictive Value (PPV), F measure and Area Under the Receiver Operating Characteristic curve (ROC-AUC) were used. Results of the 10-fold cross-validated tests with different variable settings indicated that the OSA severity of suspected OSA patients can be classified using non-polysomnographic features, with a highest true positive rate of 0.71 and a lowest false positive rate of 0.15. Moreover, the test results of different variable settings revealed that the accuracy of the classification models was significantly improved when physical examination variables were added to the model. Study results showed that machine learning methods can be used to estimate the probabilities of no, mild, moderate, and severe obstructive sleep apnea, and that such approaches may improve the accuracy of initial OSA screening and help refer only the suspected moderate or severe OSA patients to sleep laboratories for the expensive tests.
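The evaluation measures listed in this abstract all follow from the four cells of a confusion matrix. The sketch below shows the usual definitions for a single class treated one-vs-rest; the function name and the counts are hypothetical and were chosen only so that the TPR and FPR match the 0.71 and 0.15 quoted above.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return TPR, FPR, PPV and F-measure from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # true positive rate (sensitivity, recall)
    fpr = fp / (fp + tn)  # false positive rate (1 - specificity)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    f_measure = 2 * ppv * tpr / (ppv + tpr)  # harmonic mean of PPV and TPR
    return {"TPR": tpr, "FPR": fpr, "PPV": ppv, "F": f_measure}


# Hypothetical counts for one severity class versus the rest (not study data).
scores = binary_metrics(tp=71, fp=15, fn=29, tn=85)
```

For a four-class problem such as normal/mild/moderate/severe, these values would be computed per class and then averaged; the abstract does not state which averaging was used.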
NASA Astrophysics Data System (ADS)
Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude
2010-02-01
Diabetic retinopathy (DR) is a chronic eye disease for which early detection is essential to avoid any fatal results. Image processing of retinal images has emerged as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification, which is a significant technique for detecting abnormalities in the eye. Various automated classification systems have been developed in recent years, but most of them lack high classification accuracy. Artificial neural networks are a widely preferred artificial intelligence technique because they yield superior results in terms of classification accuracy. In this work, a Radial Basis Function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR images from normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed against the results of a probabilistic classifier, namely the Bayesian classifier, to show the superior nature of the neural classifier. Experimental results are promising for the neural classifier in terms of the performance measures.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Umut, İlhan; Çentik, Güven
2016-01-01
The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present. PMID:27213008
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden Markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0% and 92.4% in level I and level II subfamily classification, respectively, while SVM has a reported accuracy of 88.4% and 86.3%. This is a 39.7% and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families, as shown by applying it to the superfamily of nuclear receptors with 94.5%, 97.8% and 93.6% accuracy in family, level I and level II subfamily classification, respectively. Copyright 2005 Wiley-Liss, Inc.
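The "reduction in residual error" figures in this abstract can be reproduced from the quoted accuracies. The sketch below assumes the usual definition (the relative drop in error rate, where error is 100 minus accuracy); the function name is illustrative.

```python
def residual_error_reduction(baseline_accuracy, new_accuracy):
    """Relative reduction in error rate, in percent; accuracies are percentages."""
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return 100.0 * (baseline_error - new_error) / baseline_error


# Accuracies quoted in the abstract: SVM baseline versus Naive Bayes.
level_1 = residual_error_reduction(88.4, 93.0)
level_2 = residual_error_reduction(86.3, 92.4)
```

Rounded to one decimal place, the two values agree with the 39.7% and 44.5% stated in the abstract.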
Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang
2016-08-01
Motion classification systems based on surface electromyography (sEMG) pattern recognition have achieved good results under experimental conditions, but clinical implementation and practical application remain a challenge. Many factors contribute to the difficulty of clinical use of EMG-based dexterous control. The most obvious and important is the noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of the signal and other biological signals such as the electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction of classification accuracy caused by electrode shift. The average classification accuracy of our method was above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and found a strong correlation (correlation coefficient >0.9) between shift position data and normal position data.
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has the potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted in 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilks' Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All six commonly occurring gregarious tree species in western Himalaya, viz., white oak, brown oak, chir pine, blue pine, cedar and fir, could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). The classification accuracy achieved with Hyperion bands was significantly higher than that achieved with Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). The study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in hilly terrain.
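The Spectral Angle Mapper mentioned above assigns a pixel to the reference spectrum with which it forms the smallest angle. A minimal sketch of that rule follows; the three-band reference spectra and class names are invented for illustration and are not Hyperion measurements.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def sam_classify(pixel, references):
    """Return the name of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))


# Hypothetical three-band reference spectra (illustrative values only).
references = {"oak": [0.1, 0.2, 0.6], "pine": [0.1, 0.15, 0.4]}

# A pixel that is a brighter copy of the "oak" spectrum.
label = sam_classify([0.2, 0.4, 1.2], references)
```

Because the angle ignores vector length, a uniformly brighter or darker copy of a spectrum is assigned to the same class, which is one reason the method is often used in terrain with variable illumination.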
NASA Astrophysics Data System (ADS)
Hall-Brown, Mary
The heterogeneity of Arctic vegetation can make land cover classification very difficult when using medium to small resolution imagery (Schneider et al., 2009; Muller et al., 1999). Using high radiometric and spatial resolution imagery, such as that from the SPOT 5 and IKONOS satellites, has helped Arctic land cover classification accuracies rise into the 80 and 90 percentiles (Allard, 2003; Stine et al., 2010; Muller et al., 1999). However, those increases usually come at a high price. High resolution imagery is very expensive and can often add tens of thousands of dollars onto the cost of the research. The EO-1 satellite launched in 2002 carries two sensors that have high spectral and/or high spatial resolutions and can be an acceptable compromise between the resolution versus cost issues. The Hyperion is a hyperspectral sensor with the capability of collecting 242 spectral bands of information. The Advanced Land Imager (ALI) is an advanced multispectral sensor whose spatial resolution can be sharpened to 10 meters. This dissertation compares the accuracies of Arctic land cover classifications produced by the Hyperion and ALI sensors to the classification accuracies produced by the Système Pour l'Observation de la Terre (SPOT), the Landsat Thematic Mapper (TM) and the Landsat Enhanced Thematic Mapper Plus (ETM+) sensors. Hyperion and ALI images from August 2004 were collected over the Upper Kuparuk River Basin, Alaska. Image processing included the stepwise discriminant analysis of pixels that were positively classified from coinciding ground control points, geometric and radiometric correction, and principal component analysis. Finally, stratified random sampling was used to perform accuracy assessments on satellite derived land cover classifications. Accuracy was estimated from an error matrix (confusion matrix) that provided the overall, producer's and user's accuracies.
This research found that while the Hyperion sensor produced classification accuracies that were equivalent to the TM and ETM+ sensors (approximately 78%), the Hyperion could not obtain the accuracy of the SPOT 5 HRV sensor. However, the land cover classifications derived from the ALI sensor exceeded most classification accuracies derived from the TM and ETM+ sensors and were even comparable to most SPOT 5 HRV classifications (87%). With the deactivation of the Landsat series satellites, the uninterrupted monitoring of remote locations throughout the world, such as in the Arctic, is in jeopardy. The utilization of the Hyperion and ALI sensors is a way to keep that endeavor operational. By keeping the ALI sensor active at all times, uninterrupted observation of the entire Earth can be accomplished. Keeping the Hyperion sensor as a "tasked" sensor can provide scientists with additional imagery and options for their studies without overburdening storage.
Evaluation of space SAR as a land-cover classification
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Williams, T. H. L.
1985-01-01
The multidimensional approach to the mapping of land cover, crops, and forests is reported. Dimensionality is achieved by using data from sensors such as LANDSAT to augment Seasat and Shuttle Imaging Radar (SIR) data, using different image features such as tone and texture, and acquiring multidate data. Seasat, Shuttle Imaging Radar (SIR-A), and LANDSAT data are used both individually and in combination to map land cover in Oklahoma. The results indicate that radar is the best single sensor (72% accuracy) and produces the best sensor combination (97.5% accuracy) for discriminating among five land cover categories. Multidate Seasat data and a single date of LANDSAT coverage are then used in a crop classification study of western Kansas. The highest accuracy for a single channel is achieved using a Seasat scene, which produces a classification accuracy of 67%. Classification accuracy increases to approximately 75% when either a multidate Seasat combination or LANDSAT data in a multisensor combination is used. The tonal and textural elements of SIR-A data are then used both alone and in combination to classify forests into five categories.
Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens, hyperspectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such hyperspectral data poses a challenge to current techniques for analyzing data. Conventional classification methods may not be useful without dimension reduction pre-processing, so dimension reduction has become a significant part of hyperspectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using wavelet decomposition could be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture polynomial trends, while the Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets is compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction yields more separable classes and better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, it is found that the classification performance of Daubechies wavelets is better than that of the Haar wavelet, while Daubechies wavelets take more time than the Haar wavelet. The experimental results demonstrate that the classification system consistently provides over 84% classification accuracy.
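The idea of wavelet-based spectral reduction can be illustrated with the Haar case, which needs only pairwise sums and differences. The sketch below keeps the approximation coefficients after a chosen number of decomposition levels, halving the number of features each time. It is an illustration of the general technique under stated assumptions, not the paper's pipeline; the function names and the eight-band spectrum are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar decomposition level: returns (approximation, detail) lists.

    If the input has odd length, the last sample is ignored in this sketch.
    """
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) / math.sqrt(2))  # scaled pairwise average
        detail.append((a - b) / math.sqrt(2))  # scaled pairwise difference
    return approx, detail


def haar_reduce(spectrum, levels):
    """Keep only the approximation coefficients after `levels` decompositions."""
    reduced = list(spectrum)
    for _ in range(levels):
        reduced, _detail = haar_step(reduced)
    return reduced


# Hypothetical 8-band pixel spectrum reduced to 2 features with 2 levels.
features = haar_reduce([1, 1, 2, 2, 3, 3, 4, 4], levels=2)
```

A Daubechies wavelet would replace the two-tap sum and difference with longer filters, which is consistent with the abstract's note that it costs more time than Haar.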
ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.
Rosenfield, George H.; Fitzpatrick-Lins, Katherine
1984-01-01
Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts correct, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0.51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
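A common coefficient of agreement for an error matrix is Cohen's kappa, which uses the off-diagonal information through the row and column totals. The sketch below computes the total proportion correct and kappa for a whole map. It assumes kappa is the coefficient meant here, and the 2x2 matrix is hypothetical rather than data from the report.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, kappa) for a square error matrix of counts."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Observed agreement: proportion of counts on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement expected from the marginal totals alone.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-category error matrix (rows: map, columns: reference).
overall, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

In this example 85% of the counts are correct, but half of that agreement would be expected by chance, so kappa is lower than the raw percent correct.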
Research on Remote Sensing Image Classification Based on Feature Level Fusion
NASA Astrophysics Data System (ADS)
Yuan, L.; Zhu, G.
2018-04-01
Remote sensing image classification, an important direction of remote sensing image processing and application, has been widely studied. However, existing classification algorithms still suffer from misclassification and missed points, which keeps the final classification accuracy low. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources and propose a classification method based on feature level fusion. We compare three kinds of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform) and then select the best fused image for the classification experiment. In the classification process, we choose four kinds of image classification algorithms (i.e., Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) for a comparison experiment. We use overall classification precision and the Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of the fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than that of the other methods. Among the four classification algorithms, the fused image is best suited to Support Vector Machine classification, with an overall classification precision of 94.01% and a Kappa coefficient of 0.91. The image fused from Sentinel-1A and Landsat8 OLI not only has more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial for improving the accuracy and stability of remote sensing image classification.
Significance of perceptually relevant image decolorization for scene classification
NASA Astrophysics Data System (ADS)
Viswanathan, Sowmya; Divakaran, Govind; Soman, Kutti Padanyl
2017-11-01
Color images contain luminance and chrominance components representing the intensity and color information, respectively. The objective of this paper is to show the significance of incorporating chrominance information to the task of scene classification. An improved color-to-grayscale image conversion algorithm that effectively incorporates chrominance information is proposed using the color-to-gray structure similarity index and singular value decomposition to improve the perceptual quality of the converted grayscale images. The experimental results based on an image quality assessment for image decolorization and its success rate (using the Cadik and COLOR250 datasets) show that the proposed image decolorization technique performs better than eight existing benchmark algorithms for image decolorization. In the second part of the paper, the effectiveness of incorporating the chrominance component for scene classification tasks is demonstrated using a deep belief network-based image classification system developed using dense scale-invariant feature transforms. The amount of chrominance information incorporated into the proposed image decolorization technique is confirmed with the improvement to the overall scene classification accuracy. Moreover, the overall scene classification performance improved by combining the models obtained using the proposed method and conventional decolorization methods.
Crossmodal Congruency Benefits of Tactile and Visual Signalling
2013-11-12
modal information format seemed to produce faster and more accurate performance. The question of learning complex tactile communication signals... We conducted an experiment in which tactile messages were created based on five common military arm and hand signals. We...compared response times and accuracy rates of novice individuals responding to visual and tactile representations of these messages, which were
Azami, Hamed; Escudero, Javier
2015-08-01
Breast cancer is one of the most common types of cancer in women all over the world. Early diagnosis of this kind of cancer can significantly increase the chances of long-term survival. Since diagnosis of breast cancer is a complex problem, neural network (NN) approaches have been used as a promising solution. Considering the low speed of the back-propagation (BP) algorithm to train a feed-forward NN, we consider a number of improved NN trainings for the Wisconsin breast cancer dataset: BP with momentum, BP with adaptive learning rate, BP with adaptive learning rate and momentum, Polak-Ribière conjugate gradient algorithm (CGA), Fletcher-Reeves CGA, Powell-Beale CGA, scaled CGA, resilient BP (RBP), one-step secant and quasi-Newton methods. An NN ensemble, which is a learning paradigm to combine a number of NN outputs, is used to improve the accuracy of the classification task. Results demonstrate that NN ensemble-based classification methods have better performance than NN-based algorithms. The highest overall average accuracy is 97.68% obtained by NN ensemble trained by RBP for 50%-50% training-test evaluation method.
Can single classifiers be as useful as model ensembles to produce benthic seabed substratum maps?
NASA Astrophysics Data System (ADS)
Turner, Joseph A.; Babcock, Russell C.; Hovey, Renae; Kendrick, Gary A.
2018-05-01
Numerous machine-learning classifiers are available for benthic habitat map production, which can lead to different results. This study highlights the performance of the Random Forest (RF) classifier, which was significantly better than Classification Trees (CT), Naïve Bayes (NB), and a multi-model ensemble in terms of overall accuracy, Balanced Error Rate (BER), Kappa, and area under the curve (AUC) values. RF accuracy was often higher than 90% for each substratum class, even at the most detailed level of the substratum classification, and AUC values also indicated excellent performance (0.8-1). Total agreement between classifiers was high at the broadest level of classification (75-80%) when differentiating between hard and soft substratum. However, this sharply declined as the number of substratum categories increased (19-45%) to include a mix of rock, gravel, pebbles, and sand. The model ensemble, produced from the results of all three classifiers by majority voting, did not show any increase in predictive performance when compared to the single RF classifier. This study shows that a single classifier may be sufficient to produce benthic seabed maps, without the need for model ensembles of multiple classifiers.
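The majority-voting ensemble described above can be sketched in a few lines. The example assumes each classifier has already produced one label per map cell; the label lists are invented, and the tie-breaking behaviour (the label seen first wins) is a property of this sketch, not something the abstract specifies.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority voting.

    `predictions` holds one list of labels per classifier, all of equal length.
    Ties go to the label that appears first among the classifiers, because
    Counter.most_common keeps first-seen order for equal counts.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical labels from three classifiers for three map cells.
rf_labels = ["rock", "sand", "sand"]
ct_labels = ["rock", "rock", "sand"]
nb_labels = ["sand", "rock", "sand"]
ensemble = majority_vote([rf_labels, ct_labels, nb_labels])
```

When the classifiers disagree often, as the abstract reports for the detailed substratum levels, the vote is decided by narrow margins, which may help explain why the ensemble did not outperform the single RF classifier.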
Skin Lesion Analysis towards Melanoma Detection Using Deep Learning Network.
Li, Yuexiang; Shen, Linlin
2018-02-11
Skin lesions are a severe disease globally. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, the accurate recognition of melanoma is extremely challenging due to the following reasons: low contrast between lesions and skin, visual similarity between melanoma and non-melanoma lesions, etc. Hence, reliable automatic detection of skin tumors is very useful to increase the accuracy and efficiency of pathologists. In this paper, we proposed two deep learning methods to address three main tasks emerging in the area of skin lesion image processing, i.e., lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straight-forward CNN is proposed for the dermoscopic feature extraction task. The proposed deep learning frameworks were evaluated on the ISIC 2017 dataset. Experimental results show the promising accuracies of our frameworks, i.e., 0.753 for task 1, 0.848 for task 2 and 0.912 for task 3 were achieved.
A Step Towards EEG-based Brain Computer Interface for Autism Intervention*
Fan, Jing; Wade, Joshua W.; Bian, Dayi; Key, Alexandra P.; Warren, Zachary E.; Mion, Lorraine C.; Sarkar, Nilanjan
2017-01-01
Autism Spectrum Disorder (ASD) is a prevalent and costly neurodevelopmental disorder. Individuals with ASD often have deficits in social communication skills as well as adaptive behavior skills related to daily activities. We have recently designed a novel virtual reality (VR) based driving simulator for driving skill training for individuals with ASD. In this paper, we explored the feasibility of detecting engagement level, emotional states, and mental workload during VR-based driving using EEG as a first step towards a potential EEG-based Brain Computer Interface (BCI) for assisting autism intervention. We used spectral features of EEG signals from a 14-channel EEG neuroheadset, together with therapist ratings of behavioral engagement, enjoyment, frustration, boredom, and difficulty to train a group of classification models. Seven classification methods were applied and compared including Bayes network, naïve Bayes, Support Vector Machine (SVM), multilayer perceptron, K-nearest neighbors (KNN), random forest, and J48. The classification results were promising, with over 80% accuracy in classifying engagement and mental workload, and over 75% accuracy in classifying emotional states. Such results may lead to an adaptive closed-loop VR-based skill training system for use in autism intervention. PMID:26737113
Kim, Junghoe; Calhoun, Vince D; Shim, Eunsoo; Lee, Jong-Hwan
2016-01-01
Functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging data are commonly employed to study neuropsychiatric conditions by using pattern classifiers such as the support vector machine (SVM). Meanwhile, a deep neural network (DNN) with multiple hidden layers has shown its ability to systematically extract lower-to-higher level information of image and speech data from lower-to-higher hidden layers, markedly enhancing classification accuracy. The objective of this study was to adopt the DNN for whole-brain resting-state FC pattern classification of schizophrenia (SZ) patients vs. healthy controls (HCs) and identification of aberrant FC patterns associated with SZ. We hypothesized that the lower-to-higher level features learned via the DNN would significantly enhance the classification accuracy, and proposed an adaptive learning algorithm to explicitly control the weight sparsity in each hidden layer via L1-norm regularization. Furthermore, the weights were initialized via stacked autoencoder based pre-training to further improve the classification performance. Classification accuracy was systematically evaluated as a function of (1) the number of hidden layers/nodes, (2) the use of L1-norm regularization, (3) the use of the pre-training, (4) the use of framewise displacement (FD) removal, and (5) the use of anatomical/functional parcellation. Using FC patterns from anatomically parcellated regions without FD removal, an error rate of 14.2% was achieved by employing three hidden layers and 50 hidden nodes with both L1-norm regularization and pre-training, which was substantially lower than the error rate from the SVM (22.3%). Moreover, the trained DNN weights (i.e., the learned features) were found to represent the hierarchical organization of aberrant FC patterns in SZ compared with HC. 
Specifically, pairs of nodes extracted from the lower hidden layer represented sparse FC patterns implicated in SZ, which was quantified by using kurtosis/modularity measures and features from the higher hidden layer showed holistic/global FC patterns differentiating SZ from HC. Our proposed schemes and reported findings attained by using the DNN classifier and whole-brain FC data suggest that such approaches show improved ability to learn hidden patterns in brain imaging data, which may be useful for developing diagnostic tools for SZ and other neuropsychiatric disorders and identifying associated aberrant FC patterns. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. 
The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
The impact of missing trauma data on predicting massive transfusion
Trickey, Amber W.; Fox, Erin E.; del Junco, Deborah J.; Ning, Jing; Holcomb, John B.; Brasel, Karen J.; Cohen, Mitchell J.; Schreiber, Martin A.; Bulger, Eileen M.; Phelan, Herb A.; Alarcon, Louis H.; Myers, John G.; Muskat, Peter; Cotton, Bryan A.; Wade, Charles E.; Rahbar, Mohammad H.
2013-01-01
INTRODUCTION Missing data are inherent in clinical research and may be especially problematic for trauma studies. This study describes a sensitivity analysis to evaluate the impact of missing data on clinical risk prediction algorithms. Three blood transfusion prediction models were evaluated utilizing an observational trauma dataset with valid missing data. METHODS The PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study included patients requiring ≥ 1 unit of red blood cells (RBC) at 10 participating U.S. Level I trauma centers from July 2009 – October 2010. Physiologic, laboratory, and treatment data were collected prospectively up to 24h after hospital admission. Subjects who received ≥ 10 RBC units within 24h of admission were classified as massive transfusion (MT) patients. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation. A sensitivity analysis for missing data was conducted to determine the upper and lower bounds for correct classification percentages. RESULTS PROMMTT enrolled 1,245 subjects. MT was received by 297 patients (24%). Missing percentage ranged from 2.2% (heart rate) to 45% (respiratory rate). Proportions of complete cases utilized in the MT prediction models ranged from 41% to 88%. All models demonstrated similar correct classification percentages using complete case analysis and multiple imputation. In the sensitivity analysis, correct classification upper-lower bound ranges per model were 4%, 10%, and 12%. Predictive accuracy for all models using PROMMTT data was lower than reported in the original datasets. CONCLUSIONS Evaluating the accuracy of clinical prediction models with missing data can be misleading, especially with many predictor variables and moderate levels of missingness per variable. The proposed sensitivity analysis describes the influence of missing data on risk prediction algorithms. 
Reporting upper/lower bounds for percent correct classification may be more informative than multiple imputation, which provided similar results to complete case analysis in this study. PMID:23778514
Using spectrotemporal indices to improve the fruit-tree crop classification accuracy
NASA Astrophysics Data System (ADS)
Peña, M. A.; Liao, R.; Brenning, A.
2017-06-01
This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.
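As an illustration of the feature construction described in this abstract, the sketch below builds a normalized difference index, (a - b) / (a + b), for every pair of band values in a pixel's time series. It is a minimal sketch, not the authors' code: the function name, the flattened date-by-band layout, and the reflectance values are assumptions made for the example.

```python
from itertools import combinations


def normalized_difference_indices(bands):
    """Return {(i, j): NDI} for every unordered pair of band values.

    `bands` is a flat list of reflectances for one pixel, pooled over
    all image dates, so a pair may combine bands from different dates.
    """
    ndis = {}
    for i, j in combinations(range(len(bands)), 2):
        total = bands[i] + bands[j]
        # Guard against division by zero for completely dark pixels.
        ndis[(i, j)] = (bands[i] - bands[j]) / total if total != 0 else 0.0
    return ndis


# Hypothetical pixel: 2 dates x 3 bands, flattened into one vector.
pixel = [0.05, 0.08, 0.40, 0.06, 0.10, 0.30]
features = normalized_difference_indices(pixel)
print(len(features))  # 6 values give 6 * 5 / 2 = 15 pairs
```

The number of indices grows quadratically with the number of band values, which is consistent with the abstract's remark that the enhanced feature set is high dimensional and calls for penalized LDA.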
AVHRR composite period selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Multitemporal satellite image datasets provide valuable information on the phenological characteristics of vegetation, thereby significantly increasing the accuracy of cover type classifications compared to single date classifications. However, the processing of these datasets can become very complex when dealing with multitemporal data combined with multispectral data. Advanced Very High Resolution Radiometer (AVHRR) biweekly composite data are commonly used to classify land cover over large regions. Selecting a subset of these biweekly composite periods may be required to reduce the complexity and cost of land cover mapping. The objective of our research was to evaluate the effect of reducing the number of composite periods and altering the spacing of those composite periods on classification accuracy. Because inter-annual variability can have a major impact on classification results, 5 years of AVHRR data were evaluated. AVHRR biweekly composite images for spectral channels 1-4 (visible, near-infrared and two thermal bands) covering the entire growing season were used to classify 14 cover types over the entire state of Colorado for each of five different years. A supervised classification method was applied to maintain consistent procedures for each case tested. Results indicate that the number of composite periods can be halved (reduced from 14 composite dates to seven) without significantly reducing overall classification accuracy (80.4% Kappa accuracy for the 14-composite dataset as compared to 80.0% for a seven-composite dataset). At least seven composite periods were required to ensure the classification accuracy was not affected by inter-annual variability due to climate fluctuations. 
Concentrating more composites near the beginning and end of the growing season, as compared to using evenly spaced time periods, consistently produced slightly higher classification values over the 5 years tested (average Kappa of 80.3% for the heavy early/late case as compared to 79.0% for the alternate dataset case).
Liu, Siqi; Oh, Heesoo; Chambers, David William; Xu, Tianmin; Baumrind, Sheldon
2018-04-06
Determine optimal weightings of Peer Assessment Rating (PAR) index and Discrepancy Index (DI) for malocclusion severity assessment in Chinese orthodontic patients. Sixty-nine Chinese orthodontists assessed a full set of pre-treatment records from a stratified random sample of 120 subjects gathered from six university orthodontic centres. Using professional judgment as the outcome variable, multiple regression analyses were performed to derive customized weighting systems for the PAR index and DI, for all subjects and each Angle classification subgroup. Professional judgment was consistent, with an Intraclass Correlation Coefficient (ICC) of 0.995. The PAR index or DI can be reliably measured, with ICC = 0.959 and 0.990, respectively. The predictive accuracy of PAR index was greatly improved by the Chinese weighting process (from r = 0.431 to r = 0.788) with almost equal distribution in each Angle classification subgroup. The Chinese-weighted DI showed a higher predictive accuracy, at P = 0.01, compared with the PAR index (r = 0.851 versus r = 0.788). A better performance was found in the Class II group (r = 0.890) when compared to Class I (r = 0.736) and III (r = 0.785) groups. The Chinese-weighted PAR index and DI were capable of predicting 62 per cent and 73 per cent of total variance in the professional judgment of malocclusion severity in Chinese patients. Differential prediction across Angle classifications merits attention since different weighting formulas were found.
Hao, Pengyu; Wang, Li; Niu, Zheng
2015-01-01
A range of single classifiers have been proposed to classify crop types using time series vegetation indices, and hybrid classifiers are used to improve discriminatory power. Traditional fusion rules use the product of multi-single classifiers, but that strategy cannot integrate the classification output of machine learning classifiers. In this research, the performance of two hybrid strategies, multiple voting (M-voting) and probabilistic fusion (P-fusion), for crop classification using NDVI time series were tested with different training sample sizes at both pixel and object levels, and two representative counties in north Xinjiang were selected as study area. The single classifiers employed in this research included Random Forest (RF), Support Vector Machine (SVM), and See 5 (C 5.0). The results indicated that classification performance improved (increased the mean overall accuracy by 5%~10%, and reduced standard deviation of overall accuracy by around 1%) substantially with the training sample number, and when the training sample size was small (50 or 100 training samples), hybrid classifiers substantially outperformed single classifiers with higher mean overall accuracy (1%~2%). However, when abundant training samples (4,000) were employed, single classifiers could achieve good classification accuracy, and all classifiers obtained similar performances. Additionally, although object-based classification did not improve accuracy, it resulted in greater visual appeal, especially in study areas with a heterogeneous cropping pattern. PMID:26360597
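The two hybrid strategies named in this abstract can be sketched as follows. The abstract does not define them precisely, so this is a hedged illustration only: M-voting is taken here as a plain majority vote over hard labels, and P-fusion as an average of the per-class probabilities reported by each classifier. The function names and the probability values are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by the most classifiers."""
    return Counter(labels).most_common(1)[0][0]


def probabilistic_fusion(prob_rows):
    """Average per-class probabilities across classifiers, then take the argmax.

    Averaging is an assumption; the abstract does not state the fusion rule.
    """
    n_classes = len(prob_rows[0])
    mean = [sum(row[k] for row in prob_rows) / len(prob_rows)
            for k in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return best, mean


# Hypothetical class probabilities for one pixel from three classifiers
# (for example RF, SVM and C5.0) over three crop classes.
probs = [
    [0.45, 0.40, 0.15],
    [0.10, 0.80, 0.10],
    [0.50, 0.35, 0.15],
]
hard_labels = [max(range(3), key=row.__getitem__) for row in probs]

vote = majority_vote(hard_labels)
fused, mean_probs = probabilistic_fusion(probs)
print(vote, fused)
```

In this made-up case the vote picks class 0 (two of three classifiers), while the averaged probabilities favor class 1 because one classifier is very confident. The two rules can therefore disagree on the same pixel.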
Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp
2015-01-01
Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
2013-01-01
Background and purpose Guidelines for fracture treatment and evaluation require a valid classification. Classifications especially designed for children are available, but they might lead to reduced accuracy, considering the relative infrequency of childhood fractures in a general orthopedic department. We tested the reliability and accuracy of the Müller classification when used for long bone fractures in children. Methods We included all long bone fractures in children aged < 16 years who were treated in 2008 at the surgical ward of Stavanger University Hospital. 20 surgeons recorded 232 fractures. Datasets were generated for intra- and inter-rater analysis, as well as a reference dataset for accuracy calculations. We present proportion of agreement (PA) and kappa (κ) statistics. Results For intra-rater analysis, overall agreement (κ) was 0.75 (95% CI: 0.68–0.81) and PA was 79%. For inter-rater assessment, κ was 0.71 (95% CI: 0.61–0.80) and PA was 77%. Accuracy was estimated: κ = 0.72 (95% CI: 0.64–0.79) and PA = 76%. Interpretation The Müller classification (slightly adjusted for pediatric fractures) showed substantial to excellent accuracy among general orthopedic surgeons when applied to long bone fractures in children. However, separate knowledge about the child-specific fracture pattern, the maturity of the bone, and the degree of displacement must be considered when the treatment and the prognosis of the fractures are evaluated. PMID:23245225
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.
Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
Word pair classification during imagined speech using direct brain recordings
NASA Astrophysics Data System (ADS)
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-05-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70-150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
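The abstract does not say which non-linear time alignment was built into the SVM kernel. One widely used technique of that kind is dynamic time warping (DTW), so the sketch below shows DTW and a kernel-like similarity derived from it purely as an assumption. The function names, the `gamma` parameter, and the toy sequences are hypothetical, and a kernel of this form is not guaranteed to be positive semidefinite.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def dtw_similarity(a, b, gamma=1.0):
    """Map the DTW distance to a similarity in (0, 1]; gamma is an assumed width."""
    return math.exp(-gamma * dtw_distance(a, b))


# Same shape, but the second trace starts one sample later.
trace_a = [0, 1, 2, 1, 0]
trace_b = [0, 0, 1, 2, 1, 0]
print(dtw_distance(trace_a, trace_b))    # 0.0, the delay is absorbed by warping
print(dtw_similarity(trace_a, trace_b))  # 1.0
```

The point of the example is that two traces differing only in timing are treated as identical, which is the property an alignment-aware kernel needs when speech events do not occur at fixed latencies.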
Word pair classification during imagined speech using direct brain recordings
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-01-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications. PMID:27165452
Comparing ecoregional classifications for natural areas management in the Klamath Region, USA
Sarr, Daniel A.; Duff, Andrew; Dinger, Eric C.; Shafer, Sarah L.; Wing, Michael; Seavy, Nathaniel E.; Alexander, John D.
2015-01-01
We compared three existing ecoregional classification schemes (Bailey, Omernik, and World Wildlife Fund) with two derived schemes (Omernik Revised and Climate Zones) to explore their effectiveness in explaining species distributions and to better understand natural resource geography in the Klamath Region, USA. We analyzed presence/absence data derived from digital distribution maps for trees, amphibians, large mammals, small mammals, migrant birds, and resident birds using three statistical analyses of classification accuracy (Analysis of Similarity, Canonical Analysis of Principal Coordinates, and Classification Strength). The classifications were roughly comparable in classification accuracy, with Omernik Revised showing the best overall performance. Trees showed the strongest fidelity to the classifications, and large mammals showed the weakest fidelity. We discuss the implications for regional biogeography and describe how intermediate resolution ecoregional classifications may be appropriate for use as natural areas management domains.
A GSA-SVM Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paper hybridizes the gravitational search algorithm (GSA) with the support vector machine (SVM) to build a novel GSA-SVM hybrid system that improves classification accuracy in binary problems. GSA is a heuristic optimization tool used to optimize the value of the SVM kernel parameter (in this paper, the radial basis function (RBF) is chosen as the kernel function). The experimental results show that this new approach can achieve high classification accuracy and is comparable to or better than particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Typicality effects in artificial categories: is there a hemisphere difference?
Richards, L G; Chiarello, C
1990-07-01
In category classification tasks, typicality effects are usually found: accuracy and reaction time depend upon distance from a prototype. In this study, subjects learned either verbal or nonverbal dot pattern categories, followed by a lateralized classification task. Comparable typicality effects were found in both reaction time and accuracy across visual fields for both verbal and nonverbal categories. Both hemispheres appeared to use a similarity-to-prototype matching strategy in classification. This indicates that merely having a verbal label does not differentiate classification in the two hemispheres.
Multi-site evaluation of IKONOS data for classification of tropical coral reef environments
Andrefouet, S.; Kramer, Philip; Torres-Pulliza, D.; Joyce, K.E.; Hochberg, E.J.; Garza-Perez, R.; Mumby, P.J.; Riegl, Bernhard; Yamano, H.; White, W.H.; Zubia, M.; Brock, J.C.; Phinn, S.R.; Naseer, A.; Hatcher, B.G.; Muller-Karger, F. E.
2003-01-01
Ten IKONOS images of different coral reef sites distributed around the world were processed to assess the potential of 4-m resolution multispectral data for coral reef habitat mapping. Complexity of reef environments, established by field observation, ranged from 3 to 15 classes of benthic habitats containing various combinations of sediments, carbonate pavement, seagrass, algae, and corals in different geomorphologic zones (forereef, lagoon, patch reef, reef flats). Processing included corrections for sea surface roughness and bathymetry, unsupervised or supervised classification, and accuracy assessment based on ground-truth data. IKONOS classification results were compared with classified Landsat 7 imagery for simple to moderate complexity of reef habitats (5-11 classes). For both sensors, overall accuracies of the classifications show a general linear trend of decreasing accuracy with increasing habitat complexity. The IKONOS sensor performed better, with a 15-20% improvement in accuracy compared to Landsat. For IKONOS, overall accuracy was 77% for 4-5 classes, 71% for 7-8 classes, 65% in 9-11 classes, and 53% for more than 13 classes. The Landsat classification accuracy was systematically lower, with an average of 56% for 5-10 classes. Within this general trend, inter-site comparisons and specificities demonstrate the benefits of different approaches. Pre-segmentation of the different geomorphologic zones and depth correction provided different advantages in different environments. Our results help guide scientists and managers in applying IKONOS-class data for coral reef mapping applications. © 2003 Elsevier Inc. All rights reserved.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the classification of transportation and vehicular modes using big data from smartphone sensors. The three types of sensors used in this paper are the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms (decision trees, K-nearest neighbor, and support vector machine) to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy; the support vector machine provides the best classification accuracy but requires the longest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the classification of transportation and vehicular modes using big data from smartphone sensors. The three types of sensors used in this paper are the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms (decision trees, K-nearest neighbor, and support vector machine) to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy; the support vector machine provides the best classification accuracy but requires the longest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes. PMID:27548182
NASA Astrophysics Data System (ADS)
Zamora Ramos, Ernesto
Artificial Intelligence is a big part of automation and with today's technological advances, artificial intelligence has taken great strides towards positioning itself as the technology of the future to control, enhance and perfect automation. Computer vision includes pattern recognition and classification and machine learning. Computer vision is at the core of decision making and it is a vast and fruitful branch of artificial intelligence. In this work, we expose novel algorithms and techniques built upon existing technologies to improve pattern recognition and neural network training, initially motivated by a multidisciplinary effort to build a robot that helps maintain and optimize solar panel energy production. Our contributions detail an improved non-linear pre-processing technique to enhance poorly illuminated images based on modifications to the standard histogram equalization for an image. While the original motivation was to improve nocturnal navigation, the results have applications in surveillance, search and rescue, medical imaging enhancing, and many others. We created a vision system for precise camera distance positioning motivated to correctly locate the robot for capture of solar panel images for classification. The classification algorithm marks solar panels as clean or dirty for later processing. Our algorithm extends past image classification and, based on historical and experimental data, it identifies the optimal moment in which to perform maintenance on marked solar panels as to minimize the energy and profit loss. In order to improve upon the classification algorithm, we delved into feedforward neural networks because of their recent advancements, proven universal approximation and classification capabilities, and excellent recognition rates. 
We explore state-of-the-art neural network training techniques offering pointers and insights, culminating in the implementation of a complete library with support for modern deep learning architectures, multilayer perceptrons and convolutional neural networks. Our research with neural networks has encountered many difficulties regarding hyperparameter estimation for good training convergence rate and accuracy. Most hyperparameters, including architecture, learning rate, regularization, trainable parameters (or weights) initialization, and so on, are chosen via a trial and error process with some educated guesses. However, we developed the first quantitative method to compare weight initialization strategies, a critical hyperparameter choice during training, to estimate among a group of candidate strategies which would make the network converge to the highest classification accuracy faster with high probability. Our method provides a quick, objective measure to compare initialization strategies to select the best possible among them beforehand without having to complete multiple training sessions for each candidate strategy to compare final results.
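For context on the image enhancement step, the sketch below shows standard histogram equalization, the baseline that the abstract says was modified. It is not the author's improved technique, whose details are not given here. The function name and the sample pixel values are assumptions made for the example.

```python
def equalize_histogram(pixels, levels=256):
    """Standard histogram equalization for a flat list of integer gray levels."""
    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    denom = len(pixels) - cdf_min
    if denom == 0:
        return list(pixels)  # constant image, nothing to stretch

    # Lookup table mapping each input level to a stretched output level.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / denom)) for c in cdf]
    return [lut[p] for p in pixels]


# Hypothetical dark image: all values crowded between 10 and 13.
dark = [10, 10, 11, 12, 12, 13]
print(equalize_histogram(dark))  # [0, 0, 64, 191, 191, 255]
```

The crowded input range 10 to 13 is spread over the full 0 to 255 range, which is why equalization is a natural starting point for enhancing poorly illuminated images.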
NASA Astrophysics Data System (ADS)
Dondurur, Mehmet
The primary objective of this study was to determine the degree to which modern SAR systems can be used to obtain information about the Earth's vegetative resources. Information obtainable from microwave synthetic aperture radar (SAR) data was compared with that obtainable from LANDSAT-TM and SPOT data. Three hypotheses were tested: (a) Classification of land cover/use from SAR data can be accomplished on a pixel-by-pixel basis with the same overall accuracy as from LANDSAT-TM and SPOT data. (b) Classification accuracy for individual land cover/use classes will differ between sensors. (c) Combining information derived from optical and SAR data into an integrated monitoring system will improve overall and individual land cover/use class accuracies. The study was conducted with three data sets for the Sleeping Bear Dunes test site in the northwestern part of Michigan's lower peninsula, including an October 1982 LANDSAT-TM scene, a June 1989 SPOT scene and C-, L- and P-Band radar data from the Jet Propulsion Laboratory AIRSAR. Reference data were derived from the Michigan Resource Information System (MIRIS) and available color infrared aerial photos. Classification and rectification of data sets were done using ERDAS Image Processing Programs. Classification algorithms included Maximum Likelihood, Mahalanobis Distance, Minimum Spectral Distance, ISODATA, Parallelepiped, and Sequential Cluster Analysis. Classified images were rectified as necessary so that all were at the same scale and oriented north-up. Results were analyzed with contingency tables and percent correctly classified (PCC) and Cohen's Kappa (CK) as accuracy indices using CSLANT and ImagePro programs developed for this study. Accuracy analyses were based upon a 1.4 by 6.5 km area with its long axis east-west. 
Reference data for this subscene total 55,770 15 by 15 m pixels with sixteen cover types, including seven level III forest classes, three level III urban classes, two level II range classes, two water classes, one wetland class and one agriculture class. An initial analysis was made without correcting the 1978 MIRIS reference data to the different dates of the TM, SPOT and SAR data sets. In this analysis, highest overall classification accuracy (PCC) was 87% with the TM data set, with both SPOT and C-Band SAR at 85%, a difference statistically significant at the 0.05 level. When the reference data were corrected for land cover change between 1978 and 1991, classification accuracy with the C-Band SAR data increased to 87%. Classification accuracy differed from sensor to sensor for individual land cover classes. Combining sensors into hypothetical multi-sensor systems resulted in higher accuracies than for any single sensor. Combining LANDSAT-TM and C-Band SAR yielded an overall classification accuracy (PCC) of 92%. The results of this study indicate that C-Band SAR data provide an acceptable substitute for LANDSAT-TM or SPOT data when land cover information is desired for areas where cloud cover obscures the terrain. Even better results can be obtained by integrating TM and C-Band SAR data into a multi-sensor system.
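The two accuracy indices named in this abstract, percent correctly classified (PCC) and Cohen's Kappa, can both be read off a contingency table. The following is a minimal sketch of the standard definitions, not the CSLANT or ImagePro code developed for the study; the table layout (rows as reference classes, columns as assigned classes) and the example counts are assumptions made for illustration.

```python
def pcc_and_kappa(table):
    """Return (PCC as a fraction, Cohen's kappa) for a square contingency table.

    Assumed layout: table[i][j] is the number of pixels whose reference
    class is i and whose assigned class is j.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of pixels on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 3-class table, not data from the study.
example = [
    [50, 5, 5],
    [10, 20, 0],
    [0, 5, 5],
]
pcc, kappa = pcc_and_kappa(example)
print(f"PCC = {pcc:.2%}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement that would arise by chance given the class proportions, which is why it is usually lower than PCC for the same table.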
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-01-01
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis has recently become a research hotspot. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there is inevitably some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noise, while traditional cloud classification methods are sensitive to noise and outliers, so it is hard for them to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in the training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experimental results on FY-2G satellite cloud images show that the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency. PMID:27999261
Hettige, Nuwan C; Nguyen, Thai Binh; Yuan, Chen; Rajakulendran, Thanara; Baddour, Jermeen; Bhagwat, Nikhil; Bani-Fatemi, Ali; Voineskos, Aristotle N; Mallar Chakravarty, M; De Luca, Vincenzo
2017-07-01
Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability that a patient with schizophrenia will attempt suicide. We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms using regularized regression, random forest, elastic net and support vector machine models, with sociocultural and clinical variables as features to train the models. All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. The support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated an accuracy of 65% and an AUC of 0.71. Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts. Copyright © 2017 Elsevier Inc. All rights reserved.
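The abstract reports both accuracy and AUC for each model. As a rough illustration of how the two numbers differ, the sketch below computes accuracy at a fixed threshold and AUC as the fraction of (positive, negative) pairs that the score ranks correctly. The labels, scores and the 0.5 threshold are made-up assumptions, not values from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = attempter) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy = {acc:.3f}, AUC = {auc:.3f}")
```

Accuracy depends on the chosen threshold, while AUC summarizes the ranking over all thresholds, so two models with the same accuracy can have different AUC values, as in the figures quoted above.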
Pediatric Surgeon-Directed Wound Classification Improves Accuracy
Zens, Tiffany J.; Rusy, Deborah A.; Gosain, Ankush
2015-01-01
Background Surgical wound classification (SWC) communicates the degree of contamination in the surgical field and is used to stratify risk of surgical site infection and compare outcomes amongst centers. We hypothesized that changing from nurse-directed to surgeon-directed SWC during a structured operative debrief would improve accuracy of documentation. Methods An IRB-approved retrospective chart review was performed. Two time periods were defined: initially, SWC was determined and recorded by the circulating nurse (Pre-Debrief 6/2012-5/2013); then, after allowing six months for adoption and education, we implemented a structured operative debriefing including surgeon-directed SWC (Post-Debrief 1/2014-8/2014). Accuracy of SWC was determined for four commonly performed Pediatric General Surgery operations: inguinal hernia repair (clean), gastrostomy +/− Nissen fundoplication (clean-contaminated), appendectomy without perforation (contaminated), and appendectomy with perforation (dirty). Results 183 cases Pre-Debrief and 142 cases Post-Debrief met inclusion criteria. No differences between time periods were noted with regard to patient demographics, ASA class, or case mix. Accuracy of wound classification improved Post-Debrief (42% vs. 58.5%, p=0.003). Pre-Debrief, 26.8% of cases were overestimated or underestimated by more than one wound class, vs. 3.5% of cases Post-Debrief (p<0.001). Interestingly, the majority of Post-Debrief contaminated cases were incorrectly classified as clean-contaminated. Conclusions Implementation of a structured operative debrief including surgeon-directed SWC improves the percentage of correctly classified wounds and decreases the degree of inaccuracy in incorrectly classified cases. However, following implementation of the debriefing, we still observed a 41.5% rate of incorrect documentation, most notably in contaminated cases, indicating further education and process improvement is needed. PMID:27020829
Kraken: ultrafast metagenomic sequence classification using exact alignments
2014-01-01
Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/. PMID:24580807
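To make the idea of classification by exact k-mer matching concrete, here is a toy sketch. It is not Kraken's implementation or API: the real program's database format and its handling of taxonomy are not reproduced, and the function names, the tiny reference sequences and the majority-vote rule over unambiguous k-mers are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """Map each k-mer to the set of labels whose reference contains it."""
    index = {}
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(label)
    return index


def classify(read, index, k):
    """Label a read by majority vote over its exactly matching k-mers.

    Simplification: k-mers shared by several labels are ignored here.
    Returns None when no k-mer of the read is found in the index.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        labels = index.get(kmer)
        if labels is not None and len(labels) == 1:
            votes[next(iter(labels))] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical reference sequences, far shorter than real genomes.
references = {"taxon_A": "ACGTACGTGA", "taxon_B": "TTGCATTGCC"}
index = build_index(references, k=4)

print(classify("GTACGTG", index, k=4))  # k-mers found only in taxon_A
print(classify("TGCATT", index, k=4))   # k-mers found only in taxon_B
print(classify("GGGGGG", index, k=4))   # no k-mer in the index
```

Because each lookup is a dictionary hit rather than an alignment, the cost per read grows with the number of k-mers in the read, not with the size of the reference collection, which is the intuition behind the speed figures quoted in the abstract.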
Brain-Computer Interface Based on Generation of Visual Images
Bobrov, Pavel; Frolov, Alexander; Cantor, Charles; Fedulova, Irina; Bakhnyan, Mikhail; Zhavoronkov, Alexander
2011-01-01
This paper examines the task of recognizing EEG patterns that correspond to performing three mental tasks: relaxation and imagining two types of pictures, faces and houses. The experiments were performed using two EEG headsets: BrainProducts ActiCap and Emotiv EPOC. The Emotiv headset is becoming widely used in consumer BCI applications, which will allow large-scale EEG experiments to be conducted in the future. Since classification accuracy significantly exceeded the level of random classification during the first three days of the experiment with the EPOC headset, a control experiment was performed on the fourth day using ActiCap. The control experiment has shown that utilization of high-quality research equipment can enhance classification accuracy (up to 68% in some subjects) and that the accuracy is independent of the presence of EEG artifacts related to blinking and eye movement. This study also shows that a computationally inexpensive Bayesian classifier based on covariance matrix analysis yields classification accuracy in this problem similar to that of a more sophisticated Multi-class Common Spatial Patterns (MCSP) classifier. PMID:21695206
NASA Technical Reports Server (NTRS)
Mulligan, P. J.; Gervin, J. C.; Lu, Y. C.
1985-01-01
An area bordering the Eastern Shore of the Chesapeake Bay was selected for study and classified using unsupervised techniques applied to LANDSAT-2 MSS data and several band combinations of LANDSAT-4 TM data. The accuracies of these Level I land cover classifications were verified using the Taylor's Island USGS 7.5 minute topographic map, which was photointerpreted, digitized and rasterized. For the Taylor's Island map, comparing the MSS and TM three-band (2 3 4) classifications, the increased resolution of TM produced a small improvement in overall accuracy of 1% correct, due primarily to small improvements, of 1% and 3%, in areas such as water and woodland. This was expected, as the MSS data typically produce high accuracies for categories which cover large contiguous areas. However, in the categories covering smaller areas within the map there was generally an improvement of at least 10%. Classification of the important residential category improved 12%, and wetlands were mapped with 11% greater accuracy.
NASA Astrophysics Data System (ADS)
Roychowdhury, K.
2016-06-01
Landcover is the most easily detectable indicator of human interventions on land. Urban and peri-urban areas present a complex combination of landcover, which makes classification challenging. This paper assesses the different methods of classifying landcover using dual polarimetric Sentinel-1 data collected during monsoon (July) and winter (December) months of 2015. Four broad landcover classes (built-up areas, water bodies and wetlands, vegetation, and open spaces) of Kolkata and its surrounding regions were identified. Polarimetric analyses were conducted on Single Look Complex (SLC) data of the region, while ground range detected (GRD) data were used for spectral and spatial classification. Unsupervised classification by means of K-Means clustering used backscatter values and was able to identify homogeneous landcovers over the study area. The results produced an overall accuracy of less than 50% for both seasons. Higher classification accuracy (around 70%) was achieved by adding texture variables as inputs along with the backscatter values. However, the accuracy of classification increased significantly with polarimetric analyses. The overall accuracy was around 80% in Wishart H-A-Alpha unsupervised classification. The method was useful in identifying urban areas, due to their double-bounce scattering, and vegetated areas, which have more random scattering. Normalized Difference Built-up Index (NDBI) and Normalized Difference Vegetation Index (NDVI) obtained from Landsat 8 data over the study area were used to verify vegetation and urban classes. The study compares the accuracies of different methods of classifying landcover using medium resolution SAR data in a complex urban area and suggests that polarimetric analyses present the most accurate results for urban and suburban areas.
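The two indices used for verification here are both normalized band differences. The sketch below uses the commonly cited definitions, NDVI = (NIR − Red) / (NIR + Red) and NDBI = (SWIR − NIR) / (SWIR + NIR); the reflectance values are invented for illustration and the abstract does not state how the indices were thresholded.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when both inputs are zero."""
    if a + b == 0:
        return 0.0
    return (a - b) / (a + b)


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Normalized Difference Built-up Index."""
    return normalized_difference(swir, nir)


# Hypothetical surface reflectances for two pixels.
vegetated = {"red": 0.10, "nir": 0.50, "swir": 0.20}
built_up = {"red": 0.20, "nir": 0.25, "swir": 0.35}

for name, px in (("vegetated", vegetated), ("built-up", built_up)):
    print(name,
          round(ndvi(px["nir"], px["red"]), 3),
          round(ndbi(px["swir"], px["nir"]), 3))
```

In this made-up example the vegetated pixel has high NDVI and negative NDBI, while the built-up pixel shows the opposite pattern, which is the contrast such indices are meant to capture.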
Gradus, Jaimie L; King, Matthew W; Galatzer-Levy, Isaac; Street, Amy E
2017-08-01
Suicide rates among recent veterans have led to interest in risk identification. Evidence of gender- and trauma-specific predictors of suicidal ideation necessitates the use of advanced computational methods capable of elucidating these important and complex associations. In this study, we used machine learning to examine gender-specific associations between predeployment and military factors, traumatic deployment experiences, and psychopathology and suicidal ideation (SI) in a national sample of veterans deployed during the Iraq and Afghanistan conflicts (n = 2,244). Classification, regression tree analyses, and random forests were used to identify associations with SI and determine their classification accuracy. Findings converged on several associations for men that included depression, posttraumatic stress disorder (PTSD), and somatic complaints. Sexual harassment during deployment emerged as a key factor that interacted with PTSD and depression and demonstrated a stronger association with SI among women. Classification accuracy for SI presence or absence was good based on the receiver operating characteristic area under the curve (men = .91, women = .92). The risk for SI was classifiable with good accuracy, with associations that varied by gender. The use of machine learning analyses allowed for the discovery of rich, nuanced results that should be replicated in other samples and may eventually be a basis for the development of gender-specific actuarial tools to assess SI risk among veterans. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
NASA Astrophysics Data System (ADS)
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-12-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value.
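The step of assembling weak classifiers into a strong one can be illustrated with the weighting and voting rule of textbook binary AdaBoost. This is only a sketch under simplifying assumptions: the weak classifiers' predictions are given as fixed inputs rather than being BP networks retrained on reweighted data, labels are +1/-1 rather than multi-class, and nothing here reproduces the paper's MapReduce design.

```python
import math


def adaboost_alphas(weak_preds, y):
    """Compute one vote weight (alpha) per weak classifier.

    weak_preds: list of prediction lists (+1/-1), in boosting order.
    y: true labels (+1/-1).
    """
    n = len(y)
    w = [1.0 / n] * n  # start with uniform sample weights
    alphas = []
    for preds in weak_preds:
        # Weighted error of this weak classifier under the current weights.
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        alphas.append(alpha)
        # Raise weights of misclassified samples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return alphas


def strong_predict(alphas, votes):
    """Sign of the alpha-weighted sum of the weak classifiers' votes."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1


# Hypothetical labels and three weak classifiers, each wrong on one sample.
y = [1, 1, -1, -1]
weak_preds = [
    [1, 1, 1, -1],
    [1, -1, -1, -1],
    [-1, 1, -1, -1],
]
alphas = adaboost_alphas(weak_preds, y)
combined = [strong_predict(alphas, [h[i] for h in weak_preds])
            for i in range(len(y))]
print(alphas, combined)
```

In this toy case each weak classifier misclassifies a different sample, yet the weighted vote classifies all four correctly, which is the property that motivates combining many weak learners.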
A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification
Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun
2016-01-01
Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520
Pang, Shuchao; Yu, Zhezhou; Orgun, Mehmet A
2017-03-01
Highly accurate classification of biomedical images is an essential task in the clinical diagnosis of numerous medical diseases identified from those images. Traditional image classification methods combined with hand-crafted image feature descriptors and various classifiers are not able to effectively improve the accuracy rate and meet the high requirements of classification of biomedical images. The same also holds true for artificial neural network models directly trained with limited biomedical images used as training data or directly used as a black box to extract the deep features based on another distant dataset. In this study, we propose a highly reliable and accurate end-to-end classifier for all kinds of biomedical images via deep learning and transfer learning. We first apply domain transferred deep convolutional neural network for building a deep model; and then develop an overall deep learning architecture based on the raw pixels of original biomedical images using supervised training. In our model, we do not need the manual design of the feature space, seek an effective feature vector classifier or segment specific detection object and image patches, which are the main technological difficulties in the adoption of traditional image classification methods. Moreover, we do not need to be concerned with whether there are large training sets of annotated biomedical images, affordable parallel computing resources featuring GPUs or long times to wait for training a perfect deep model, which are the main problems to train deep neural networks for biomedical image classification as observed in recent works. With the utilization of a simple data augmentation method and fast convergence speed, our algorithm can achieve the best accuracy rate and outstanding classification ability for biomedical images. We have evaluated our classifier on several well-known public biomedical datasets and compared it with several state-of-the-art approaches. 
We propose a robust automated end-to-end classifier for biomedical images based on a domain transferred deep convolutional neural network model that shows a highly reliable and accurate performance which has been confirmed on several public biomedical image datasets. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf
2018-05-01
Dysmorphic syndromes have different facial malformations. These malformations are significant for the early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define certain features of each syndrome by considering facial malformations and automatically classify Fragile X, Hurler, Prader Willi, Down, and Wolf Hirschhorn syndromes and healthy groups. Reference points are marked on the face images, and ratios between the distances separating these points are used as features. We suggest a neural network based hierarchical decision tree structure to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with the k-NN, ANN and hierarchical decision tree methods, respectively. The same images were then shown to a clinical expert, who achieved a recognition rate of 46.7%. We develop an efficient system that recognizes different syndrome types automatically and with high accuracy from simple, non-invasive imaging data, independent of the patient's age, sex and race. The promising results indicate that our method can be used by clinical experts for pre-diagnosis of dysmorphic syndromes.
Common component classification: what can we learn from machine learning?
Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S
2011-05-15
Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. 
However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis. PMID:25076868
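Of the four integrative approaches listed in this abstract, prediction averaging and majority voting are simple enough to sketch in a few lines. The example below is an illustration only: the three per-modality probabilities, the 0.5 threshold and the function names are assumptions, and it does not reproduce the authors' support vector machine pipeline.

```python
from collections import Counter


def average_predictions(probabilities, threshold=0.5):
    """Average per-modality probabilities, then threshold the mean."""
    mean = sum(probabilities) / len(probabilities)
    return 1 if mean >= threshold else 0


def majority_vote(labels):
    """Return the label predicted by most single-modality classifiers."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical outputs of three single-modality classifiers
# (structural, functional, diffusion) for one subject:
# probability that the subject belongs to the patient group.
probs = [0.90, 0.40, 0.45]
labels = [1 if p >= 0.5 else 0 for p in probs]

print(average_predictions(probs))  # mean is about 0.583, so class 1
print(majority_vote(labels))       # two of three vote 0, so class 0
```

The two rules disagree here because averaging lets one very confident modality outweigh two weakly negative ones, whereas voting discards confidence. With an even number of classifiers a tie-breaking rule would also be needed.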
NASA Technical Reports Server (NTRS)
Fagan, Matthew E.; Defries, Ruth S.; Sesnie, Steven E.; Arroyo-Mora, J. Pablo; Soto, Carlomagno; Singh, Aditya; Townsend, Philip A.; Chazdon, Robin L.
2015-01-01
An efficient means to map tree plantations is needed to detect tropical land use change and evaluate reforestation projects. To analyze recent tree plantation expansion in northeastern Costa Rica, we examined the potential of combining moderate-resolution hyperspectral imagery (2005 HyMap mosaic) with multitemporal, multispectral data (Landsat) to accurately classify (1) general forest types and (2) tree plantations by species composition. Following a linear discriminant analysis to reduce data dimensionality, we compared four Random Forest classification models: hyperspectral data (HD) alone; HD plus interannual spectral metrics; HD plus a multitemporal forest regrowth classification; and all three models combined. The fourth, combined model achieved overall accuracy of 88.5%. Adding multitemporal data significantly improved classification accuracy (p < 0.0001) of all forest types, although the effect on tree plantation accuracy was modest. The hyperspectral data alone classified six species of tree plantations with 75% to 93% producer's accuracy; adding multitemporal spectral data increased accuracy only for two species with dense canopies. Non-native tree species had higher classification accuracy overall and made up the majority of tree plantations in this landscape. Our results indicate that combining occasionally acquired hyperspectral data with widely available multitemporal satellite imagery enhances mapping and monitoring of reforestation in tropical landscapes.
NASA Technical Reports Server (NTRS)
Spann, G. W.; Faust, N. L.
1974-01-01
It is known from several previous investigations that many categories of land use can be mapped via computer processing of Earth Resources Technology Satellite data. The results of one such experiment, using the USGS/NASA land-use classification system, are presented. Douglas County, Georgia, was chosen as the test site for this project, primarily because of its recent rapid growth and future growth potential. Results of the investigation indicate an overall land-use mapping accuracy of 67%, with higher accuracies in rural areas and lower accuracies in urban areas. It is estimated, however, that 95% of the State of Georgia could be mapped by these techniques with an accuracy of 80% to 90%.
NASA Astrophysics Data System (ADS)
Park, M.; Stenstrom, M. K.
2004-12-01
Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. 
The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI improved the overall accuracy (72%) and kappa coefficient (65%) only slightly. Including the Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not as useful as locational ancillary data.
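The overall accuracy and kappa coefficient reported above are both derived from a confusion matrix. The following is a minimal sketch of that calculation, assuming the standard Cohen's kappa definition; the function name and the 3-class matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 3-class land-use confusion matrix (illustration only).
example = [
    [50, 5, 5],
    [4, 30, 6],
    [6, 4, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
print(f"overall accuracy = {accuracy:.3f}, kappa = {kappa:.3f}")
```

Kappa is lower than overall accuracy because it discounts the agreement expected by chance, which is why the two figures in the abstract differ by several points.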
NASA Astrophysics Data System (ADS)
Müller-Putz, Gernot R.; Scherer, Reinhold; Brauneis, Christian; Pfurtscheller, Gert
2005-12-01
Brain-computer interfaces (BCIs) can be realized on the basis of steady-state evoked potentials (SSEPs). These types of brain signals resulting from repetitive stimulation have the same fundamental frequency as the stimulation but also include higher harmonics. This study investigated how the classification accuracy of a 4-class BCI system can be improved by incorporating visually evoked harmonic oscillations. The current study revealed that the use of three SSVEP harmonics yielded a significantly higher classification accuracy than was the case for one or two harmonics. During feedback experiments, the five subjects investigated reached a classification accuracy between 42.5% and 94.4%.
Determination of sex from the patella in a contemporary Spanish population.
Peckmann, Tanya R; Meek, Susan; Dilkie, Natasha; Rozendaal, Andrew
2016-11-01
The skull and pelvis have been used for the determination of sex for unknown human remains. However, in forensic cases where skeletal remains often exhibit postmortem damage and taphonomic changes the patella may be used for the determination of sex as it is a preservationally favoured bone. The goal of the present research was to derive discriminant function equations from the patella for estimation of sex from a contemporary Spanish population. Six parameters were measured on 106 individuals (55 males and 51 females), ranging in age from 22 to 85 years old, from the Granada Osteological Collection. The statistical analyses showed that all variables were sexually dimorphic. Discriminant function score equations were generated for use in sex determination. The overall accuracy of sex classification ranged from 75.2% to 84.8% for the direct method and 75.5%-83.8% for the stepwise method. When the South African White discriminant functions were applied to the Spanish sample they showed high accuracy rates for sexing female patellae (90%-95.9%) and low accuracy rates for sexing male patellae (52.7%-58.2%). When the South African Black discriminant functions were applied to the Spanish sample they showed high accuracy rates for sexing male patellae (90.9%) and low accuracy rates for sexing female patellae (70%-75.5%). The patella was shown to be useful for sex determination in the contemporary Spanish population. Copyright © 2016 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Hwa, Hsiao-Lin; Lin, Chih-Peng; Huang, Tsun-Ying; Kuo, Po-Hsiu; Hsieh, Wei-Hsin; Lin, Chun-Yen; Yin, Hsiang-I; Tseng, Li-Hui; Lee, James Chun-I
2017-06-01
Ancestry informative single-nucleotide polymorphism (AISNP) panels for differentiating between East and Southeast Asian populations are scarce. This study aimed to identify AISNPs for ancestry assignment of five East and Southeast Asian populations, and Caucasians. We analyzed 145 autosomal SNPs of the 627 DNA samples from individuals of six populations (234 Taiwanese Han, 91 Filipinos, 79 Indonesians, 60 Thais, 71 Vietnamese, and 92 Caucasians) using arrays. The multiple logistic regression model and a multi-tier approach were used for ancestry classification. We observed that 130 AISNPs were effective for classifying the ethnic origins with fair accuracy. Among the 130 AISNPs, 122 were useful for stratification between these five Asian populations and 64 were effective for differentiating between Caucasians and these Asian populations. For differentiation between Caucasians and Asians, an accuracy rate of 100% was achieved in these 627 subjects with 50 optimal AISNPs among the 64 effective SNPs. For classification of the five Asian populations, the accuracy rates of ancestry inference using 20 to 57 SNPs for each of the two Asian populations ranged from 74.1% to 100%. Another 14 degraded DNA samples with incomplete profiling were analyzed, and the ancestry of 12 (85.7%) of those subjects was accurately assigned. We developed a 130-AISNP panel for ethnic origin differentiation between the five East and Southeast Asian populations and Caucasians. This AISNP set may be helpful for individual ancestral assignment of these populations in forensic casework.
Sequenced subjective accents for brain-computer interfaces
NASA Astrophysics Data System (ADS)
Vlek, R. J.; Schaefer, R. S.; Gielen, C. C. A. M.; Farquhar, J. D. R.; Desain, P.
2011-06-01
Subjective accenting is a cognitive process in which identical auditory pulses at an isochronous rate turn into the percept of an accenting pattern. This process can be voluntarily controlled, making it a candidate for communication from human user to machine in a brain-computer interface (BCI) system. In this study we investigated whether subjective accenting is a feasible paradigm for BCI and how its time-structured nature can be exploited for optimal decoding from non-invasive EEG data. Ten subjects perceived and imagined different metric patterns (two-, three- and four-beat) superimposed on a steady metronome. With an offline classification paradigm, we classified imagined accented from non-accented beats on a single trial (0.5 s) level with an average accuracy of 60.4% over all subjects. We show that decoding of imagined accents is also possible with a classifier trained on perception data. Cyclic patterns of accents and non-accents were successfully decoded with a sequence classification algorithm. Classification performances were compared by means of bit rate. Performance in the best scenario translates into an average bit rate of 4.4 bits min⁻¹ over subjects, which makes subjective accenting a promising paradigm for an online auditory BCI.
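The abstract compares classifiers by bit rate but does not say which definition was used. As an illustration only, the sketch below assumes the widely used Wolpaw information transfer rate, which converts a classification accuracy and a trial duration into bits per minute. The function names are hypothetical, and a single averaged accuracy need not reproduce the per-subject average reported above.

```python
import math


def _plog2(p, q):
    """Return p * log2(q), taking the limit value 0 when p is 0."""
    return 0.0 if p == 0.0 else p * math.log2(q)


def wolpaw_bits_per_trial(accuracy, n_classes):
    """Wolpaw information transfer per trial for an n-class decision."""
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    p = accuracy
    wrong = 1.0 - p
    return (
        math.log2(n_classes)
        + _plog2(p, p)
        # Errors are assumed to be spread evenly over the other classes.
        + (0.0 if wrong == 0.0 else _plog2(wrong, wrong / (n_classes - 1)))
    )


def bits_per_minute(accuracy, n_classes, trial_seconds):
    """Scale the per-trial information by the number of trials per minute."""
    return wolpaw_bits_per_trial(accuracy, n_classes) * 60.0 / trial_seconds


# Binary accent/non-accent decision on 0.5 s trials, as in the abstract.
print(bits_per_minute(0.604, n_classes=2, trial_seconds=0.5))
```

Because the per-trial information grows slowly near chance level, short trials are what make a modest single-trial accuracy usable as a communication channel.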
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections.
Zhu, Xiangbin; Qiu, Huiling
2016-01-01
Human activity recognition (HAR) from temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existing methods is not sufficient for some applications, especially healthcare services. In order to improve accuracy, it is necessary to develop a novel method that takes full account of the intrinsic sequential characteristics of time-series sensory data. Moreover, each human activity may have correlated feature relationships at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method contains coarse, fine and accurate classification. Feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activity features from the sensor data than locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly fewer features, and the overall accuracy is clearly improved. PMID:27893761
NASA Technical Reports Server (NTRS)
Sadowski, F. E.; Sarno, J. E.
1976-01-01
First, an analysis of forest feature signatures was used to help explain the large variation in classification accuracy that can occur among individual forest features for any one case of spatial resolution and the inconsistent changes in classification accuracy that were demonstrated among features as spatial resolution was degraded. Second, the classification rejection threshold was varied in an effort to reduce the large proportion of unclassified resolution elements that previously appeared in the processing of coarse resolution data when a constant rejection threshold was used for all cases of spatial resolution. For the signature analysis, two-channel ellipse plots showing the feature signature distributions for several cases of spatial resolution indicated that the capability of signatures to correctly identify their respective features is dependent on the amount of statistical overlap among signatures. Reductions in signature variance that occur in data of degraded spatial resolution may not necessarily decrease the amount of statistical overlap among signatures having large variance and small mean separations. Features classified by such signatures may thus continue to have similar amounts of misclassified elements in coarser resolution data, and thus, not necessarily improve in classification accuracy.
Wang, Huiya; Feng, Jun; Wang, Hongyu
2017-07-20
Detection of clustered microcalcification (MC) from mammograms plays an essential role in computer-aided diagnosis of early-stage breast cancer. To tackle problems associated with the diversity of data structures of MC lesions and the variability of normal breast tissues, multi-pattern sample space learning is required. In this paper, a novel grouped fuzzy Support Vector Machine (SVM) algorithm with sample space partition based on Expectation-Maximization (EM) (called G-FSVM) is proposed for clustered MC detection. The diversified pattern of training data is partitioned into several groups based on the EM algorithm. Then a series of fuzzy SVMs are integrated for classification with each group of samples from the MC lesions and normal breast tissues. From the DDSM database, a total of 1,064 suspicious regions are selected from 239 mammograms, and the measured Accuracy, True Positive Rate (TPR), False Positive Rate (FPR) and EVL = TPR* 1-FPR are 0.82, 0.78, 0.14 and 0.72, respectively. The proposed method incorporates the merits of fuzzy SVM and multi-pattern sample space learning, decomposing the MC detection problem into a series of simple two-class classifications. Experimental results from synthetic data and the DDSM database demonstrate that our integrated classification framework reduces the false positive rate significantly while maintaining the true positive rate.
NASA Astrophysics Data System (ADS)
Fujita, Yusuke; Mitani, Yoshihiro; Hamamoto, Yoshihiko; Segawa, Makoto; Terai, Shuji; Sakaida, Isao
2017-03-01
Ultrasound imaging is a popular and non-invasive tool used in the diagnosis of liver disease. Cirrhosis is a chronic liver disease and it can advance to liver cancer. Early detection and appropriate treatment are crucial to prevent liver cancer. However, ultrasound image analysis is very challenging because of the low signal-to-noise ratio of ultrasound images. To achieve higher classification performance, the selection of training regions of interest (ROIs) is very important because it affects classification accuracy. The purpose of our study is cirrhosis detection with high accuracy using liver ultrasound images. In our previous works, training ROI selection by MILBoost and multiple-ROI classification based on the product rule were proposed to achieve high classification performance. In this article, we propose a self-training method to select training ROIs effectively. Experiments were performed to evaluate the effect of self-training, using manually selected ROIs and also automatically selected ROIs. Experimental results show that self-training for manually selected ROIs achieved higher classification performance than other approaches, including our conventional methods. Manual ROI definition and sample selection are important for improving classification accuracy in cirrhosis detection using ultrasound images.
The impact of OCR accuracy on automated cancer classification of pathology reports.
Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle
2012-01-01
To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human-amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to have minimal impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of free-text pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
Heart Rate Variability Dynamics for the Prognosis of Cardiovascular Risk
Ramirez-Villegas, Juan F.; Lam-Espinosa, Eric; Ramirez-Moreno, David F.; Calvo-Echeverry, Paulo C.; Agredo-Rodriguez, Wilfredo
2011-01-01
Statistical, spectral, multi-resolution and non-linear methods were applied to heart rate variability (HRV) series linked with classification schemes for the prognosis of cardiovascular risk. A total of 90 HRV records were analyzed: 45 from healthy subjects and 45 from cardiovascular risk patients. A total of 52 features from all the analysis methods were evaluated using standard two-sample Kolmogorov-Smirnov test (KS-test). The results of the statistical procedure provided input to multi-layer perceptron (MLP) neural networks, radial basis function (RBF) neural networks and support vector machines (SVM) for data classification. These schemes showed high performances with both training and test sets and many combinations of features (with a maximum accuracy of 96.67%). Additionally, there was a strong consideration for breathing frequency as a relevant feature in the HRV analysis. PMID:21386966
Differential diagnosis of neurodegenerative diseases using structural MRI data
Koikkalainen, Juha; Rhodius-Meester, Hanneke; Tolonen, Antti; Barkhof, Frederik; Tijms, Betty; Lemstra, Afina W.; Tong, Tong; Guerrero, Ricardo; Schuh, Andreas; Ledig, Christian; Rueckert, Daniel; Soininen, Hilkka; Remes, Anne M.; Waldemar, Gunhild; Hasselbalch, Steen; Mecocci, Patrizia; van der Flier, Wiesje; Lötjönen, Jyrki
2016-01-01
Different neurodegenerative diseases can cause memory disorders and other cognitive impairments. The early detection and the stratification of patients according to the underlying disease are essential for an efficient approach to this healthcare challenge. This emphasizes the importance of differential diagnostics. Most studies compare patients and controls, or Alzheimer's disease with one other type of dementia. Such a bilateral comparison does not resemble clinical practice, where a clinician is faced with a number of different possible types of dementia. Here we studied which features in structural magnetic resonance imaging (MRI) scans could best distinguish four types of dementia, Alzheimer's disease, frontotemporal dementia, vascular dementia, and dementia with Lewy bodies, and control subjects. We extracted an extensive set of features quantifying volumetric and morphometric characteristics from T1 images, and vascular characteristics from FLAIR images. Classification was performed using a multi-class classifier based on Disease State Index methodology. The classifier provided continuous probability indices for each disease to support clinical decision making. A dataset of 504 individuals was used for evaluation. The cross-validated classification accuracy was 70.6% and balanced accuracy was 69.1% for the five disease groups using only automatically determined MRI features. Vascular dementia patients could be detected with high sensitivity (96%) using features from FLAIR images. Controls (sensitivity 82%) and Alzheimer's disease patients (sensitivity 74%) could be accurately classified using T1-based features, whereas the most difficult group was dementia with Lewy bodies (sensitivity 32%). These results were notably better than the classification accuracies obtained with visual MRI ratings (accuracy 44.6%, balanced accuracy 51.6%). Different quantification methods provided complementary information, and consequently, the best results were obtained by utilizing several quantification methods. The results prove that automatic quantification methods and computerized decision support methods are feasible for clinical practice and provide comprehensive information that may help clinicians in making a diagnosis. PMID:27104138
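The abstract reports both accuracy and balanced accuracy, which diverge when class sizes are unequal. A minimal sketch of the distinction follows, assuming balanced accuracy is the unweighted mean of per-class sensitivities (the usual definition); the function name and the two-class counts are hypothetical and unrelated to the study's data.

```python
def accuracy_and_balanced_accuracy(confusion):
    """Return (accuracy, balanced accuracy) for a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Every class is assumed to have at least one sample.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    # Per-class sensitivity (recall): correct predictions over class size.
    recalls = [row[i] / sum(row) for i, row in enumerate(confusion)]
    return correct / total, sum(recalls) / len(recalls)


# Hypothetical imbalanced problem: 100 cases of class A, 10 of class B.
example = [
    [90, 10],
    [6, 4],
]
acc, bal_acc = accuracy_and_balanced_accuracy(example)
print(f"accuracy = {acc:.3f}, balanced accuracy = {bal_acc:.3f}")
```

In this toy case plain accuracy is dominated by the large class, while balanced accuracy exposes the poor sensitivity on the small class, which is the situation a multi-class dementia cohort with a hard minority group can produce.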
NASA Astrophysics Data System (ADS)
Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen
2017-03-01
In this paper, we investigate the effect of the error criterion used during the training phase of an artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected by Interstitial Lung Diseases (ILD). The mean square error (MSE) and cross-entropy (CE) criteria are chosen as the most popular choices in state-of-the-art implementations. The classification experiment is performed on six ILD patterns, viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy, from the MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using a Gabor filter. Two different neural networks are trained with the scaled conjugate gradient backpropagation algorithm, with MSE and CE respectively as the error criterion function for weight updates. Performance is evaluated in terms of the average accuracy of these classifiers using 4-fold cross-validation. Each network is trained five times for each fold with randomly initialized weight vectors, and accuracies are computed. A significant improvement in classification accuracy is observed when the ANN is trained using CE (67.27%) as the error function compared to MSE (63.60%). Moreover, the standard deviation of the classification accuracy for the network trained with the CE criterion (6.69) is lower than that for the network trained with the MSE criterion (10.32).
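To make the two error criteria concrete, the sketch below evaluates MSE and cross-entropy for a single six-class prediction against a one-hot target. It is an illustration under textbook definitions, not the authors' training code; the function names and the probability vector are hypothetical.

```python
import math


def mse_loss(probs, target_index):
    """Mean squared error between predicted probabilities and a one-hot target."""
    squared = [
        (p - (1.0 if i == target_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    ]
    return sum(squared) / len(squared)


def cross_entropy_loss(probs, target_index, eps=1e-12):
    """Cross-entropy for a one-hot target: minus log of the true-class probability."""
    return -math.log(max(probs[target_index], eps))


# Hypothetical six-class output that is confidently wrong (true class is 0).
confident_wrong = [0.01, 0.95, 0.01, 0.01, 0.01, 0.01]
print("MSE:", mse_loss(confident_wrong, 0))
print("CE: ", cross_entropy_loss(confident_wrong, 0))
```

For a probability vector and a one-hot target, MSE cannot exceed 2 divided by the number of classes, whereas cross-entropy grows without bound as the true-class probability approaches zero. That stronger penalty on confident mistakes is one commonly cited reason CE can train classifiers more effectively, although the abstract itself does not give an explanation.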
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land Use and Land Cover (LULC) information is important for observing and evaluating environmental change. LULC classification using remotely sensed data is a technique widely employed at global and local scales, particularly in urban areas, which have diverse land cover types. These are essential components of the urban terrain and ecosystem. At present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using high-resolution images. COSMO-SkyMed SAR data was fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for object-based land cover classification. This paper presents a comparison between object-based and pixel-based approaches in image fusion. For the per-pixel method, a support vector machine (SVM) was applied to the fused image based on Principal Component Analysis (PCA). Object-based classification was applied to the fused images to separate land cover classes using a nearest neighbor (NN) classifier. Finally, accuracy was assessed by comparing the land cover classifications generated from the fused image dataset and from the THAICHOTE image. The object-based classification of fused COSMO-SkyMed and THAICHOTE images demonstrated the best classification accuracies, well over 85%. As a result, object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.
Fusion of ECG and ABP signals based on wavelet transform for cardiac arrhythmias classification.
Arvanaghi, Roghayyeh; Daneshvar, Sabalan; Seyedarabi, Hadi; Goshvarpour, Atefeh
2017-11-01
Electrocardiogram (ECG) and Arterial Blood Pressure (ABP) signals each contain information about cardiac status. This information can be used for diagnosis and monitoring of diseases. The majority of previously proposed methods rely only on the ECG signal to classify heart rhythms. In this paper, ECG and ABP were used to classify five different types of heart rhythms. To this end, the two signals (ECG and ABP) were fused. These physiological signals were taken from the MIMIC PhysioNet database. The ECG and ABP signals were fused on the basis of the proposed Discrete Wavelet Transformation fusion technique. Then, some frequency features were extracted from the fused signal. To classify the different types of cardiac arrhythmias, these features were given to a multi-layer perceptron neural network. The best results were obtained with the proposed fusion algorithm: accuracy rates of 96.6%, 96.9%, 95.6% and 93.9% were achieved for two, three, four and five classes, respectively. In contrast, the maximum classification rate obtained for two classes on the basis of ECG features was 89%. Higher accuracy rates were thus acquired by using the proposed fusion technique. The results confirmed the importance of fusing features from different physiological signals to gain more accurate assessments. Copyright © 2017 Elsevier B.V. All rights reserved.
Use of the color trails test as an embedded measure of performance validity.
Henry, George K; Algina, James
2013-01-01
One hundred personal injury litigants and disability claimants referred for a forensic neuropsychological evaluation were administered both portions of the Color Trails Test (CTT) as part of a more comprehensive battery of standardized tests. Subjects who failed two or more free-standing tests of cognitive performance validity formed the Failed Performance Validity (FPV) group, while subjects who passed all free-standing performance validity measures were assigned to the Passed Performance Validity (PPV) group. A cutscore of ≥45 seconds to complete Color Trails 1 (CT1) was associated with a classification accuracy of 78%, good sensitivity (66%) and high specificity (90%), while a cutscore of ≥84 seconds to complete Color Trails 2 (CT2) was associated with a classification accuracy of 82%, good sensitivity (74%) and high specificity (90%). A CT1 cutscore of ≥58 seconds, and a CT2 cutscore ≥100 seconds was associated with 100% positive predictive power at base rates from 20 to 50%.
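The relationship between sensitivity, specificity, base rate, classification accuracy and positive predictive power can be sketched with Bayes' rule. The functions below are a hypothetical illustration, not the authors' analysis. The abstract does not state the group sizes, so the 50% base rate used to recover the reported accuracies is an assumption.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that a positive test result is a true positive (Bayes' rule)."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


def classification_accuracy(sensitivity, specificity, base_rate):
    """Overall proportion classified correctly at a given base rate."""
    return sensitivity * base_rate + specificity * (1.0 - base_rate)


# Figures quoted in the abstract for the CT1 and CT2 cutscores,
# evaluated at an assumed base rate of 50%.
for name, sens, spec in [("CT1", 0.66, 0.90), ("CT2", 0.74, 0.90)]:
    acc = classification_accuracy(sens, spec, 0.5)
    ppv = positive_predictive_value(sens, spec, 0.5)
    print(f"{name}: accuracy = {acc:.2f}, PPV = {ppv:.2f}")
```

With no false positives (specificity of 1.0) the positive predictive value equals 1.0 at any nonzero base rate, which is the situation described for the stricter cutscores.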
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background: Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused merely on either the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes in terms of classification accuracy and computation time. Methodology/Principal Findings: Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) Considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy at lower time cost. Conclusions/Significance: The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are the two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, with part of the principal components kept as features, is the better choice. PMID:21359184
NASA Astrophysics Data System (ADS)
Melville, Bethany; Lucieer, Arko; Aryal, Jagannath
2018-04-01
This paper presents a random forest classification approach for identifying and mapping three types of lowland native grassland communities found in the Tasmanian Midlands region. Due to the high conservation priority assigned to these communities, there has been an increasing need to identify appropriate datasets that can be used to derive accurate and frequently updateable maps of community extent. Therefore, this paper proposes a method employing repeat classification and statistical significance testing as a means of identifying the most appropriate dataset for mapping these communities. Two datasets were acquired and analysed; a Landsat ETM+ scene, and a WorldView-2 scene, both from 2010. Training and validation data were randomly subset using a k-fold (k = 50) approach from a pre-existing field dataset. Poa labillardierei, Themeda triandra and lowland native grassland complex communities were identified in addition to dry woodland and agriculture. For each subset of randomly allocated points, a random forest model was trained based on each dataset, and then used to classify the corresponding imagery. Validation was performed using the reciprocal points from the independent subset that had not been used to train the model. Final training and classification accuracies were reported as per class means for each satellite dataset. Analysis of Variance (ANOVA) was undertaken to determine whether classification accuracy differed between the two datasets, as well as between classifications. Results showed mean class accuracies between 54% and 87%. Class accuracy only differed significantly between datasets for the dry woodland and Themeda grassland classes, with the WorldView-2 dataset showing higher mean classification accuracies. 
The results of this study indicate that remote sensing is a viable method for the identification of lowland native grassland communities in the Tasmanian Midlands, and that repeat classification and statistical significance testing can be used to identify optimal datasets for vegetation community mapping.
Combining Passive Microwave Rain Rate Retrieval with Visible and Infrared Cloud Classification.
NASA Astrophysics Data System (ADS)
Miller, Shawn William
The relation between cloud type and rain rate has been investigated here from different approaches. Previous studies and intercomparisons have indicated that no single passive microwave rain rate algorithm is an optimal choice for all types of precipitating systems. Motivated by the upcoming Tropical Rainfall Measuring Mission (TRMM), an algorithm which combines visible and infrared cloud classification with passive microwave rain rate estimation was developed and analyzed in a preliminary manner using data from the Tropical Ocean Global Atmosphere-Coupled Ocean Atmosphere Response Experiment (TOGA-COARE). Overall correlation with radar rain rate measurements across five case studies showed substantial improvement in the combined algorithm approach when compared to the use of any single microwave algorithm. An automated neural network cloud classifier for use over both land and ocean was independently developed and tested on Advanced Very High Resolution Radiometer (AVHRR) data. The global classifier achieved strict accuracy for 82% of the test samples, while a more localized version achieved strict accuracy for 89% of its own test set. These numbers provide hope for the eventual development of a global automated cloud classifier for use throughout the tropics and the temperate zones. The localized classifier was used in conjunction with gridded 15-minute averaged radar rain rates at 8km resolution produced from the current operational network of National Weather Service (NWS) radars, to investigate the relation between cloud type and rain rate over three regions of the continental United States and adjacent waters. The results indicate a substantially lower amount of available moisture in the Front Range of the Rocky Mountains than in the Midwest or in the eastern Gulf of Mexico.
[Research on Rapid Discrimination of Edible Oil by ATR Infrared Spectroscopy].
Ma, Xiao; Yuan, Hong-fu; Song, Chun-feng; Hu, Ai-qin; Li, Xiao-yu; Zhao, Zhong; Li, Xiu-qin; Guo, Zhen; Zhu, Zhi-qiang
2015-07-01
A rapid discrimination method for edible oils, the KL-BP model, was proposed based on attenuated total reflectance infrared spectroscopy. The model extracts classification features from the source data by KL and reduces the data dimension at the same time. The neural network model is then constructed with the new data as its input. 84 edible oil samples, including sesame oil, corn oil, canola oil, blend oil, sunflower oil, peanut oil, olive oil, soybean oil and tea seed oil, were collected and their infrared spectra determined using an ATR FT-IR spectrometer. In order to compare method performance, a principal component analysis (PCA) direct-classification model, a KL direct-classification model, a PLS-DA model, a PCA-BP model and the KL-BP model are constructed in this paper. The results show that the recognition rates of PCA, PCA-BP, KL, PLS-DA and KL-BP are 59.1%, 68.2%, 77.3%, 77.3% and 90.9% for discriminating the 9 kinds of edible oils, respectively. KL extracts the eigenvectors that maximize the ratio of the distance between different classes to the distance within each class, so the method can obtain much more classification information than PCA. The BP neural network can effectively enhance classification ability and accuracy. Taking full advantage of KL in extracting more category information during dimension reduction and of the self-learning, adaptive and nonlinear features of the BP neural network, the KL-BP method has the best classification ability and recognition accuracy, and is of great importance for rapidly recognizing edible oil in practice.
Hazrati, Mehrnaz Kh; Erfanian, Abbas
2008-01-01
This paper presents a new EEG-based Brain-Computer Interface (BCI) for on-line control of the sequence of hand grasping and holding in a virtual reality environment. The goal of this research is to develop an interaction technique that will allow the BCI to be effective in real-world scenarios for hand grasp control. Moreover, for consistency of the man-machine interface, it is desirable for the intended movement to be what the subject imagines. For this purpose, we developed an on-line BCI based on the classification of EEG associated with imagination of the movement of hand grasping and with the resting state. A classifier based on a probabilistic neural network (PNN) was introduced for classifying the EEG. The PNN is a feedforward neural network that realizes the Bayes decision discriminant function by estimating the probability density function using mixtures of Gaussian kernels. Two types of classification schemes were considered here for on-line hand control: adaptive and static. In contrast to static classification, the adaptive classifier was continuously updated on-line during recording. The experimental evaluation on six subjects on different days demonstrated that by using the static scheme, a classification accuracy as high as the rate obtained by the adaptive scheme can be achieved. In the best case, average classification accuracies of 93.0% and 85.8% were obtained using the adaptive and static schemes, respectively. The results obtained from more than 1500 trials on six subjects showed that an interactive virtual reality environment can be used as an effective tool for subject training in BCI.
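The PNN decision rule described above can be illustrated with a minimal sketch. It assumes equal class priors and a single shared kernel width `sigma`; the feature vectors, the labels and the function name `pnn_classify` are hypothetical and are not taken from the paper.

```python
import math

def pnn_classify(x, training, sigma=1.0):
    """Return the label whose Gaussian-kernel density estimate at x is largest.

    training: dict mapping label -> list of feature vectors (lists of floats).
    sigma: kernel width (hypothetical value, chosen only for illustration).
    Equal class priors are assumed, so the Bayes rule reduces to comparing
    the per-class density estimates.
    """
    scores = {}
    for label, patterns in training.items():
        total = 0.0
        for p in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Average kernel response = Parzen density estimate up to a constant.
        scores[label] = total / len(patterns)
    return max(scores, key=scores.get)

# Toy two-class data standing in for EEG features (made up for the example).
training = {
    "rest": [[0.0, 0.1], [0.2, 0.0]],
    "grasp": [[1.0, 1.1], [0.9, 1.0]],
}
print(pnn_classify([0.1, 0.0], training))
```

An adaptive scheme in this setting would amount to appending newly labeled trials to `training` during recording, whereas a static scheme would leave it fixed.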
Estimation of different data compositions for early-season crop type classification.
Hao, Pengyu; Wu, Mingquan; Niu, Zheng; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are important inputs for crop yield estimation and production forecasting, as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear, as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from the four composition strategies in two study regions in Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer's accuracies (PAs) and user's accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies.
Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study.
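As a rough illustration of what an n-day composition does to a daily series, the sketch below collapses daily NDVI values into fixed-length windows. The abstract does not state which compositing rule was used; maximum-value compositing is assumed here because it is a common choice, and `composite_max` and the sample values are hypothetical.

```python
def composite_max(daily_ndvi, period):
    """Collapse a daily NDVI series into period-day composites.

    Each composite is the maximum valid value in its window (an assumed rule).
    None marks a missing observation; a window with no valid value stays None.
    """
    composites = []
    for start in range(0, len(daily_ndvi), period):
        window = [v for v in daily_ndvi[start:start + period] if v is not None]
        composites.append(max(window) if window else None)
    return composites

# Made-up daily values with gaps, composited over 2-day and 4-day windows.
daily = [0.1, None, 0.3, 0.2, None, None, 0.5, 0.4]
print(composite_max(daily, 2))
print(composite_max(daily, 4))
```

Longer windows fill more gaps but keep fewer time steps, which is the trade-off between missing pixels and informativeness that the abstract describes.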
Estimation of different data compositions for early-season crop type classification
Wu, Mingquan; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are important inputs for crop yield estimation and production forecasting, as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear, as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from the four composition strategies in two study regions in Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer’s accuracies (PAs) and user’s accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies.
Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study. PMID:29868265
van der Heijden, R T; Heijnen, J J; Hellinga, C; Romein, B; Luyben, K C
1994-01-05
Measurements provide the basis for process monitoring and control as well as for model development and validation. Systematic approaches to increase the accuracy and credibility of the empirical data set are therefore of great value. In (bio)chemical conversions, linear conservation relations such as the balance equations for charge, enthalpy, and/or chemical elements, can be employed to relate conversion rates. In a practical situation, some of these rates will be measured (in effect, be calculated directly from primary measurements of, e.g., concentrations and flow rates), while others can or cannot be calculated from the measured ones. When certain measured rates can also be calculated from other measured rates, the set of equations is redundant, and the accuracy and credibility of the measured rates can indeed be improved by, respectively, balancing and gross error diagnosis. The balanced conversion rates are more accurate, and form a consistent set of data, which is more suitable for further application (e.g., to calculate nonmeasured rates) than the raw measurements. Such an approach has drawn attention in previous studies. The current study deals mainly with the problem of mathematically classifying the conversion rates into balanceable and calculable rates, given the subset of measured rates. The significance of this problem is illustrated with some examples. It is shown that a simple matrix equation can be derived that contains the vector of measured conversion rates and the redundancy matrix R. Matrix R plays a predominant role in the classification problem. In supplementary articles, the significance of the redundancy matrix R for an improved gross error diagnosis approach will be shown. In addition, efficient equations have been derived to calculate the balanceable and/or calculable rates. The method is completely based on matrix algebra (principally different from the graph-theoretical approach), and it is easily implemented into a computer program.
(c) 1994 John Wiley & Sons, Inc.
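The matrix equation mentioned in this abstract can be sketched as follows. The abstract names only the redundancy matrix R and the vector of measured rates; the symbols E, E_m, E_c and the pseudo-inverse notation are assumptions introduced here for illustration, so this is one plausible reading and not a quotation of the paper's derivation.

```latex
% E: matrix of linear conservation relations, r: vector of conversion rates.
% Split r into measured rates r_m and non-measured rates r_c (assumed notation).
\begin{align}
  E\,r &= 0
  \;\Longrightarrow\;
  E_m r_m + E_c r_c = 0, \\
  % least-squares solution for the non-measured rates,
  % with E_c^{\#} the pseudo-inverse of E_c
  r_c &= -E_c^{\#} E_m r_m, \\
  % substituting back leaves a condition on the measured rates only
  R\,r_m &= 0, \qquad R = E_m - E_c E_c^{\#} E_m .
\end{align}
```

Under this reading, R is zero when the measured rates carry no redundancy, and the measured rates that appear in a nonzero relation of R are the candidates for balancing.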
Semi-supervised classification tool for DubaiSat-2 multispectral imagery
NASA Astrophysics Data System (ADS)
Al-Mansoori, Saeed
2015-10-01
This paper addresses a semi-supervised classification tool based on a pixel-based approach to multispectral satellite imagery. There are not many studies demonstrating such an algorithm for multispectral images, especially when the image consists of 4 bands (Red, Green, Blue and Near Infrared) as in DubaiSat-2 satellite images. The proposed approach utilizes both unsupervised and supervised classification schemes sequentially to identify four classes in the image, namely, water bodies, vegetation, land (developed and undeveloped areas) and paved areas (i.e. roads). The unsupervised classification concept is applied to identify two classes, water bodies and vegetation, based on a well-known index that uses the distinct wavelengths of visible and near-infrared sunlight absorbed and reflected by plants; this index is called the "Normalized Difference Vegetation Index (NDVI)". Afterward, the supervised classification is performed by selecting homogeneous training samples for roads and land areas. Here, a precise selection of training samples plays a vital role in the classification accuracy. Post classification is finally performed to enhance the classification accuracy, where the classified image is sieved, clumped and filtered before producing the final output. Overall, the supervised classification approach produced higher accuracy than the unsupervised method. This paper shows some current preliminary research results which point out the effectiveness of the proposed technique in a virtual perspective.
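The NDVI-based first stage can be sketched as a per-pixel threshold rule. NDVI is (NIR - Red) / (NIR + Red); the two thresholds below (`water_max`, `vegetation_min`) and the function names are hypothetical placeholders, since the abstract does not give the values used for DubaiSat-2 imagery.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    # Guard against division by zero for pixels with no signal in either band.
    return 0.0 if denom == 0 else (nir - red) / denom

def label_pixel(nir, red, water_max=0.0, vegetation_min=0.3):
    """Assign water / vegetation from NDVI; other pixels stay unlabeled.

    Both thresholds are illustrative assumptions, not values from the paper.
    Unlabeled pixels would go on to the supervised stage (roads, land).
    """
    value = ndvi(nir, red)
    if value < water_max:
        return "water"
    if value >= vegetation_min:
        return "vegetation"
    return "unlabeled"

# Made-up reflectance pairs (NIR, Red) for three pixels.
for nir, red in [(0.05, 0.10), (0.50, 0.10), (0.30, 0.25)]:
    print(label_pixel(nir, red))
```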
Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory
ERIC Educational Resources Information Center
Lee, Won-Chan
2010-01-01
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children
ERIC Educational Resources Information Center
Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.
2018-01-01
Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
A MUSIC-based method for SSVEP signal processing.
Chen, Kun; Liu, Quan; Ai, Qingsong; Zhou, Zude; Xie, Sheng Quan; Meng, Wei
2016-03-01
The research on brain computer interfaces (BCIs) has become a hotspot in recent years because it offers benefit to disabled people to communicate with the outside world. Steady state visual evoked potential (SSVEP)-based BCIs are more widely used because of higher signal to noise ratio and greater information transfer rate compared with other BCI techniques. In this paper, a multiple signal classification based method was proposed for multi-dimensional SSVEP feature extraction. 2-second data epochs from four electrodes achieved excellent accuracy rates including idle state detection. In some asynchronous mode experiments, the recognition accuracy reached up to 100%. The experimental results showed that the proposed method attained good frequency resolution. In most situations, the recognition accuracy was higher than canonical correlation analysis, which is a typical method for multi-channel SSVEP signal processing. Also, a virtual keyboard was successfully controlled by different subjects in an unshielded environment, which proved the feasibility of the proposed method for multi-dimensional SSVEP signal processing in practical applications.
Classification with spatio-temporal interpixel class dependency contexts
NASA Technical Reports Server (NTRS)
Jeon, Byeungwoo; Landgrebe, David A.
1992-01-01
A contextual classifier which can utilize both spatial and temporal interpixel dependency contexts is investigated. After spatial and temporal neighbors are defined, a general form of maximum a posteriori spatiotemporal contextual classifier is derived. This contextual classifier is simplified under several assumptions. Joint prior probabilities of the classes of each pixel and its spatial neighbors are modeled by the Gibbs random field. The classification is performed in a recursive manner to allow a computationally efficient contextual classification. Experimental results with bitemporal TM data show significant improvement of classification accuracy over noncontextual pixelwise classifiers. This spatiotemporal contextual classifier should find use in many applications of remote sensing, especially when the classification accuracy is important.
Wright, C.; Gallant, Alisa L.
2007-01-01
The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. 
The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.
Yamamoto, Hiroyuki; Yamamoto, Kyoko; Yoshida, Katsumi; Shindoh, Chiyohiko; Takeda, Kyoko; Monden, Masami; Izumo, Hiroko; Niinuma, Hiroyuki; Nishi, Yutaro; Niwa, Koichiro; Komatsu, Yasuhiro
2015-11-01
Chronic kidney disease (CKD) is a global public health issue, and strategies for its early detection and intervention are imperative. The latest Japanese CKD guideline recommends that patients without diabetes should be classified using the urine protein-to-creatinine ratio (PCR) instead of the urine albumin-to-creatinine ratio (ACR); however, no validation studies are available. This study aimed to validate the PCR-based CKD risk classification compared with the ACR-based classification and to explore more accurate classification methods. We analyzed two previously reported datasets that included diabetic and/or cardiovascular patients who were classified into early CKD stages. In total, 860 patients (131 diabetic patients and 729 cardiovascular patients, including 193 diabetic patients) were enrolled. We assessed the CKD risk classification of each patient according to the estimated glomerular filtration rate and the ACR-based or PCR-based classification. The use of the cut-off value recommended in the current guideline (PCR 0.15 g/g creatinine) resulted in risk misclassification rates of 26.0% and 16.6% for the two datasets. The misclassification was primarily caused by underestimation. Moderate to substantial agreement between each classification was achieved: Cohen's kappa, 0.56 (95% confidence interval, 0.45-0.69) and 0.72 (0.67-0.76) in each dataset, respectively. To improve the accuracy, we tested various candidate PCR cut-off values, showing that a PCR cut-off value of 0.08-0.10 g/g creatinine resulted in improvement in the misclassification rates and kappa values. Modification of the PCR cut-off value would improve its efficacy to identify high-risk populations who will benefit from early intervention.
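The agreement statistic reported above, Cohen's kappa, compares observed agreement between two classifications with the agreement expected by chance. The sketch below computes it from two label lists; the risk labels are invented for the example and are not data from the study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences.

    kappa = (p_o - p_e) / (1 - p_e), with p_o the observed agreement and
    p_e the chance agreement from the two marginal label distributions.
    Undefined (division by zero) if both raters use one identical label only.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical risk categories from an ACR-based and a PCR-based classification.
acr_based = ["low", "low", "high", "high"]
pcr_based = ["low", "high", "high", "high"]
print(cohens_kappa(acr_based, pcr_based))
```

Re-running such a comparison for each candidate PCR cut-off is one way to see how the cut-off choice moves kappa and the misclassification rate.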
Gastric precancerous diseases classification using CNN with a concise model.
Zhang, Xu; Hu, Weiling; Chen, Fei; Liu, Jiquan; Yang, Yuanhang; Wang, Liangjing; Duan, Huilong; Si, Jianmin
2017-01-01
Gastric precancerous diseases (GPD) may deteriorate into early gastric cancer if misdiagnosed, so it is important to help doctors recognize GPD accurately and quickly. In this paper, we realize the classification of 3-class GPD, namely, polyp, erosion, and ulcer using convolutional neural networks (CNN) with a concise model called the Gastric Precancerous Disease Network (GPDNet). GPDNet introduces fire modules from SqueezeNet to reduce the model size and parameters about 10 times while improving speed for quick classification. To maintain classification accuracy with fewer parameters, we propose an innovative method called iterative reinforced learning (IRL). After training GPDNet from scratch, we apply IRL to fine-tune the parameters whose values are close to 0, and then we take the modified model as a pretrained model for the next training. The result shows that IRL can improve the accuracy about 9% after 6 iterations. The final classification accuracy of our GPDNet was 88.90%, which is promising for clinical GPD recognition.
Convolutional neural network with transfer learning for rice type classification
NASA Astrophysics Data System (ADS)
Patel, Vaibhav Amit; Joshi, Manjunath V.
2018-04-01
Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine, which makes it faster and more accurate. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, using transfer learning, in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2-class model for the classification of broken and normal rice using 1600 training images. We observe that, despite having distinct rice images, our architecture, pretrained on ImageNet data, boosts classification accuracy significantly.
Modified DCTNet for audio signals classification
NASA Astrophysics Data System (ADS)
Xian, Yin; Pu, Yunchen; Gan, Zhe; Lu, Liang; Thompson, Andrew
2016-10-01
In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of adaptive DCTNet (A-DCTNet) for audio signals feature extraction. The A-DCTNet applies the idea of constant-Q transform, with its center frequencies of filterbanks geometrically spaced. The A-DCTNet is adaptive to different acoustic scales, and it can better capture low frequency acoustic information that is sensitive to human audio perception than features such as Mel-frequency spectral coefficients (MFSC). We use features extracted by the A-DCTNet as input for classifiers. Experimental results show that the A-DCTNet and Recurrent Neural Networks (RNN) achieve state-of-the-art performance in bird song classification rate, and improve artist identification accuracy in music data. They demonstrate A-DCTNet's applicability to signal processing problems.
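The geometric spacing of filterbank center frequencies that the constant-Q idea implies can be written down directly. This sketch only shows the spacing rule f_k = f_min * 2^(k / B); the function name and the parameter values are hypothetical, and it does not reproduce the A-DCTNet filters themselves.

```python
def constant_q_centers(f_min, bins_per_octave, n_bins):
    """Geometrically spaced center frequencies: f_k = f_min * 2**(k / B).

    f_min, bins_per_octave (B) and n_bins are illustrative parameters.
    Adjacent centers share a constant ratio, so low frequencies are
    sampled more densely (in Hz) than high frequencies.
    """
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]

centers = constant_q_centers(100.0, 2, 5)
print([round(f, 2) for f in centers])
```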
Xiao, Bo; Imel, Zac E.; Georgiou, Panayiotis G.; Atkins, David C.; Narayanan, Shrikanth S.
2015-01-01
The technology for evaluating patient-provider interactions in psychotherapy–observational coding–has not changed in 70 years. It is labor-intensive, error prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings was evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies. PMID:26630392
Monteiro-Soares, M; Martins-Mendes, D; Vaz-Carneiro, A; Sampaio, S; Dinis-Ribeiro, M
2014-10-01
We systematically review the available systems used to classify diabetic foot ulcers in order to synthesize their methodological qualitative issues and accuracy to predict lower extremity amputation, as this may represent a critical point in these patients' care. Two investigators searched the EBSCO, ISI, PubMed and SCOPUS databases, and independently selected studies published until May 2013 and reporting prognostic accuracy and/or reliability of specific systems for patients with diabetic foot ulcer in order to predict lower extremity amputation. We included 25 studies reporting a prevalence of lower extremity amputation between 6% and 78%. Eight different diabetic foot ulcer descriptions and seven prognostic stratification classification systems were addressed with a variable (1-9) number of factors included, especially peripheral arterial disease (n = 12) or infection at the ulcer site (n = 10) or ulcer depth (n = 10). The Meggitt-Wagner, S(AD)SAD and Texas University Classification systems were the most extensively validated, whereas ten classifications were derived or validated only once. Reliability was reported in a single study, and accuracy measures were reported in five studies with another eight allowing their calculation. Pooled accuracy ranged from 0.65 (for gangrene) to 0.74 (for infection). There are numerous classification systems for diabetic foot ulcer outcome prediction, but only a few studies evaluated their reliability or external validity. Studies rarely validated several systems simultaneously and only a few reported accuracy measures. Further studies assessing reliability and accuracy of the available systems and their composing variables are needed. Copyright © 2014 John Wiley & Sons, Ltd.
Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.
Subasi, Abdulhamit
2013-06-01
Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model is proposed that hybridizes particle swarm optimization (PSO) and SVM to improve EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on EMG signals to classify them as normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results clearly validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.
Assessing herbivore foraging behavior with GPS collars in a semiarid grassland.
Augustine, David J; Derner, Justin D
2013-03-15
Advances in global positioning system (GPS) technology have dramatically enhanced the ability to track and study distributions of free-ranging livestock. Understanding factors controlling the distribution of free-ranging livestock requires the ability to assess when and where they are foraging. For four years (2008-2011), we periodically collected GPS and activity sensor data together with direct observations of collared cattle grazing semiarid rangeland in eastern Colorado. From these data, we developed classification tree models that allowed us to discriminate between grazing and non-grazing activities. We evaluated: (1) which activity sensor measurements from the GPS collars were most valuable in predicting cattle foraging behavior, (2) the accuracy of binary (grazing, non-grazing) activity models vs. models with multiple activity categories (grazing, resting, traveling, mixed), and (3) the accuracy of models that are robust across years vs. models specific to a given year. A binary classification tree correctly removed 86.5% of the non-grazing locations, while correctly retaining 87.8% of the locations where the animal was grazing, for an overall misclassification rate of 12.9%. A classification tree that separated activity into four different categories yielded a greater misclassification rate of 16.0%. Distance travelled in a 5 minute interval and the proportion of the interval with the sensor indicating a head down position were the two most important variables predicting grazing activity. Fitting annual models of cattle foraging activity did not improve model accuracy compared to a single model based on all four years combined. This suggests that increased sample size was more valuable than accounting for interannual variation in foraging behavior associated with variation in forage production. 
Our models differ from previous assessments in semiarid rangeland of Israel and mesic pastures in the United States in terms of the value of different activity sensor measurements for identifying grazing activity, suggesting that the use of GPS collars to classify cattle grazing behavior will require calibrations specific to the environment and vegetation being studied.
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K-Nearest Neighbor (KNN) is a good classifier, but several studies report that its accuracy is still lower than that of other methods. One cause of this low accuracy is that each attribute has the same effect on the classification process, so less relevant attributes lead to misclassification of new data. In this research, we propose Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio: the Gain Ratio serves as a parameter for assessing the correlation between each attribute in the data, and also as the basis for weighting each attribute of the dataset. The accuracy of the results is compared with the accuracy of the original KNN method using 10-fold cross-validation on several datasets from the UCI Machine Learning Repository and the KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. In the tests, the proposed method increased the classification accuracy of KNN: the largest accuracy difference, 12.73%, was obtained on the hayes-roth dataset, and the smallest, 0.07%, on the abalone dataset. Averaged over all datasets, accuracy increased by 5.33%.
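The attribute-weighting idea in the abstract above can be illustrated with a minimal sketch. It assumes categorical attributes and a weighted mismatch distance; the function names, the toy data and the distance definition are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute takes a single value, so it carries no information
    return (entropy(labels) - conditional) / split_info

def weighted_knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k nearest rows under a weighted mismatch distance."""
    def distance(row):
        return sum(w for w, a, b in zip(weights, row, query) if a != b)
    nearest = sorted(range(len(train_X)), key=lambda i: distance(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Toy data (hypothetical): attribute 0 determines the class, attribute 1 is noise.
X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
y = ["pos", "pos", "neg", "neg"]
weights = [gain_ratio([row[j] for row in X], y) for j in range(2)]
prediction = weighted_knn_predict(X, y, ("a", "y"), weights, k=3)
```

In this toy case the informative attribute receives all of the weight, so the noisy attribute no longer influences which neighbors are considered close.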
NASA Technical Reports Server (NTRS)
Myint, Soe W.; Mesev, Victor; Quattrochi, Dale; Wentz, Elizabeth A.
2013-01-01
Remote sensing methods used to generate base maps to analyze the urban environment rely predominantly on digital sensor data from space-borne platforms. This is due in part to new sources of high spatial resolution data covering the globe, a variety of multispectral and multitemporal sources, sophisticated statistical and geospatial methods, and compatibility with GIS data sources and methods. The goal of this chapter is to review the four groups of classification methods for digital sensor data from space-borne platforms: per-pixel, sub-pixel, object-based (spatial-based), and geospatial methods. Per-pixel methods are widely used methods that classify pixels into distinct categories based solely on the spectral and ancillary information within that pixel. Their uses range from simple calculations of environmental indices (e.g., NDVI) to sophisticated expert systems that assign urban land covers. Researchers recognize, however, that even with the smallest pixel size the spectral information within a pixel is really a combination of multiple urban surfaces. Sub-pixel classification methods therefore aim to statistically quantify the mixture of surfaces to improve overall classification accuracy. While within-pixel variations exist, there is also significant evidence that groups of nearby pixels have similar spectral information and therefore belong to the same classification category. Object-oriented methods have emerged that group pixels prior to classification based on spectral similarity and spatial proximity. Classification accuracy using object-based methods shows significant success and promise for numerous urban applications. Like the object-oriented methods that recognize the importance of spatial proximity, geospatial methods for urban mapping also utilize neighboring pixels in the classification process. 
The primary difference though is that geostatistical methods (e.g., spatial autocorrelation methods) are utilized during both the pre- and post-classification steps. Within this chapter, each of the four approaches is described in terms of scale and accuracy classifying urban land use and urban land cover; and for its range of urban applications. We demonstrate the overview of four main classification groups in Figure 1 while Table 1 details the approaches with respect to classification requirements and procedures (e.g., reflectance conversion, steps before training sample selection, training samples, spatial approaches commonly used, classifiers, primary inputs for classification, output structures, number of output layers, and accuracy assessment). The chapter concludes with a brief summary of the methods reviewed and the challenges that remain in developing new classification methods for improving the efficiency and accuracy of mapping urban areas.
Gross, Douglas P; Zhang, Jing; Steenstra, Ivan; Barnsley, Susan; Haws, Calvin; Amell, Tyler; McIntosh, Greg; Cooper, Juliette; Zaiane, Osmar
2013-12-01
To develop a classification algorithm and accompanying computer-based clinical decision support tool to help categorize injured workers toward optimal rehabilitation interventions based on unique worker characteristics. Population-based historical cohort design. Data were extracted from a Canadian provincial workers' compensation database on all claimants undergoing work assessment between December 2009 and January 2011. Data were available on: (1) numerous personal, clinical, occupational, and social variables; (2) type of rehabilitation undertaken; and (3) outcomes following rehabilitation (receiving time loss benefits or undergoing repeat programs). Machine learning, concerned with the design of algorithms to discriminate between classes based on empirical data, was the foundation of our approach to build a classification system with multiple independent and dependent variables. The population included 8,611 unique claimants. Subjects were predominantly employed (85 %) males (64 %) with diagnoses of sprain/strain (44 %). Baseline clinician classification accuracy was high (ROC = 0.86) for selecting programs that lead to successful return-to-work. Classification performance for machine learning techniques outperformed the clinician baseline classification (ROC = 0.94). The final classifiers were multifactorial and included the variables: injury duration, occupation, job attachment status, work status, modified work availability, pain intensity rating, self-rated occupational disability, and 9 items from the SF-36 Health Survey. The use of machine learning classification techniques appears to have resulted in classification performance better than clinician decision-making. The final algorithm has been integrated into a computer-based clinical decision support tool that requires additional validation in a clinical sample.
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we achieve a few results about the prediction accuracies of ensemble methods for binary classification that are missed or misinterpreted in previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper and lower bounds accuracies by random individual classifiers and better algorithms need to be developed. PMID:21853162
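The claim that an ensemble can exceed 0.5 accuracy while every member is below 0.5 can be checked on a small constructed case. The sketch below assumes simple majority voting over three binary classifiers; the labels and predictions are hypothetical and chosen so that the correct votes of the members overlap favorably.

```python
def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

def majority_vote(*classifier_outputs):
    """Per-sample majority label across an odd number of binary classifiers."""
    return [1 if sum(votes) * 2 > len(votes) else 0
            for votes in zip(*classifier_outputs)]

truth = [1, 0, 1, 0, 1]
# Each hypothetical classifier is right on exactly 2 of the 5 samples.
clf_a = [1, 0, 0, 1, 0]  # correct on samples 0 and 1
clf_b = [0, 0, 1, 1, 0]  # correct on samples 1 and 2
clf_c = [1, 1, 1, 1, 0]  # correct on samples 0 and 2

individual = [accuracy(c, truth) for c in (clf_a, clf_b, clf_c)]
ensemble_accuracy = accuracy(majority_vote(clf_a, clf_b, clf_c), truth)
```

Here every member scores 0.4, yet the majority vote is right on three of five samples because each of those samples collects two correct votes.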
Moore, D F; Harwood, V J; Ferguson, D M; Lukasik, J; Hannah, P; Getrich, M; Brownell, M
2005-01-01
The accuracy of ribotyping and antibiotic resistance analysis (ARA) for prediction of sources of faecal bacterial pollution in an urban southern California watershed was determined using blinded proficiency samples. Antibiotic resistance patterns and HindIII ribotypes of Escherichia coli (n = 997), and antibiotic resistance patterns of Enterococcus spp. (n = 3657) were used to construct libraries from sewage samples and from faeces of seagulls, dogs, cats, horses and humans within the watershed. The three libraries were analysed to determine the accuracy of host source prediction. The internal accuracy of the libraries (average rate of correct classification, ARCC) with six source categories was 44% for E. coli ARA, 69% for E. coli ribotyping and 48% for Enterococcus ARA. Each library's predictive ability towards isolates that were not part of the library was determined using a blinded proficiency panel of 97 E. coli and 99 Enterococcus isolates. Twenty-eight per cent (by ARA) and 27% (by ribotyping) of the E. coli proficiency isolates were assigned to the correct source category. Sixteen per cent were assigned to the same source category by both methods, and 6% were assigned to the correct category. Addition of 2480 E. coli isolates to the ARA library did not improve the ARCC or proficiency accuracy. In contrast, 45% of Enterococcus proficiency isolates were correctly identified by ARA. None of the methods performed well enough on the proficiency panel to be judged ready for application to environmental samples. Most microbial source tracking (MST) studies published have demonstrated library accuracy solely by the internal ARCC measurement. Low rates of correct classification for E. coli proficiency isolates compared with the ARCCs of the libraries indicate that testing of bacteria from samples that are not represented in the library, such as blinded proficiency samples, is necessary to accurately measure predictive ability. 
The library-based MST methods used in this study may not be suited for determination of the source(s) of faecal pollution in large, urban watersheds.
Liu, Ju-Chi; Chou, Hung-Chyun; Chen, Chien-Hsiu; Lin, Yi-Tseng; Kuo, Chung-Hsien
2016-01-01
A highly efficient time-shift correlation algorithm was proposed to deal with the peak time uncertainty of the P300 evoked potential for a P300-based brain-computer interface (BCI). The time-shift correlation series data were collected as the input nodes of an artificial neural network (ANN), and the classification of four LED visual stimuli was selected as the output node. Two operating modes, fast-recognition mode (FM) and accuracy-recognition mode (AM), were realized. The proposed BCI system was implemented on an embedded system for commanding an adult-size humanoid robot, and its performance was evaluated by investigating the ground truth trajectories of the humanoid robot. When the humanoid robot walked in a spacious area, the FM was used to control the robot with a higher information transfer rate (ITR). When the robot walked in a crowded area, the AM was used for high accuracy of recognition to reduce the risk of collision. The experimental results showed that, in 100 trials, the accuracy rate of FM was 87.8% and the average ITR was 52.73 bits/min. In addition, the accuracy rate improved to 92% for the AM, and the average ITR decreased to 31.27 bits/min due to strict recognition constraints. PMID:27579033
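As a rough illustration of how accuracy feeds into an information transfer rate, the sketch below uses the widely cited Wolpaw formula for bits per selection. This is an assumption: the abstract does not state which ITR definition the authors used, and the function name is hypothetical.

```python
import math

def bits_per_selection(n_classes, accuracy):
    """Wolpaw information transfer per selection, in bits (assumes 0 < accuracy <= 1)."""
    if accuracy >= 1.0:
        return math.log2(n_classes)
    return (math.log2(n_classes)
            + accuracy * math.log2(accuracy)
            + (1 - accuracy) * math.log2((1 - accuracy) / (n_classes - 1)))

# Four LED stimuli, with the two accuracy rates reported in the abstract.
fm_bits = bits_per_selection(4, 0.878)
am_bits = bits_per_selection(4, 0.92)
```

Under this formula the AM carries more bits per selection than the FM, so a lower ITR in bits/min for the AM would have to come from fewer selections per minute.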
Landenburger, L.; Lawrence, R.L.; Podruzny, S.; Schwartz, C.C.
2008-01-01
Moderate resolution satellite imagery traditionally has been thought to be inadequate for mapping vegetation at the species level. This has made comprehensive mapping of regional distributions of sensitive species, such as whitebark pine, either impractical or extremely time consuming. We sought to determine whether using a combination of moderate resolution satellite imagery (Landsat Enhanced Thematic Mapper Plus), extensive stand data collected by land management agencies for other purposes, and modern statistical classification techniques (boosted classification trees) could result in successful mapping of whitebark pine. Overall classification accuracies exceeded 90%, with similar individual class accuracies. Accuracies on a localized basis varied based on elevation. Accuracies also varied among administrative units, although we were not able to determine whether these differences related to inherent spatial variations or differences in the quality of available reference data.
MicroRNA Expression Profile Selection for Cancer Staging Classification Using Backpropagation
NASA Astrophysics Data System (ADS)
Anjarwati; Wibowo, Adi; Adhy, Satriyo; Kusumaningrum, Retno
2018-05-01
Ovarian cancer, breast cancer, and lung cancer are deadly diseases that require serious treatment. These cancers are among the five most common causes of cancer-induced deaths, especially for women. The high mortality rate of cancer is caused by the lack of effective strategies for early detection: if the cancer is detected in its early stages, the survival rate of patients is 90%, whereas the survival rate is only 30% when the cancer is detected at the metastasis stage, that is, after cancer cells have spread from the primary site. MicroRNAs can be used as potential biomarkers for cancer because of their expression profiles in these cancers. In this paper, we propose feature selection of microRNA expression profiles for classifying cancer stages using a Backpropagation Neural Network. The cancer stages are classified into before metastasis and after metastasis. Several combinations of microRNA expression profiles from medical references are compared to find the best features for the classification. The accuracy and the mean square error are used as the basis for the comparison.
NASA Astrophysics Data System (ADS)
Sebatubun, M. M.; Haryawan, C.; Windarta, B.
2018-03-01
Lung cancer causes a higher mortality rate worldwide than any other cancer. This rate can be reduced if the symptoms and cancer cells are detected early. One technique used to detect lung cancer is the computed tomography (CT) scan. CT scan images were used in this study to identify a lesion characteristic called ground glass opacity (GGO), which is used to determine the level of malignancy of the lesion. Identifying GGO involved three phases: image cropping, feature extraction using grey level co-occurrence matrices (GLCM), and classification using a Naïve Bayes classifier. To improve the classification results, the most significant features were sought by feature selection using gain ratio evaluation. The results show that the most significant features could be identified with the feature selection method used in this research. The accuracy rate increased from 83.33% to 91.67%, the sensitivity from 82.35% to 94.11% and the specificity from 84.21% to 89.47%.
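The three reported measures are related through the confusion matrix. The sketch below shows the arithmetic with hypothetical counts (17 positive and 19 negative cases) that happen to reproduce the reported percentages; the actual test-set counts are not given in the abstract.

```python
def metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    total = tp + fn + tn + fp
    return (100 * (tp + tn) / total,   # accuracy: all correct decisions
            100 * tp / (tp + fn),      # sensitivity: positives found
            100 * tn / (tn + fp))      # specificity: negatives correctly rejected

# Hypothetical counts, assumed only because they match the reported figures.
before = metrics(tp=14, fn=3, tn=16, fp=3)
after = metrics(tp=16, fn=1, tn=17, fp=2)
```

With these assumed counts, the sensitivity after feature selection is 16/17, about 94.12%, which agrees with the reported 94.11% if that figure was truncated rather than rounded.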
NASA Technical Reports Server (NTRS)
Wrigley, R. C.; Acevedo, W.; Alexander, D.; Buis, J.; Card, D.
1984-01-01
An experiment of a factorial design was conducted to test the effects on classification accuracy of land cover types due to the improved spatial, spectral and radiometric characteristics of the Thematic Mapper (TM) in comparison to the Multispectral Scanner (MSS). High altitude aircraft scanner data from the Airborne Thematic Mapper instrument was acquired over central California in August, 1983 and used to simulate Thematic Mapper data as well as all combinations of the three characteristics for eight data sets in all. Results for the training sites (field center pixels) showed better classification accuracies for MSS spatial resolution, TM spectral bands and TM radiometry in order of importance.
NASA Astrophysics Data System (ADS)
Efremova, T. T.; Avrova, A. F.; Efremov, S. P.
2016-09-01
The approaches of multivariate statistics have been used for the numerical classification of morphogenetic types of moss litters in swampy spruce forests according to their physicochemical properties (the ash content, decomposition degree, bulk density, pH, mass, and thickness). Three clusters of moss litters (peat, peaty, and high-ash peaty) have been specified. The functions of classification for identification of new objects have been calculated and evaluated. The degree of decomposition and the ash content are the main classification parameters of litters, though all other characteristics are also statistically significant. The final prediction accuracy of the assignment of a litter to a particular cluster is 86%. Two leading factors participating in the clustering of litters have been determined. The first factor, the degree of transformation of plant remains (quality), specifies 49% of the total variance, and the second factor, the accumulation rate (quantity), specifies 26% of the total variance. The morphogenetic structure and physicochemical properties of the clusters of moss litters are characterized.
Canizo, Brenda V; Escudero, Leticia B; Pérez, María B; Pellerano, Roberto G; Wuilloud, Rodolfo G
2018-03-01
The feasibility of the application of chemometric techniques associated with multi-element analysis for the classification of grape seeds according to their provenance vineyard soil was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (ICP-MS) was used for the determination of twenty-nine elements (Ag, As, Ce, Co, Cs, Cu, Eu, Fe, Ga, Gd, La, Lu, Mn, Mo, Nb, Nd, Ni, Pr, Rb, Sm, Te, Ti, Tl, Tm, U, V, Y, Zn and Zr). Once the analytical data were collected, supervised pattern recognition techniques such as linear discriminant analysis (LDA), partial least square discriminant analysis (PLS-DA), k-nearest neighbors (k-NN), support vector machine (SVM) and Random Forest (RF) were applied to construct classification/discrimination rules. The results indicated that nonlinear methods, RF and SVM, perform best with up to 98% and 93% accuracy rate, respectively, and therefore are excellent tools for classification of grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Yan, Dan; Bai, Lianfa; Zhang, Yi; Han, Jing
2018-02-01
For the problems of missing details and performance of the colorization based on sparse representation, we propose a conceptual model framework for colorizing gray-scale images, and then a multi-sparse dictionary colorization algorithm based on the feature classification and detail enhancement (CEMDC) is proposed based on this framework. The algorithm can achieve a natural colorized effect for a gray-scale image, and it is consistent with the human vision. First, the algorithm establishes a multi-sparse dictionary classification colorization model. Then, to improve the accuracy rate of the classification, the corresponding local constraint algorithm is proposed. Finally, we propose a detail enhancement based on Laplacian Pyramid, which is effective in solving the problem of missing details and improving the speed of image colorization. In addition, the algorithm not only realizes the colorization of the visual gray-scale image, but also can be applied to the other areas, such as color transfer between color images, colorizing gray fusion images, and infrared images.
Multiple confidence estimates as indices of eyewitness memory.
Sauer, James D; Brewer, Neil; Weber, Nathan
2008-08-01
Eyewitness identification decisions are vulnerable to various influences on witnesses' decision criteria that contribute to false identifications of innocent suspects and failures to choose perpetrators. An alternative procedure using confidence estimates to assess the degree of match between novel and previously viewed faces was investigated. Classification algorithms were applied to participants' confidence data to determine when a confidence value or pattern of confidence values indicated a positive response. Experiment 1 compared confidence group classification accuracy with a binary decision control group's accuracy on a standard old-new face recognition task and found superior accuracy for the confidence group for target-absent trials but not for target-present trials. Experiment 2 used a face mini-lineup task and found reduced target-present accuracy offset by large gains in target-absent accuracy. Using a standard lineup paradigm, Experiments 3 and 4 also found improved classification accuracy for target-absent lineups and, with a more sophisticated algorithm, for target-present lineups. This demonstrates the accessibility of evidence for recognition memory decisions and points to a more sensitive index of memory quality than is afforded by binary decisions.
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
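SMOTE-style oversampling, as mentioned in the abstract above, creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. The following is a minimal sketch of that idea in plain Python; the function name, parameters and toy points are hypothetical, and it is not the implementation used in the paper.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating toward nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding the base itself)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        target = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, target)))
    return synthetic

# Toy minority class with two features (hypothetical values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.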
NASA Technical Reports Server (NTRS)
Stoner, E. R.; May, G. A.; Kalcic, M. T. (Principal Investigator)
1981-01-01
Sample segments of ground-verified land cover data collected in conjunction with the USDA/ESS June Enumerative Survey were merged with LANDSAT data and served as a focus for unsupervised spectral class development and accuracy assessment. Multitemporal data sets were created from single-date LANDSAT MSS acquisitions from a nominal scene covering an eleven-county area in north central Missouri. Classification accuracies for the four land cover types predominant in the test site showed significant improvement in going from unitemporal to multitemporal data sets. Transformed LANDSAT data sets did not significantly improve classification accuracies. Regression estimators yielded mixed results for different land covers. Misregistration of two LANDSAT data sets by as much as one half pixel did not significantly alter overall classification accuracies. Existing algorithms for scene-to-scene overlay proved adequate for multitemporal data analysis as long as statistical class development and accuracy assessment were restricted to field interior pixels.
TIM Barrel Protein Structure Classification Using Alignment Approach and Best Hit Strategy
NASA Astrophysics Data System (ADS)
Chu, Jia-Han; Lin, Chun Yuan; Chang, Cheng-Wen; Lee, Chihan; Yang, Yuh-Shyong; Tang, Chuan Yi
2007-11-01
The classification of protein structures is essential for their function determination in bioinformatics. According to the Structural Classification of Proteins (SCOP) database, it has been estimated that around 10% of all known enzymes have TIM barrel domains. With its high sequence variation and diverse functionalities, the TIM barrel protein has become an attractive target for protein engineering and for evolutionary studies. Hence, in this paper, an alignment approach with the best hit strategy is proposed to classify the TIM barrel protein structure at the superfamily and family levels in the SCOP. This approach is also used to classify at the class level in the Enzyme nomenclature (ENZYME) database. Two testing data sets, TIM40D and TIM95D, are used to evaluate this approach. The resulting classification has an overall prediction accuracy rate of 90.3% for the superfamily level in the SCOP, 89.5% for the family level in the SCOP and 70.1% for the class level in the ENZYME. These results demonstrate that the alignment approach with the best hit strategy is a simple and viable method for TIM barrel protein structure classification, even when only amino acid sequence information is available.
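A best hit strategy can be read as assigning a query the label of its highest-scoring alignment hit. The sketch below assumes the alignment scores have already been computed; the record layout, labels and scores are hypothetical and only illustrate the selection step.

```python
def best_hit(hits):
    """Return the label of the highest-scoring alignment hit, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda h: h["score"])["label"]

# Hypothetical alignment results for one query protein.
hits = [
    {"target": "protA", "label": "superfamily_1", "score": 212.0},
    {"target": "protB", "label": "superfamily_2", "score": 87.5},
    {"target": "protC", "label": "superfamily_1", "score": 154.0},
]
predicted = best_hit(hits)
```

Only the single top hit decides the label in this sketch; a voting scheme over several top hits would be a different strategy.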
Classification by Using Multispectral Point Cloud Data
NASA Astrophysics Data System (ADS)
Liao, C. T.; Huang, H. H.
2012-07-01
Remote sensing images are generally recorded in a two-dimensional format containing multispectral information. The semantic information is also clearly visualized, so ground features can be readily recognized and classified via supervised or unsupervised classification methods. Nevertheless, multispectral images depend heavily on light conditions, and their classification results lack three-dimensional semantic information. LiDAR, on the other hand, has become a main technology for acquiring high-accuracy point cloud data. Its advantages are a high data acquisition rate, independence from light conditions, and the direct production of three-dimensional coordinates. Compared with multispectral images, however, LiDAR lacks multispectral information, which remains a challenge for ground feature classification in massive point cloud data. Consequently, by combining the advantages of both LiDAR and multispectral images, point cloud data with three-dimensional coordinates and multispectral information can provide an integrated solution for point cloud classification. This research therefore acquires visible light and near infrared images via close range photogrammetry and matches the images automatically through a free online service to generate a multispectral point cloud. A three-dimensional affine coordinate transformation is then used to compare the data increment. Finally, given thresholds on height and color information are used for classification.
Do pre-trained deep learning models improve computer-aided classification of digital mammograms?
NASA Astrophysics Data System (ADS)
Aboutalib, Sarah S.; Mohamed, Aly A.; Zuley, Margarita L.; Berg, Wendie A.; Luo, Yahong; Wu, Shandong
2018-02-01
Digital mammography screening is an important exam for the early detection of breast cancer and reduction in mortality. False positives leading to high recall rates, however, result in unnecessary negative consequences to patients and health care systems. In order to better aid radiologists, computer-aided tools can be utilized to improve distinction between image classifications and thus potentially reduce false recalls. The emergence of deep learning has shown promising results in the area of biomedical imaging data analysis. This study aimed to investigate deep learning and transfer learning methods that can improve digital mammography classification performance. In particular, we evaluated the effect of pre-training deep learning models with other imaging datasets in order to boost classification performance on a digital mammography dataset. Two types of datasets were used for pre-training: (1) a digitized film mammography dataset, and (2) a very large non-medical imaging dataset. By using either of these datasets to pre-train the network initially, and then fine-tuning with the digital mammography dataset, we found an increase in overall classification performance in comparison to a model without pre-training, with the very large non-medical dataset performing the best in improving the classification accuracy.
Examining the Classification Accuracy of a Vocabulary Screening Measure with Preschool Children
ERIC Educational Resources Information Center
Marcotte, Amanda M.; Clemens, Nathan H.; Parker, Christopher; Whitcomb, Sara A.
2016-01-01
This study investigated the classification accuracy of the "Dynamic Indicators of Vocabulary Skills" (DIVS) as a preschool vocabulary screening measure. With a sample of 240 preschoolers, fall and winter DIVS scores were used to predict year-end vocabulary risk using the 25th percentile on the "Peabody Picture Vocabulary Test--Third…
ERIC Educational Resources Information Center
Zhang, Bo
2010-01-01
This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…
ERIC Educational Resources Information Center
Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.
2016-01-01
The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…
ERIC Educational Resources Information Center
Pena, Elizabeth D.; Gillam, Ronald B.; Malek, Melynn; Ruiz-Felter, Roxanna; Resendiz, Maria; Fiestas, Christine; Sabel, Tracy
2006-01-01
Two experiments examined reliability and classification accuracy of a narration-based dynamic assessment task. Purpose: The first experiment evaluated whether parallel results were obtained from stories created in response to 2 different wordless picture books. If so, the tasks and measures would be appropriate for assessing pretest and posttest…
The Potential Impact of Not Being Able to Create Parallel Tests on Expected Classification Accuracy
ERIC Educational Resources Information Center
Wyse, Adam E.
2011-01-01
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Emotion recognition from multichannel EEG signals using K-nearest neighbor classification.
Li, Mi; Xu, Hongpei; Liu, Xingwang; Lu, Shengfu
2018-04-27
Many studies have addressed emotion recognition based on multichannel electroencephalogram (EEG) signals. This paper explores how the frequency band and the number of channels influence the accuracy of EEG-based emotion recognition. We classified the emotional states in the valence and arousal dimensions using different combinations of EEG channels. First, the default preprocessed DEAP data were normalized. Next, the EEG signals were divided into four frequency bands using the discrete wavelet transform, and entropy and energy were calculated as features for a K-nearest neighbor classifier. The classification accuracies of the 10, 14, 18 and 32 EEG channels based on the gamma frequency band were 89.54%, 92.28%, 93.72% and 95.70% in the valence dimension and 89.81%, 92.24%, 93.69% and 95.69% in the arousal dimension. As the number of channels increases, the classification accuracy of emotional states also increases. The classification accuracy of the gamma frequency band is greater than that of the beta frequency band, followed by the alpha and theta frequency bands. This paper provides a reference for selecting frequency bands and channels for EEG-based emotion recognition.
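The K-nearest neighbor step described in this abstract can be illustrated with a minimal sketch. The feature values, the labels, and the function name `knn_predict` are invented for illustration; the abstract does not give the value of K or the distance metric, so Euclidean distance and k = 3 are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort training points by Euclidean distance to the query (assumed metric).
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (entropy, energy) feature pairs; not taken from the DEAP data.
train_x = [(0.20, 1.0), (0.30, 1.2), (0.25, 0.9),
           (0.80, 3.1), (0.90, 2.8), (0.85, 3.3)]
train_y = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(train_x, train_y, (0.82, 3.0), k=3)
```

In practice the features would be normalized first, as the abstract notes, so that no single feature dominates the distance.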
Zourmand, Alireza; Ting, Hua-Nong; Mirhassani, Seyed Mostafa
2013-03-01
Speech is one of the prevalent communication mediums for humans. Identifying the gender of a child speaker based on his/her speech is crucial in telecommunication and speech therapy. This article investigates the use of fundamental and formant frequencies from sustained vowel phonation to distinguish the gender of Malay children aged between 7 and 12 years. The Euclidean minimum distance and multilayer perceptron were used to classify the gender of 360 Malay children based on different combinations of fundamental and formant frequencies (F0, F1, F2, and F3). The Euclidean minimum distance with normalized frequency data achieved a classification accuracy of 79.44%, which was higher than that of the nonnormalized frequency data. Age-dependent modeling was used to improve the accuracy of gender classification. The Euclidean distance method obtained 84.17% based on the optimal classification accuracy for all age groups. The accuracy was further increased to 99.81% using multilayer perceptron based on mel-frequency cepstral coefficients. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
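A Euclidean minimum distance classifier of the kind named in this abstract assigns a sample to the class whose mean feature vector is closest. The sketch below assumes class centroids computed as plain means; the feature values and class names are invented, and the normalization step mentioned in the abstract is omitted for brevity.

```python
import math
from statistics import mean

def class_centroids(samples, labels):
    """Compute the mean feature vector of each class."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in groups.items()}

def classify(centroids, x):
    """Assign x to the class with the nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

# Hypothetical two-feature samples (for example, two frequency measures in Hz).
samples = [(235, 850), (240, 870), (265, 940), (270, 960)]
labels = ["class_a", "class_a", "class_b", "class_b"]

centroids = class_centroids(samples, labels)
result = classify(centroids, (262, 930))
```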
NASA Astrophysics Data System (ADS)
Du, Peijun; Tan, Kun; Xing, Xiaoshi
2010-12-01
Combining the Support Vector Machine (SVM) with wavelet analysis, we constructed a wavelet SVM (WSVM) classifier based on wavelet kernel functions in Reproducing Kernel Hilbert Space (RKHS). In conventional kernel theory, SVM faces the bottleneck of kernel parameter selection, which leads to time-consuming training and low classification accuracy. The wavelet kernel in RKHS is a kind of multidimensional wavelet function that can approximate arbitrary nonlinear functions. Implications for semiparametric estimation are proposed in this paper. An Airborne Operational Modular Imaging Spectrometer II (OMIS II) hyperspectral remote sensing image with 64 bands and Reflective Optics System Imaging Spectrometer (ROSIS) data with 115 bands were used to test the performance and accuracy of the proposed WSVM classifier. The experimental results indicate that the WSVM classifier obtains the highest accuracy when using the Coiflet kernel function in the wavelet transform. Compared with some traditional classifiers, including Spectral Angle Mapping (SAM) and Minimum Distance Classification (MDC), and with an SVM classifier using the Radial Basis Function kernel, the proposed wavelet SVM classifier using the wavelet kernel function in Reproducing Kernel Hilbert Space noticeably improves classification accuracy.
NASA Astrophysics Data System (ADS)
Hwang, Han-Jeong; Lim, Jeong-Hwan; Kim, Do-Won; Im, Chang-Hwan
2014-07-01
A number of recent studies have demonstrated that near-infrared spectroscopy (NIRS) is a promising neuroimaging modality for brain-computer interfaces (BCIs). So far, most NIRS-based BCI studies have focused on enhancing the accuracy of the classification of different mental tasks. In the present study, we evaluated the performances of a variety of mental task combinations in order to determine the mental task pairs that are best suited for customized NIRS-based BCIs. To this end, we recorded event-related hemodynamic responses while seven participants performed eight different mental tasks. Classification accuracies were then estimated for all possible pairs of the eight mental tasks (C(8,2) = 28). Based on this analysis, mental task combinations with relatively high classification accuracies frequently included the following three mental tasks: "mental multiplication," "mental rotation," and "right-hand motor imagery." Specifically, mental task combinations consisting of two of these three mental tasks showed the highest mean classification accuracies. It is expected that our results will be a useful reference to reduce the time needed for preliminary tests when discovering individual-specific mental task combinations.
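The count of 28 task pairs follows from choosing 2 of 8 tasks without regard to order. A short check, using placeholder task names because the abstract names only three of the eight tasks:

```python
from itertools import combinations
from math import comb

# Placeholder labels; the abstract does not list all eight mental tasks.
tasks = [f"task_{i}" for i in range(1, 9)]

# Every unordered pair of distinct tasks.
pairs = list(combinations(tasks, 2))
n_pairs = len(pairs)  # 8 * 7 / 2 = 28
```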
Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian; Huliraj, N; Revadi, S S
2017-06-08
Auscultation is a medical procedure used for the initial diagnosis and assessment of lung and heart diseases. From this perspective, we propose assessing the performance of the extreme learning machine (ELM) classifiers for the diagnosis of pulmonary pathology using breath sounds. Energy and entropy features were extracted from the breath sound using the wavelet packet transform. The statistical significance of the extracted features was evaluated by one-way analysis of variance (ANOVA). The extracted features were inputted into the ELM classifier. The maximum classification accuracies obtained for the conventional validation (CV) of the energy and entropy features were 97.36% and 98.37%, respectively, whereas the accuracies obtained for the cross validation (CRV) of the energy and entropy features were 96.80% and 97.91%, respectively. In addition, maximum classification accuracies of 98.25% and 99.25% were obtained for the CV and CRV of the ensemble features, respectively. The results indicate that the classification accuracy obtained with the ensemble features was higher than those obtained with the energy and entropy features.
Lohsiriwat, Varut; Prapasrivorakul, Siriluck; Lohsiriwat, Darin
2009-01-01
The purposes of this study were to determine clinical presentations and surgical outcomes of perforated peptic ulcer (PPU), and to evaluate the accuracy of the Boey scoring system in predicting mortality and morbidity. We carried out a retrospective study of patients undergoing emergency surgery for PPU between 2001 and 2006 in a university hospital. Clinical presentations and surgical outcomes were analyzed. Adjusted odds ratio (OR) of each Boey score on morbidity and mortality rate was compared with zero risk score. Receiver-operating characteristic curve analysis was used to compare the predictive ability between Boey score, American Society of Anesthesiologists (ASA) classification, and Mannheim Peritonitis Index (MPI). The study included 152 patients with average age of 52 years (range: 15-88 years), and 78% were male. The most common site of PPU was the prepyloric region (74%). Primary closure and omental graft was the most common procedure performed. Overall mortality rate was 9% and the complication rate was 30%. The mortality rate increased progressively with increasing numbers of the Boey score: 1%, 8% (OR=2.4), 33% (OR=3.5), and 38% (OR=7.7) for 0, 1, 2, and 3 scores, respectively (p<0.001). The morbidity rates for 0, 1, 2, and 3 Boey scores were 11%, 47% (OR=2.9), 75% (OR=4.3), and 77% (OR=4.9), respectively (p<0.001). Boey score and ASA classification appeared to be better than MPI for predicting the poor surgical outcomes. Perforated peptic ulcer is associated with high rates of mortality and morbidity. The Boey risk score serves as a simple and precise predictor for postoperative mortality and morbidity.
Nasir, Muhammad; Attique Khan, Muhammad; Sharif, Muhammad; Lali, Ikram Ullah; Saba, Tanzila; Iqbal, Tassawar
2018-02-21
Melanoma is the deadliest type of skin cancer, with the highest mortality rate. However, removal at an early stage implies a high survival rate; it therefore demands early diagnosis. The customary diagnostic methods are costly and cumbersome because they involve experienced experts and require a highly equipped environment. Recent advances in computerized solutions for these diagnoses are highly promising, with improved accuracy and efficiency. In this article, we propose a method for the classification of melanoma and benign skin lesions. Our approach integrates preprocessing, lesion segmentation, feature extraction, feature selection, and classification. Preprocessing consists of hair removal by DullRazor, whereas lesion texture and color information are utilized to enhance the lesion contrast. In lesion segmentation, a hybrid technique has been implemented and the results are fused using the additive law of probability. A serial-based method is applied subsequently that extracts and fuses traits such as color, texture, and HOG (shape). The fused features are then selected by a novel Boltzman Entropy method. Finally, the selected features are classified by a Support Vector Machine. The proposed method is evaluated on the publicly available data set PH2. Our approach has provided promising results of sensitivity 97.7%, specificity 96.7%, accuracy 97.5%, and F-score 97.5%, which are significantly better than the results of existing methods available on the same data set. The proposed method detects and classifies melanoma significantly better than existing methods. © 2018 Wiley Periodicals, Inc.
Impacts of land use/cover classification accuracy on regional climate simulations
NASA Astrophysics Data System (ADS)
Ge, Jianjun; Qi, Jiaguo; Lofgren, Brent M.; Moore, Nathan; Torbick, Nathan; Olson, Jennifer M.
2007-03-01
Land use/cover change has been recognized as a key component in global change. Various land cover data sets, including historically reconstructed, recently observed, and future projected, have been used in numerous climate modeling studies at regional to global scales. However, little attention has been paid to the effect of land cover classification accuracy on climate simulations, though accuracy assessment has become a routine procedure in the land cover production community. In this study, we analyzed the behavior of simulated precipitation in the Regional Atmospheric Modeling System (RAMS) over a range of simulated classification accuracies over a 3 month period. This study found that land cover accuracy under 80% had a strong effect on precipitation, especially when the land surface had a greater control of the atmosphere. This effect became stronger as the accuracy decreased. As shown in three follow-on experiments, the effect was further influenced by model parameterizations such as convection schemes and interior nudging, which can mitigate the strength of surface boundary forcings. In reality, land cover accuracy rarely attains the commonly recommended 85% target. Its effect on climate simulations should therefore be considered, especially when historically reconstructed and future projected land covers are employed.
Karzmark, Peter; Deutsch, Gayle K
2018-01-01
This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and a brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predictive power, and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and comprised the Halstead Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base-rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
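The accuracy statistics listed in this abstract are all derived from a 2x2 table of true/false positives and negatives. The sketch below uses the standard definitions; the counts are invented and do not come from the study, and the function name `diagnostic_stats` is hypothetical.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Standard 2x2 accuracy statistics for a binary classification."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive power
    npv = tn / (tn + fn)           # negative predictive power
    # Positive likelihood ratio; undefined when specificity equals 1.
    lr_pos = sensitivity / (1 - specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
    }

# Hypothetical counts for 100 cases (not data from the study).
stats = diagnostic_stats(tp=45, fp=10, fn=5, tn=40)
```

A battery with high sensitivity but low specificity, as reported for the brief battery, would show a positive likelihood ratio close to 1.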
NASA Astrophysics Data System (ADS)
Liu, Wanjun; Liang, Xuejian; Qu, Haicheng
2017-11-01
Hyperspectral image (HSI) classification is one of the most popular topics in the remote sensing community. Traditional and deep learning-based classification methods have been proposed continually in recent years. To improve classification accuracy and robustness, a dimensionality-varied convolutional neural network (DVCNN) is proposed in this paper. DVCNN is a novel deep architecture based on the convolutional neural network (CNN). The input of DVCNN is a set of 3D patches selected from the HSI, which contain joint spectral-spatial information. In the subsequent feature extraction process, each patch is transformed into several different 1D vectors by 3D convolution kernels, which are able to extract features from spectral-spatial data. The rest of DVCNN is about the same as a general CNN and processes a 2D matrix constituted by all of the 1D data. As a result, DVCNN can not only extract more accurate and richer features than a CNN but also fuse spectral-spatial information to improve classification accuracy. Moreover, the robustness of the network on water-absorption bands is enhanced in the process of spectral-spatial fusion by 3D convolution, and the calculation is simplified by dimensionality-varied convolution. Experiments were performed on both the Indian Pines and Pavia University scene datasets, and the results showed that the classification accuracy of DVCNN improved by 32.87% on Indian Pines and 19.63% on Pavia University scene compared with a spectral-only CNN. The maximum accuracy improvement achieved by DVCNN was 13.72% compared with other state-of-the-art HSI classification methods, and the robustness of DVCNN to noise in water-absorption bands was demonstrated.
Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun
2016-04-01
This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR), k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB), as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. Although the six substances considered have similar optical properties, their complex insertion loss in the THz part of the spectrum is significantly different because of differences in their frequency-dependent THz extinction coefficient as well as in their refractive index and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification based solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements related to samples with a thickness of 2 mm were classified, then samples with thicknesses of 4 mm and, after that, 3 mm were classified, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies.
The work establishes a general methodology for assessing the performance of other hyperspectral dataset classifiers on the basis of 2-D cross-correlations in far-infrared spectroscopy or other parts of the electromagnetic spectrum. It also advances the wider proliferation of automated THz imaging systems across new application areas e.g., biomedical imaging, industrial processing and quality control where interpretation of hyperspectral images is still under development. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
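The abstract above states that a cross-validation procedure was used but does not describe it. As one common possibility, a k-fold split can be sketched as follows; the fold count, the absence of shuffling, and the function name `k_fold_indices` are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    # Distribute the remainder over the first folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Example: 10 samples split into 3 folds (sizes 4, 3, 3).
folds = list(k_fold_indices(10, 3))
```

Each classifier would be trained on `train_idx` and scored on `test_idx`, and the per-fold accuracies then summarized, for example by their minimum, as in the "least accuracy" figures quoted above.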
Skin Lesion Analysis towards Melanoma Detection Using Deep Learning Network
2018-01-01
Skin lesions are a severe disease globally. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, the accurate recognition of melanoma is extremely challenging due to the following reasons: low contrast between lesions and skin, visual similarity between melanoma and non-melanoma lesions, etc. Hence, reliable automatic detection of skin tumors is very useful to increase the accuracy and efficiency of pathologists. In this paper, we proposed two deep learning methods to address three main tasks emerging in the area of skin lesion image processing, i.e., lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straight-forward CNN is proposed for the dermoscopic feature extraction task. The proposed deep learning frameworks were evaluated on the ISIC 2017 dataset. Experimental results show the promising accuracies of our frameworks, i.e., 0.753 for task 1, 0.848 for task 2 and 0.912 for task 3 were achieved. PMID:29439500
Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification
Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander
2013-01-01
Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
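The Tanimoto coefficient mentioned in this abstract can be illustrated in its simplest binary form, where each compound is represented by the set of features it possesses. Whether CBRA uses this binary form or a continuous variant is not stated in the abstract, so the set-based version below is an assumption, and the feature sets are invented.

```python
def tanimoto(a, b):
    """Binary Tanimoto coefficient: |A & B| / |A | B| for two feature sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty sets share no features; return 0.0 by convention here.
    return len(a & b) / union if union else 0.0

# Hypothetical feature indices for two compounds.
compound_x = {1, 2, 3, 4}
compound_y = {3, 4, 5}

similarity = tanimoto(compound_x, compound_y)  # 2 shared / 5 total = 0.4
```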
Classification of ECG beats using deep belief network and active learning.
G, Sayantan; T, Kien P; V, Kadambari K
2018-04-12
A new semi-supervised approach based on deep learning and active learning for classification of electrocardiogram (ECG) signals is proposed. The objective of the proposed work is to model a scientific method for classification of cardiac irregularities using electrocardiogram beats. The model follows the Association for the Advancement of Medical Instrumentation (AAMI) standards and consists of three phases. In phase I, the feature representation of the ECG is learnt using a Gaussian-Bernoulli deep belief network, followed by linear support vector machine (SVM) training in the subsequent phase. It yields three deep models which are based on AAMI-defined classes, namely N, V, S, and F. In the last phase, a query generator is introduced to interact with the expert, who labels a few beats to improve accuracy and sensitivity. The proposed approach shows significant improvement in accuracy with minimal queries posed to the expert and fast online training, as tested on the MIT-BIH Arrhythmia Database and the MIT-BIH Supra-ventricular Arrhythmia Database (SVDB). With 100 queries labeled by the expert in phase III, the method achieves an accuracy of 99.5% in "S" versus all classifications (SVEB) and 99.4% accuracy in "V" versus all classifications (VEB) on the MIT-BIH Arrhythmia Database. Similarly, accuracies of 97.5% for SVEB and 98.6% for VEB are achieved on the SVDB database. Graphical abstract: Deep belief network augmented by active learning for efficient prediction of arrhythmia.
NASA Astrophysics Data System (ADS)
Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.
2018-04-01
The purpose of carrying out national geographic conditions monitoring is to obtain information on surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprises and the public. Land cover data contain detailed geographic conditions information and have therefore been listed as one of the important achievements of the national geographic conditions monitoring project. At present, the main issue in the production of land cover data is how to improve the classification accuracy. For land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection in the project has been based mainly on human-computer interaction or manual inspection, which are time-consuming and laborious. By harnessing automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out a classification accuracy inspection test of land cover data in the project and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update errors, effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of land cover data.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network’s initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data. PMID:27304987
NASA Astrophysics Data System (ADS)
Saad, S. M.; Shakaff, A. Y. M.; Saad, A. R. M.; Yusof, A. M.; Andrew, A. M.; Zakaria, A.; Adom, A. H.
2017-03-01
Various sources influence indoor air quality (IAQ) and can emit dangerous gases such as carbon monoxide (CO), carbon dioxide (CO2), and ozone (O3), as well as particulate matter. These gases are usually safe to breathe in small quantities, but when their amounts exceed the safe level they can be hazardous to humans, especially children and people with asthma. Therefore, a smart indoor air quality monitoring system (IAQMS) is needed that can tell the occupants which sources trigger the indoor air pollution. In this project, an IAQMS that can classify the sources influencing IAQ has been developed. The IAQMS applies a classification method based on a Probabilistic Neural Network (PNN). It is used to classify the sources of indoor air pollution based on five conditions: ambient air, human activity, presence of chemical products, presence of food and beverage, and presence of fragrance. To obtain the best classification accuracy, several feature selection approaches based on data pre-processing methods were analyzed for their ability to discriminate among the sources. The output from each data pre-processing method was used as the input for the neural network. The results show that PNN analysis with the data pre-processing method gives a good classification accuracy of 99.89% and can classify the sources influencing IAQ at a high classification rate.
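A Probabilistic Neural Network is, in its basic form, a Parzen-window classifier: each class is scored by the average Gaussian kernel response of its training patterns to the query, and the highest-scoring class wins. The sketch below follows that basic form; the smoothing parameter `sigma`, the two-feature sensor readings, and the function name `pnn_classify` are assumptions for illustration and are not taken from the study.

```python
import math

def pnn_classify(train, query, sigma=0.5):
    """Score each class by its mean Gaussian kernel response to the query.

    train maps a class label to a list of feature tuples.
    Returns the winning label and the per-class scores.
    """
    scores = {}
    for label, patterns in train.items():
        kernel_sum = sum(
            math.exp(-math.dist(p, query) ** 2 / (2 * sigma ** 2))
            for p in patterns
        )
        scores[label] = kernel_sum / len(patterns)
    return max(scores, key=scores.get), scores

# Hypothetical normalized sensor readings for two of the five conditions.
train = {
    "ambient air": [(0.10, 0.20), (0.15, 0.25), (0.12, 0.18)],
    "fragrance": [(0.80, 0.70), (0.85, 0.75), (0.78, 0.72)],
}

label, scores = pnn_classify(train, (0.82, 0.71))
```

The choice of `sigma` controls how sharply each training pattern influences the decision; it would normally be tuned on held-out data.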
Saha, Monjoy; Chakraborty, Chandan
2018-05-01
We present an efficient deep learning framework for identifying, segmenting, and classifying cell membranes and nuclei from human epidermal growth factor receptor-2 (HER2)-stained breast cancer images with minimal user intervention. This is a long-standing issue for pathologists because the manual quantification of HER2 is error-prone, costly, and time-consuming. Hence, we propose a deep learning-based HER2 deep neural network (Her2Net) to solve this issue. The convolutional and deconvolutional parts of the proposed Her2Net framework consisted mainly of multiple convolution layers, max-pooling layers, spatial pyramid pooling layers, deconvolution layers, up-sampling layers, and trapezoidal long short-term memory (TLSTM). A fully connected layer and a softmax layer were also used for classification and error estimation. Finally, HER2 scores were calculated based on the classification results. The main contribution of our proposed Her2Net framework includes the implementation of TLSTM and a deep learning framework for cell membrane and nucleus detection, segmentation, and classification and HER2 scoring. Our proposed Her2Net achieved 96.64% precision, 96.79% recall, 96.71% F-score, 93.08% negative predictive value, 98.33% accuracy, and a 6.84% false-positive rate. Our results demonstrate the high accuracy and wide applicability of the proposed Her2Net in the context of HER2 scoring for breast cancer evaluation.
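The reported precision, recall, and F-score can be checked for mutual consistency, assuming the F-score is the usual harmonic mean of precision and recall (F1); the abstract does not state which F-score variant was used.

```python
# Values as reported in the abstract, in percent.
precision = 96.64
recall = 96.79

# F1 is the harmonic mean of precision and recall (assumed variant).
f_score = 2 * precision * recall / (precision + recall)
f_score_rounded = round(f_score, 2)  # 96.71, matching the reported F-score
```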
Hierarchical vs non-hierarchical audio indexation and classification for video genres
NASA Astrophysics Data System (ADS)
Dammak, Nouha; BenAyed, Yassine
2018-04-01
In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based only on audio features extracted at block level, which has the prominent advantage of capturing local temporal information. The main contribution of our study is to show the strong effect on classification accuracy of using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. The classification covers three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The approach was validated on over 360 minutes of video, yielding a classification accuracy of over 99%.
Cognitive-motivational deficits in ADHD: development of a classification system.
Gupta, Rashmi; Kar, Bhoomika R; Srinivasan, Narayanan
2011-01-01
The classification systems developed so far to detect attention deficit/hyperactivity disorder (ADHD) do not have high sensitivity and specificity. We have developed a classification system based on several neuropsychological tests that measure cognitive-motivational functions that are specifically impaired in ADHD children. A total of 240 children (120 ADHD children and 120 healthy controls) in the age range of 6-9 years and 32 Oppositional Defiant Disorder (ODD) children (aged 9 years) participated in the study. Stop-Signal, Task-Switching, Attentional Network, and Choice Delay tests were administered to all the participants. Receiver operating characteristic (ROC) analysis indicated that percentage choice of long-delay reward best distinguished the ADHD children from healthy controls. Single parameters were not helpful in making a differential classification of ADHD with ODD. Multinomial logistic regression (MLR) was performed with multiple parameters (data fusion), which produced improved overall classification accuracy. A combination of stop-signal reaction time, post-error slowing, mean delay, switch cost, and percentage choice of long-delay reward produced an overall classification accuracy of 97.8%; with internal validation, the overall accuracy was 92.2%. Combining parameters from different tests of control functions not only enabled us to accurately distinguish ADHD children from healthy controls but also to make a differential classification with ODD. These results have implications for the theories of ADHD.
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is pronounced in Mediterranean regions and complicates many satellite image classification methods. To some extent, high spatial resolution data can overcome such complications. To improve classification performance in Land Use Land Cover (LULC) mapping, the current research compares different classification methods using Sentinel-2 satellite imagery as a source of high spatial resolution data. Both pixel-based and object-based classification algorithms were assessed: the pixel-based approach employs the Maximum Likelihood (ML), Artificial Neural Network (ANN), and Support Vector Machine (SVM) algorithms, and the object-based classification uses the Nearest Neighbour (NN) classifier. A Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall, distinguishing most of the classes with the highest accuracy. NN dealt well with artificial surface classes in general, while agriculture area classes and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy than the SMP method overall with both classifiers used, ML and SVM.
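The overall and individual (per-class) accuracies referred to here are conventionally read off a confusion matrix. The sketch below shows that calculation under the assumption that rows hold reference classes and columns hold predicted classes; the matrix values are hypothetical, not results from the study.

```python
def overall_accuracy(cm):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_accuracy(cm):
    """Per-class accuracy relative to the reference totals
    (rows = reference classes, an assumption of this sketch)."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


# Hypothetical 3-class confusion matrix (rows: reference, columns: predicted).
cm = [
    [45, 3, 2],
    [5, 40, 5],
    [0, 10, 40],
]
oa = overall_accuracy(cm)
pca = per_class_accuracy(cm)
print(oa, pca)
```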
Evaluation of airborne image data for mapping riparian vegetation within the Grand Canyon
Davis, Philip A.; Staid, Matthew I.; Plescia, Jeffrey B.; Johnson, Jeffrey R.
2002-01-01
This study examined various types of remote-sensing data that have been acquired during a 12-month period over a portion of the Colorado River corridor to determine the type of data and conditions for data acquisition that provide the optimum classification results for mapping riparian vegetation. Issues related to vegetation mapping included time of year, number and positions of wavelength bands, and spatial resolution for data acquisition to produce accurate vegetation maps versus cost of data. Image data considered in the study consisted of scanned color-infrared (CIR) film, digital CIR, and digital multispectral data, with resolutions ranging from 11 cm (photographic film) to 100 cm (multispectral), acquired during the spring, summer, and fall seasons of 2000 for five long-term monitoring sites containing riparian vegetation. Results show that digitally acquired data produce higher and more consistent classification accuracies for mapping vegetation units than do film products. The highest accuracies were obtained from nine-band multispectral data; however, a four-band subset of these data that did not include short-wave infrared bands produced comparable mapping results. The four-band subset consisted of the wavelength bands 0.52-0.59 µm, 0.59-0.62 µm, 0.67-0.72 µm, and 0.73-0.85 µm. Use of only three of these bands that simulate digital CIR sensors produced accuracies for several vegetation units that were 10% lower than those obtained using the full multispectral data set. Classification tests using band ratios produced lower accuracies than those using band reflectance for scanned film data, a result attributed to the relatively poor radiometric fidelity maintained by the film scanning process, whereas calibrated multispectral data produced similar classification accuracies using band reflectance and band ratios.
This suggests that the intrinsic band reflectance of the vegetation is more important than inter-band reflectance differences in attaining high mapping accuracies. These results also indicate that radiometrically calibrated sensors that record a wide range of radiance produce superior results and that such sensors should be used for monitoring purposes. When texture (spatial variance) at near-infrared wavelength is combined with spectral data in classification, accuracy increased most markedly (20-30%) for the highest resolution (11-cm) CIR film data, but decreased in its effect on accuracy in lower-resolution multi-spectral image data; a result observed in previous studies (Franklin and McDermid 1993, Franklin et al. 2000, 2001). While many classification unit accuracies obtained from the 11-cm film CIR band with texture data were in fact higher than those produced using the 100-cm, nine-band multispectral data with texture, the 11-cm film CIR data produced much lower accuracies than the 100-cm multispectral data for the more sparsely populated vegetation units due to saturation of picture elements during the film scanning process in vegetation units with a high proportion of alluvium. Overall classification accuracies obtained from spectral band and texture data range from 36% to 78% for all databases considered, from 57% to 71% for the 11-cm film CIR data, and from 54% to 78% for the 100-cm multispectral data. Classification results obtained from 20-cm film CIR band and texture data, which were produced by applying a Gaussian filter to the 11-cm film CIR data, showed increases in accuracy due to texture that were similar to those observed using the original 11-cm film CIR data. This suggests that data can be collected at the lower resolution and still retain the added power of vegetation texture. 
Classification accuracies for the riparian vegetation units examined in this study do not appear to be influenced by season of data acquisition, although data acquired under direct sunlight produced higher overall accuracies than data acquired under overcast conditions. The latter observation, in addition to the importance of band reflectance for classification, implies that data should be acquired near summer solstice when sun elevation and reflectance is highest and when shadows cast by steep canyon walls are minimized.
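The abstract above describes texture as spatial variance at a near-infrared wavelength. A minimal sketch of such a texture measure follows, assuming a square moving window that is clipped at the image edges; the window size and the toy image are illustrative choices, not parameters from the study.

```python
def local_variance(image, row, col, half=1):
    """Population variance of the pixel values inside a
    (2*half+1) x (2*half+1) window centred on (row, col).
    The window is clipped at the image borders."""
    rows = range(max(0, row - half), min(len(image), row + half + 1))
    cols = range(max(0, col - half), min(len(image[0]), col + half + 1))
    values = [image[r][c] for r in rows for c in cols]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Toy near-infrared band: one bright pixel on a dark background.
nir = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
texture_centre = local_variance(nir, 1, 1)
print(texture_centre)
```

The resulting texture value can be stacked with the spectral bands as an extra feature per pixel before classification.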
Abdolali, Fatemeh; Zoroofi, Reza Aghaeizadeh; Otake, Yoshito; Sato, Yoshinobu
2017-02-01
Accurate detection of maxillofacial cysts is an essential step for diagnosis, monitoring and planning therapeutic intervention. Cysts can be of various sizes and shapes and existing detection methods lead to poor results. Customizing automatic detection systems to gain sufficient accuracy in clinical practice is highly challenging. For this purpose, integrating engineering knowledge in efficient feature extraction is essential. This paper presents a novel framework for maxillofacial cyst detection. A hybrid methodology based on surface and texture information is introduced. The proposed approach consists of three main steps: first, each cystic lesion is segmented with high accuracy; then, in the second and third steps, feature extraction and classification are performed. Contourlet and SPHARM coefficients are utilized as texture and shape features which are fed into the classifier. Two different classifiers are used in this study, i.e. support vector machine and sparse discriminant analysis. Generally, SPHARM coefficients are estimated by the iterative residual fitting (IRF) algorithm, which is based on the stepwise regression method. In order to improve the accuracy of IRF estimation, a method based on extra orthogonalization is employed to reduce linear dependency. We have utilized a ground-truth dataset consisting of cone beam CT images of 96 patients, belonging to three maxillofacial cyst categories: radicular cyst, dentigerous cyst and keratocystic odontogenic tumor. Using orthogonalized SPHARM, the residual sum of squares is decreased, which leads to a more accurate estimation. Analysis of the results based on statistical measures such as specificity, sensitivity, positive predictive value and negative predictive value is reported. A classification rate of 96.48% is achieved using sparse discriminant analysis and orthogonalized SPHARM features. Classification accuracy improved by at least 8.94% with respect to conventional features.
This study demonstrated that our proposed methodology can improve the computer assisted diagnosis (CAD) performance by incorporating more discriminative features. Using orthogonalized SPHARM is promising in computerized cyst detection and may have a significant impact in future CAD systems.
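The statistical measures named in this abstract (sensitivity, specificity, positive and negative predictive value) follow directly from the four cells of a binary confusion table. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from the study.

```python
def diagnostic_measures(tp, fp, tn, fn):
    """Standard binary diagnostic measures from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for one class versus the rest.
m = diagnostic_measures(tp=40, fp=5, tn=45, fn=10)
print(m)
```

For a three-category problem such as the one described, these measures would typically be computed per class in a one-versus-rest fashion, which is an assumption of this note rather than a detail given in the abstract.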
Use of collateral information to improve LANDSAT classification accuracies
NASA Technical Reports Server (NTRS)
Strahler, A. H. (Principal Investigator)
1981-01-01
Methods to improve LANDSAT classification accuracies were investigated including: (1) the use of prior probabilities in maximum likelihood classification as a methodology to integrate discrete collateral data with continuously measured image density variables; (2) the use of the logit classifier as an alternative to multivariate normal classification that permits mixing both continuous and categorical variables in a single model and fits empirical distributions of observations more closely than the multivariate normal density function; and (3) the use of collateral data in a geographic information system as exercised to model a desired output information layer as a function of input layers of raster format collateral and image data base layers.
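Item (1) above, prior probabilities in maximum likelihood classification, can be sketched as follows. The example assumes a single image band with a Gaussian density per class, and the class names, means, variances and priors are all invented. It only illustrates how a prior derived from collateral data can shift the decision.

```python
import math


def gaussian_log_density(x, mean, var):
    """Log of the univariate normal density."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def classify(x, classes, priors):
    """Pick the class maximising log prior + log likelihood.
    classes: {name: (mean, variance)}; priors: {name: probability}."""
    return max(
        classes,
        key=lambda k: math.log(priors[k]) + gaussian_log_density(x, *classes[k]),
    )


# Hypothetical single-band class statistics.
classes = {"forest": (40.0, 25.0), "water": (60.0, 25.0)}
equal_priors = {"forest": 0.5, "water": 0.5}
# Collateral data (e.g. terrain) might make forest far more likely here.
informed_priors = {"forest": 0.95, "water": 0.05}

label_equal = classify(51.0, classes, equal_priors)
label_informed = classify(51.0, classes, informed_priors)
print(label_equal, label_informed)
```

With equal priors the pixel value 51 is assigned to the spectrally closer class; with the informed priors the assignment changes, which is the mechanism by which discrete collateral data enter the classifier.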
NASA Astrophysics Data System (ADS)
Szuflitowska, B.; Orlowski, P.
2017-08-01
The automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathological activity. The EEG sequences were analyzed using the Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. A classification error below 10% was considered a success. Higher accuracy was obtained for new data of unknown classes than for the testing data. The methodology can be helpful in differentiating epileptic seizures from disturbances in the EEG signal in Alzheimer's Disease.
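The feature-extraction step described here can be sketched with a bare-bones Short-Time Fourier Transform. This version uses a rectangular window and a direct DFT for clarity; the window length, hop size and test signal are assumptions of the example, and the classification step (Linear Discriminant Analysis) is not shown.

```python
import cmath
import math


def stft_magnitudes(signal, window=8, hop=4):
    """Magnitude spectra of successive, overlapping frames.
    Rectangular window, direct DFT, bins 0..window//2."""
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        seg = signal[start:start + window]
        spectrum = [
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / window)
                    for n in range(window)))
            for k in range(window // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


# Synthetic signal: a sinusoid completing 2 cycles per 8-sample window.
signal = [math.sin(2 * math.pi * 2 * n / 8) for n in range(16)]
frames = stft_magnitudes(signal)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
print(len(frames), peak_bins)
```

Each frame's magnitude vector could then serve as a feature vector for a classifier.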
NASA Astrophysics Data System (ADS)
Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan
2015-12-01
In this paper, multi-kernel learning (MKL) is used for drug-related webpage classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, a text-based BOW model is used to generate the text representation, and an image-based BOW model is used to generate the image representation. Last, the text and image representations are fused with several methods. Experimental results demonstrate that the classification accuracy of MKL is higher than those of all other fusion methods at the decision level and feature level, and much higher than the accuracy of single-modal classification.
Landcover classification in MRF context using Dempster-Shafer fusion for multisensor imagery.
Sarkar, Anjan; Banerjee, Anjan; Banerjee, Nilanjan; Brahma, Siddhartha; Kartikeyan, B; Chakraborty, Manab; Majumder, K L
2005-05-01
This work deals with multisensor data fusion to obtain landcover classification. The role of feature-level fusion using the Dempster-Shafer rule and that of data-level fusion in the MRF context is studied in this paper to obtain an optimally segmented image. Subsequently, segments are validated and classification accuracy for the test data is evaluated. Two examples of data fusion of optical images and a synthetic aperture radar image are presented, each set having been acquired on different dates. Classification accuracies of the technique proposed are compared with those of some recent techniques in literature for the same image data.
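The Dempster-Shafer rule used for fusion in this work combines two mass functions defined over subsets of a common set of hypotheses. The following is a minimal sketch of Dempster's rule of combination. The two-class frame and the mass values assigned to the optical and SAR sources are hypothetical, and the MRF part of the method is not represented.

```python
def combine(m1, m2):
    """Dempster's rule: multiply masses of intersecting focal sets,
    discard the conflicting mass, and renormalise.
    m1, m2: dict mapping frozenset of hypotheses -> mass."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


WATER = frozenset({"water"})
FOREST = frozenset({"forest"})
EITHER = WATER | FOREST  # the full frame expresses ignorance

# Hypothetical evidence from two sensors for one pixel.
optical = {WATER: 0.6, EITHER: 0.4}
sar = {WATER: 0.5, FOREST: 0.3, EITHER: 0.2}
fused = combine(optical, sar)
print(fused)
```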
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.
Application of Sensor Fusion to Improve UAV Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects, including those based on UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality, which in turn increases the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on board in UAV missions and performing image fusion can help achieve higher quality images and, accordingly, higher accuracy classification results.
Convolutional Neural Network for Histopathological Analysis of Osteosarcoma.
Mishra, Rashika; Daescu, Ovidiu; Leavey, Patrick; Rakheja, Dinesh; Sengupta, Anita
2018-03-01
Pathologists often deal with high complexity and sometimes disagreement over osteosarcoma tumor classification due to cellular heterogeneity in the dataset. Segmentation and classification of histology tissue in H&E stained tumor image datasets is a challenging task because of intra-class variations, inter-class similarity, crowded context, and noisy data. In recent years, deep learning approaches have led to encouraging results in breast cancer and prostate cancer analysis. In this article, we propose a convolutional neural network (CNN) as a tool to improve the efficiency and accuracy of osteosarcoma tumor classification into tumor classes (viable tumor, necrosis) versus nontumor. The proposed CNN architecture contains eight learned layers: three sets of two stacked convolutional layers interspersed with max pooling layers for feature extraction, and two fully connected layers, with data augmentation strategies to boost performance. The use of a neural network results in a higher classification accuracy, 92% on average. We compare the proposed architecture with three existing and proven CNN architectures for image classification: AlexNet, LeNet, and VGGNet. We also provide a pipeline to calculate percentage necrosis in a given whole slide image. We conclude that the use of neural networks can assure both high accuracy and efficiency in osteosarcoma classification.
Lu, Dengsheng; Batistella, Mateus; de Miranda, Evaristo E; Moran, Emilio
2008-01-01
Complex forest structure and abundant tree species in the moist tropical regions often cause difficulties in classifying vegetation classes with remotely sensed data. This paper explores improvement in vegetation classification accuracies through a comparative study of different image combinations based on the integration of Landsat Thematic Mapper (TM) and SPOT High Resolution Geometric (HRG) instrument data, as well as the combination of spectral signatures and textures. A maximum likelihood classifier was used to classify the different image combinations into thematic maps. This research indicated that data fusion based on HRG multispectral and panchromatic data slightly improved vegetation classification accuracies: a 3.1 to 4.6 percent increase in the kappa coefficient compared with the classification results based on original HRG or TM multispectral images. A combination of HRG spectral signatures and two textural images improved the kappa coefficient by 6.3 percent compared with pure HRG multispectral images. The textural images based on entropy or second-moment texture measures with a window size of 9 pixels × 9 pixels played an important role in improving vegetation classification accuracy. Overall, optical remote-sensing data are still insufficient for accurate vegetation classifications in the Amazon basin.
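The kappa coefficient used above to compare image combinations measures agreement beyond what chance would produce. A minimal sketch of Cohen's kappa computed from a confusion matrix is given below; the matrix is hypothetical and unrelated to the study's results.

```python
def kappa(cm):
    """Cohen's kappa from a square confusion matrix:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = sum(sum(row) for row in cm)
    size = len(cm)
    observed = sum(cm[i][i] for i in range(size)) / n
    chance = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(size)
    ) / (n * n)
    return (observed - chance) / (1 - chance)


# Hypothetical 2-class confusion matrix (rows: reference, columns: mapped).
cm = [
    [40, 10],
    [5, 45],
]
k = kappa(cm)
print(k)
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is 0.70 even though 85% of samples are correctly labelled.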
Automated detection of breast cancer in resected specimens with fluorescence lifetime imaging
NASA Astrophysics Data System (ADS)
Phipps, Jennifer E.; Gorpas, Dimitris; Unger, Jakob; Darrow, Morgan; Bold, Richard J.; Marcu, Laura
2018-01-01
Re-excision rates for breast cancer lumpectomy procedures are currently nearly 25% due to surgeons relying on inaccurate or incomplete methods of evaluating specimen margins. The objective of this study was to determine if cancer could be automatically detected in breast specimens from mastectomy and lumpectomy procedures by a classification algorithm that incorporated parameters derived from fluorescence lifetime imaging (FLIm). This study generated a database of co-registered histologic sections and FLIm data from breast cancer specimens (N = 20) and a support vector machine (SVM) classification algorithm able to automatically detect cancerous, fibrous, and adipose breast tissue. Classification accuracies were greater than 97% for automated detection of cancerous, fibrous, and adipose tissue from breast cancer specimens. The classification worked equally well for specimens scanned by hand or with a mechanical stage, demonstrating that the system could be used during surgery or on excised specimens. The ability of this technique to simply discriminate between cancerous and normal breast tissue, in particular to distinguish fibrous breast tissue from tumor, which is notoriously challenging for optical techniques, leads to the conclusion that FLIm has great potential to assess breast cancer margins. Identification of positive margins before waiting for complete histologic analysis could significantly reduce breast cancer re-excision rates.
Zhong, Hua; Redo-Sanchez, Albert; Zhang, X-C
2006-10-02
We present terahertz (THz) reflective spectroscopic focal-plane imaging of four explosive and bio-chemical materials (2,4-DNT, Theophylline, RDX and Glutamic Acid) at a standoff imaging distance of 0.4 m. The two-dimensional (2-D) nature of this technique enables a fast acquisition time and operation very close to that of a camera, compared to the most commonly used point emission-detection and raster scanning configuration. The samples are identified by their absorption peaks extracted from the negative derivative of the reflection coefficient with respect to frequency (-dr/dv) at each pixel. Classification of the samples is achieved using a minimum distance classifier and neural network methods, with an accuracy above 80% and a false alarm rate below 8%. This result supports the future application of THz time-domain spectroscopy (TDS) in standoff distance sensing, imaging, and identification.
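A minimum distance classifier of the kind named in this abstract assigns each sample to the class whose mean feature vector is nearest. The sketch below uses Euclidean distance and two-element feature vectors; the class labels and numbers are placeholders, not spectral features from the experiment.

```python
def class_means(training):
    """Mean feature vector per class.
    training: {label: list of feature vectors}."""
    means = {}
    for label, vectors in training.items():
        dim = len(vectors[0])
        means[label] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return means


def classify(x, means):
    """Label of the class mean closest to x (squared Euclidean distance)."""
    return min(
        means,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, means[label])),
    )


# Placeholder training data for two hypothetical materials.
training = {
    "material_A": [[1.0, 1.0], [3.0, 1.0]],
    "material_B": [[8.0, 9.0], [10.0, 9.0]],
}
means = class_means(training)
label_1 = classify([3.0, 2.0], means)
label_2 = classify([7.0, 8.0], means)
print(label_1, label_2)
```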
Ralston, Barbara E.; Davis, Philip A.; Weber, Robert M.; Rundall, Jill M.
2008-01-01
A vegetation database of the riparian vegetation located within the Colorado River ecosystem (CRE), a subsection of the Colorado River between Glen Canyon Dam and the western boundary of Grand Canyon National Park, was constructed using four-band image mosaics acquired in May 2002. A digital line scanner was flown over the Colorado River corridor in Arizona by ISTAR Americas, using a Leica ADS-40 digital camera to acquire a digital surface model and four-band image mosaics (blue, green, red, and near-infrared) for vegetation mapping. The primary objective of this mapping project was to develop a digital inventory map of vegetation to enable patch- and landscape-scale change detection, and to establish randomized sampling points for ground surveys of terrestrial fauna (principally, but not exclusively, birds). The vegetation base map was constructed through a combination of ground surveys to identify vegetation classes, image processing, and automated supervised classification procedures. Analysis of the imagery and subsequent supervised classification involved multiple steps to evaluate band quality, band ratios, and vegetation texture and density. Identification of vegetation classes involved collection of cover data throughout the river corridor and subsequent analysis using two-way indicator species analysis (TWINSPAN). Vegetation was classified into six vegetation classes, following the National Vegetation Classification Standard, based on cover dominance. This analysis indicated that total area covered by all vegetation within the CRE was 3,346 ha. Considering the six vegetation classes, the sparse shrub (SS) class accounted for the greatest amount of vegetation (627 ha) followed by Pluchea (PLSE) and Tamarix (TARA) at 494 and 366 ha, respectively. The wetland (WTLD) and Prosopis-Acacia (PRGL) classes both had similar areal cover values (227 and 213 ha, respectively). Baccharis-Salix (BAXX) was the least represented at 94 ha. 
Accuracy assessment of the supervised classification determined that accuracies varied among vegetation classes from 90% to 49%. Low accuracies were caused by similar spectral signatures among vegetation classes. Fuzzy accuracy assessment improved classification accuracies such that Federal mapping standards of 80% accuracies for all classes were met. The scale used to quantify vegetation adequately meets the needs of the stakeholder group. Increasing the scale to meet the U.S. Geological Survey (USGS)-National Park Service (NPS) National Mapping Program's minimum mapping unit of 0.5 ha is unwarranted because this scale would reduce the resolution of some classes (e.g., seep willow/coyote willow would likely be combined with tamarisk). While this would undoubtedly improve classification accuracies, it would not provide the community-level information about vegetation change that would benefit stakeholders. The identification of vegetation classes should follow NPS mapping approaches to complement the national effort and should incorporate the alternative analysis for community identification that is being incorporated into newer NPS mapping efforts. National Vegetation Classification is followed in this report for association- to formation-level categories. Accuracies could be improved by including more environmental variables such as stage elevation in the classification process and incorporating object-based classification methods. Another approach that may address the heterogeneous species issue and classification is to use spectral mixing analysis to estimate the fractional cover of species within each pixel and better quantify the cover of individual species that compose a cover class. Varying flights to capture vegetation at different times of the year might also help separate some vegetation classes, though the cost may be prohibitive. Lastly, photointerpretation instead of automated mapping could be tried.
Photointerpretation would likely not improve accuracies in this case, howev
NASA Astrophysics Data System (ADS)
Law, Yan Nei; Lieng, Monica Keiko; Li, Jingmei; Khoo, David Aik-Aun
2014-03-01
Breast cancer is the most common cancer and second leading cause of cancer death among women in the US. The relative survival rate is lower among women with a more advanced stage at diagnosis. Early detection through screening is vital. Mammography is the most widely used and only proven screening method for reliably and effectively detecting abnormal breast tissues. In particular, mammographic density is one of the strongest breast cancer risk factors, after age and gender, and can be used to assess the future risk of disease before individuals become symptomatic. A reliable method for automatic density assessment would be beneficial and could assist radiologists in the evaluation of mammograms. To address this problem, we propose a density classification method which uses statistical features from different parts of the breast. Our method is composed of three parts: breast region identification, feature extraction and building ensemble classifiers for density assessment. It explores the potential of the features extracted from second and higher order statistical information for mammographic density classification. We further investigate the registration of bilateral pairs and time-series of mammograms. The experimental results on 322 mammograms demonstrate that (1) a classifier using features from dense regions has higher discriminative power than a classifier using only features from the whole breast region; (2) these high-order features can be effectively combined to boost the classification accuracy; (3) a classifier using these statistical features from dense regions achieves 75% accuracy, which is a significant improvement from 70% accuracy obtained by the existing approaches.
Colorectal cancer detection by hyperspectral imaging using fluorescence excitation scanning
NASA Astrophysics Data System (ADS)
Leavesley, Silas J.; Deal, Joshua; Hill, Shante; Martin, Will A.; Lall, Malvika; Lopez, Carmen; Rider, Paul F.; Rich, Thomas C.; Boudreaux, Carole W.
2018-02-01
Hyperspectral imaging technologies have shown great promise for biomedical applications. These techniques have been especially useful for detection of molecular events and characterization of cell, tissue, and biomaterial composition. Unfortunately, hyperspectral imaging technologies have been slow to translate to clinical devices - likely due to increased cost and complexity of the technology as well as long acquisition times often required to sample a spectral image. We have demonstrated that hyperspectral imaging approaches which scan the fluorescence excitation spectrum can provide increased signal strength and faster imaging, compared to traditional emission-scanning approaches. We have also demonstrated that excitation-scanning approaches may be able to detect spectral differences between colonic adenomas and adenocarcinomas and normal mucosa in flash-frozen tissues. Here, we report feasibility results from using excitation-scanning hyperspectral imaging to screen pairs of fresh tumoral and nontumoral colorectal tissues. Tissues were imaged using a novel hyperspectral imaging fluorescence excitation scanning microscope, sampling a wavelength range of 360-550 nm, at 5 nm increments. Image data were corrected to achieve a NIST-traceable flat spectral response. Image data were then analyzed using a range of supervised and unsupervised classification approaches within ENVI software (Harris Geospatial Solutions). Supervised classification resulted in >99% accuracy for single-patient image data, but only 64% accuracy for multi-patient classification (n=9 to date), with the drop in accuracy due to increased false-positive detection rates. Hence, initial data indicate that this approach may be a viable detection approach, but that larger patient sample sizes need to be evaluated and the effects of inter-patient variability studied.
Kaewkamnerd, Saowaluck; Uthaipibull, Chairat; Intarapanich, Apichart; Pannarut, Montri; Chaotheing, Sastra; Tongsima, Sissades
2012-01-01
Current malaria diagnosis relies primarily on microscopic examination of Giemsa-stained thick and thin blood films. This method requires vigorously trained technicians to efficiently detect and classify the malaria parasite species such as Plasmodium falciparum (Pf) and Plasmodium vivax (Pv) for an appropriate drug administration. However, accurate classification of parasite species is difficult to achieve because of inherent technical limitations and human inconsistency. To improve performance of malaria parasite classification, many researchers have proposed automated malaria detection devices using digital image analysis. These image processing tools, however, focus on detection of parasites on thin blood films, which may not detect the existence of parasites due to the parasite scarcity on the thin blood film. The problem is aggravated with low parasitemia condition. Automated detection and classification of parasites on thick blood films, which contain more numbers of parasite per detection area, would address the previous limitation. The prototype of an automatic malaria parasite identification system is equipped with mountable motorized units for controlling the movements of objective lens and microscope stage. This unit was tested for its precision to move objective lens (vertical movement, z-axis) and microscope stage (in x- and y-horizontal movements). The average precision of x-, y- and z-axes movements were 71.481 ± 7.266 μm, 40.009 ± 0.000 μm, and 7.540 ± 0.889 nm, respectively. Classification of parasites on 60 Giemsa-stained thick blood films (40 blood films containing infected red blood cells and 20 control blood films of normal red blood cells) was tested using the image analysis module. By comparing our results with the ones verified by trained malaria microscopists, the prototype detected parasite-positive and parasite-negative blood films at the rate of 95% and 68.5% accuracy, respectively. 
For classification performance, the thick blood films with Pv parasites were correctly classified with a success rate of 75%, while the accuracy of Pf classification was 90%. This work presents an automatic device for both detection and classification of malaria parasite species on thick blood films. The system is based on digital image analysis and features motorized stage units designed to be easily mounted on most conventional light microscopes used in endemic areas. The constructed motorized module could control the movements of the objective lens and microscope stage at high precision for effective acquisition of quality images for analysis. The analysis program could accurately classify parasite species, into Pf or Pv, based on distribution of chromatin size.
Spatial modeling and classification of corneal shape.
Marsolo, Keith; Twa, Michael; Bullimore, Mark A; Parthasarathy, Srinivasan
2007-03-01
One of the most promising applications of data mining is in biomedical data used in patient diagnosis. Any method of data analysis intended to support the clinical decision-making process should meet several criteria: it should capture clinically relevant features, be computationally feasible, and provide easily interpretable results. In an initial study, we examined the feasibility of using Zernike polynomials to represent biomedical instrument data in conjunction with a decision tree classifier to distinguish between the diseased and non-diseased eyes. Here, we provide a comprehensive follow-up to that work, examining a second representation, pseudo-Zernike polynomials, to determine whether they provide any increase in classification accuracy. We compare the fidelity of both methods using residual root-mean-square (rms) error and evaluate accuracy using several classifiers: neural networks, C4.5 decision trees, Voting Feature Intervals, and Naïve Bayes. We also examine the effect of several meta-learning strategies: boosting, bagging, and Random Forests (RFs). We present results comparing accuracy as it relates to dataset and transformation resolution over a larger, more challenging, multi-class dataset. They show that classification accuracy is similar for both data transformations, but differs by classifier. We find that the Zernike polynomials provide better feature representation than the pseudo-Zernikes and that the decision trees yield the best balance of classification accuracy and interpretability.
AVHRR channel selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Mapping land cover of large regions often requires processing of satellite images collected from several time periods at many spectral wavelength channels. However, manipulating and processing large amounts of image data increases the complexity and time, and hence the cost, that it takes to produce a land cover map. Very few studies have evaluated the importance of individual Advanced Very High Resolution Radiometer (AVHRR) channels for discriminating cover types, especially the thermal channels (channels 3, 4 and 5). Studies rarely perform a multi-year analysis to determine the impact of inter-annual variability on the classification results. We evaluated 5 years of AVHRR data using combinations of the original AVHRR spectral channels (1-5) to determine which channels are most important for cover type discrimination, yet stabilize inter-annual variability. Particular attention was placed on the channels in the thermal portion of the spectrum. Fourteen cover types over the entire state of Colorado were evaluated using a supervised classification approach on all two-, three-, four- and five-channel combinations for seven AVHRR biweekly composite datasets covering the entire growing season for each of 5 years. Results show that all three of the major portions of the electromagnetic spectrum represented by the AVHRR sensor are required to discriminate cover types effectively and stabilize inter-annual variability. Of the two-channel combinations, channels 1 (red visible) and 2 (near-infrared) had, by far, the highest average overall accuracy (72.2%), yet the inter-annual classification accuracies were highly variable. Including a thermal channel (channel 4) significantly increased the average overall classification accuracy by 5.5% and stabilized interannual variability. 
Each of the thermal channels gave similar classification accuracies; however, because of the problems in consistently interpreting channel 3 data, either channel 4 or 5 was found to be a more appropriate choice. Substituting the thermal channel with a single elevation layer resulted in equivalent classification accuracies and inter-annual variability.
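The channel-combination experiment described above can be sketched as a simple enumeration. This is only an illustration of how many candidate channel sets the two- to five-channel design implies; the function name `channel_combinations` is hypothetical and nothing here reproduces the authors' classification workflow.

```python
from itertools import combinations

CHANNELS = (1, 2, 3, 4, 5)  # AVHRR channels named in the abstract
THERMAL = {3, 4, 5}         # thermal channels named in the abstract


def channel_combinations(channels=CHANNELS, min_size=2):
    """Return every subset of `channels` with at least `min_size` members."""
    combos = []
    for size in range(min_size, len(channels) + 1):
        combos.extend(combinations(channels, size))
    return combos


combos = channel_combinations()
# Subsets that include at least one thermal channel.
with_thermal = [c for c in combos if THERMAL & set(c)]

print(len(combos))        # 10 + 10 + 5 + 1 candidate channel sets
print(len(with_thermal))  # every set except (1, 2)
```

Each candidate set would then be classified once per composite period and year, which is what makes the full design expensive.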
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can be applied to both classification and regression problems, whether linear or, by means of kernels, nonlinear. However, SVM has a weakness: its optimal parameter values are difficult to determine. SVM calculates the best linear separator on the input feature space according to the training data. To classify data that are not linearly separable, SVM uses the kernel trick to map the data into a higher-dimensional feature space in which they become linearly separable. The kernel trick uses various kinds of kernel functions, such as linear, polynomial, radial basis function (RBF) and sigmoid kernels. Each function has parameters that affect the accuracy of SVM classification. To solve this problem, genetic algorithms are proposed as the search algorithm for the optimal parameter values, thereby increasing the classification accuracy of SVM. The data are taken from the UCI Machine Learning Repository: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms have been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for the data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22%, Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
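A minimal sketch of a genetic search over two kernel parameters may help make the idea concrete. Everything below is an assumption for illustration: the abstract does not give the encoding, operators or settings, and `toy_fitness` is a hypothetical stand-in for the cross-validated SVM accuracy that a real run would compute.

```python
import random


def toy_fitness(c, gamma):
    """Hypothetical stand-in for cross-validated accuracy.

    Peaks at c = 10, gamma = 0.1; a real study would train an SVM here.
    """
    return 1.0 / (1.0 + (c - 10.0) ** 2 / 100.0 + (gamma - 0.1) ** 2 * 100.0)


def genetic_search(fitness, bounds, pop_size=20, generations=30, seed=0):
    """Search `bounds` (one (low, high) pair per parameter) with a simple GA."""
    rng = random.Random(seed)

    def clip(value, low, high):
        return max(low, min(high, value))

    population = [
        tuple(rng.uniform(low, high) for low, high in bounds)
        for _ in range(pop_size)
    ]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda ind: fitness(*ind), reverse=True)
        history.append(fitness(*ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0]]             # elitism: best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation scaled to range.
            child = tuple(
                clip(rng.choice(pair) + rng.gauss(0.0, 0.05 * (high - low)),
                     low, high)
                for pair, (low, high) in zip(zip(a, b), bounds)
            )
            children.append(child)
        population = children
    best = max(population, key=lambda ind: fitness(*ind))
    return best, history


BOUNDS = [(0.1, 100.0), (0.001, 1.0)]  # assumed ranges for C and gamma
best, history = genetic_search(toy_fitness, BOUNDS)
print(best, toy_fitness(*best))
```

Because every fitness call in a real setting means training and validating an SVM, the cost grows quickly with data size, which is consistent with the practicality caveat at the end of the abstract.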
Accurate crop classification using hierarchical genetic fuzzy rule-based systems
NASA Astrophysics Data System (ADS)
Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.
2014-10-01
This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied to a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machine (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.
NASA Technical Reports Server (NTRS)
Spruce, J. P.; Smoot, James; Ellis, Jean; Hilbert, Kent; Swann, Roberta
2012-01-01
This paper discusses the development and implementation of a geospatial data processing method and multi-decadal Landsat time series for computing general coastal U.S. land-use and land-cover (LULC) classifications and change products consisting of seven classes (water, barren, upland herbaceous, non-woody wetland, woody upland, woody wetland, and urban). Use of this approach extends the observational period of the NOAA-generated Coastal Change and Analysis Program (C-CAP) products by almost two decades, assuming the availability of one cloud free Landsat scene from any season for each targeted year. The Mobile Bay region in Alabama was used as a study area to develop, demonstrate, and validate the method that was applied to derive LULC products for nine dates at approximate five year intervals across a 34-year time span, using single dates of data for each classification in which forests were either leaf-on, leaf-off, or mixed senescent conditions. Classifications were computed and refined using decision rules in conjunction with unsupervised classification of Landsat data and C-CAP value-added products. Each classification's overall accuracy was assessed by comparing stratified random locations to available reference data, including higher spatial resolution satellite and aerial imagery, field survey data, and raw Landsat RGBs. Overall classification accuracies ranged from 83 to 91% with overall Kappa statistics ranging from 0.78 to 0.89. The accuracies are comparable to those from similar, generalized LULC products derived from C-CAP data. The Landsat MSS-based LULC product accuracies are similar to those from Landsat TM or ETM+ data. Accurate classifications were computed for all nine dates, yielding effective results regardless of season. This classification method yielded products that were used to compute LULC change products via additive GIS overlay techniques.
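Overall accuracy and the Kappa statistic reported above are both derived from a confusion (error) matrix. The sketch below shows the standard computation on a small hypothetical two-class matrix; the counts are invented for illustration and are not taken from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples mapped as class i whose
    reference class is j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts: 85 of 100 samples fall on the diagonal.
example = [[45, 5],
           [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example)
print(accuracy, kappa)
```

Kappa is lower than overall accuracy because it discounts the agreement that the class marginals alone would produce by chance.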
Cervical cancer survival prediction using hybrid of SMOTE, CART and smooth support vector machine
NASA Astrophysics Data System (ADS)
Purnami, S. W.; Khasanah, P. M.; Sumartini, S. H.; Chosuvivatwong, V.; Sriplung, H.
2016-04-01
According to the WHO, one patient dies from cervical cancer every two minutes. The high mortality rate is due to a lack of awareness among women of early detection. Several factors are thought to influence the survival of cervical cancer patients, including age, anemia status, stage, type of treatment, complications and secondary disease. This study aims to classify/predict cervical cancer survival based on those factors. Various classification methods were used: classification and regression tree (CART), smooth support vector machine (SSVM), and three-order spline SSVM (TSSVM). Since the cervical cancer data are imbalanced, the synthetic minority oversampling technique (SMOTE) is used to handle the imbalance. The performance of these methods is evaluated using accuracy, sensitivity and specificity. The results show that balancing the data with SMOTE as a preprocessing step can improve classification performance. The SMOTE-SSVM method provided better results than SMOTE-TSSVM and SMOTE-CART.
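The core step of SMOTE, creating a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration, not the implementation used in the study; the function name `smote_like`, the value of `k` and the sample points are all assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic points from a list of minority-class tuples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class feature vectors.
minority_points = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.8)]
new_points = smote_like(minority_points, n_new=5)
print(new_points)
```

Because each synthetic point lies on a segment between two real minority points, the oversampled class stays inside the region the minority data already occupy.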
ERIC Educational Resources Information Center
Guiberson, Mark; Rodriguez, Barbara L.; Dale, Philip S.
2011-01-01
Purpose: The purpose of the current study was to examine the concurrent validity and classification accuracy of 3 parent report measures of language development in Spanish-speaking toddlers. Method: Forty-five Spanish-speaking parents and their 2-year-old children participated. Twenty-three children had expressive language delays (ELDs) as…
ERIC Educational Resources Information Center
Guiberson, Mark; Rodriguez, Barbara L.
2010-01-01
Purpose: To describe the concurrent validity and classification accuracy of 2 Spanish parent surveys of language development, the Spanish Ages and Stages Questionnaire (ASQ; Squires, Potter, & Bricker, 1999) and the Pilot Inventario-III (Pilot INV-III; Guiberson, 2008a). Method: Forty-eight Spanish-speaking parents of preschool-age children…
ERIC Educational Resources Information Center
Zytowski, Donald G.
1972-01-01
Owing to the uncertainty concerning the concurrent validity of the SVIB and the KOIS, a test of accuracy of classification of men in the occupations common to both inventories was undertaken. The results suggest that neither shows any less validity than had been shown in separate studies previously. (Author)
ERIC Educational Resources Information Center
Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.
2016-01-01
In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…
Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance
ERIC Educational Resources Information Center
Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James
2017-01-01
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…
Factors Affecting the Item Parameter Estimation and Classification Accuracy of the DINA Model
ERIC Educational Resources Information Center
de la Torre, Jimmy; Hong, Yuan; Deng, Weiling
2010-01-01
To better understand the statistical properties of the deterministic inputs, noisy "and" gate cognitive diagnosis (DINA) model, the impact of several factors on the quality of the item parameter estimates and classification accuracy was investigated. Results of the simulation study indicate that the fully Bayes approach is most accurate when the…
NASA Astrophysics Data System (ADS)
Hänsch, Ronny; Hellwich, Olaf
2018-04-01
Random Forests have continuously proven to be one of the most accurate, robust, and efficient methods for the supervised classification of images in general and polarimetric synthetic aperture radar data in particular. While the majority of previous work focuses on improving classification accuracy, we aim to accelerate the training of the classifier as well as its usage during prediction while maintaining its accuracy. Unlike other approaches, we mainly consider algorithmic changes in order to stay as independent as possible of platform and programming language. The final model achieves approximately 60 times faster training and 500 times faster prediction, while the accuracy is only marginally decreased by roughly 1 %.
Elhenawy, Mohammed; Jahangiri, Arash; Rakha, Hesham A; El-Shawarby, Ihab
2015-10-01
The ability to model driver stop/run behavior at signalized intersections considering the roadway surface condition is critical in the design of advanced driver assistance systems. Such systems can reduce intersection crashes and fatalities by predicting driver stop/run behavior. The research presented in this paper uses data collected from two controlled field experiments on the Smart Road at the Virginia Tech Transportation Institute (VTTI) to model driver stop/run behavior at the onset of a yellow indication for different roadway surface conditions. The paper offers two contributions. First, it introduces a new predictor related to driver aggressiveness and demonstrates that this measure enhances the modeling of driver stop/run behavior. Second, it applies well-known artificial intelligence techniques including: adaptive boosting (AdaBoost), random forest, and support vector machine (SVM) algorithms as well as traditional logistic regression techniques on the data in order to develop a model that can be used by traffic signal controllers to predict driver stop/run decisions in a connected vehicle environment. The research demonstrates that by adding the proposed driver aggressiveness predictor to the model, there is a statistically significant increase in the model accuracy. Moreover the false alarm rate is significantly reduced but this reduction is not statistically significant. The study demonstrates that, for the subject data, the SVM machine learning algorithm performs the best in terms of optimum classification accuracy and false positive rates. However, the SVM model produces the best performance in terms of the classification accuracy only. Copyright © 2015 Elsevier Ltd. All rights reserved.
How reliable and accurate is the AO/OTA comprehensive classification for adult long-bone fractures?
Meling, Terje; Harboe, Knut; Enoksen, Cathrine H; Aarflot, Morten; Arthursson, Astvaldur J; Søreide, Kjetil
2012-07-01
Reliable classification of fractures is important for treatment allocation and study comparisons. The overall accuracy of scoring applied to a general population of fractures is little known. This study aimed to investigate the accuracy and reliability of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for adult long-bone fractures and identify factors associated with poor coding agreement. Adults (>16 years) with long-bone fractures coded in a Fracture and Dislocation Registry at the Stavanger University Hospital during the fiscal year 2008 were included. An unblinded reference code dataset was generated for the overall accuracy assessment by two experienced orthopedic trauma surgeons. Blinded analysis of intrarater reliability was performed by rescoring and of interrater reliability by recoding of a randomly selected fracture sample. Proportion of agreement (PA) and kappa (κ) statistics are presented. Uni- and multivariate logistic regression analyses of factors predicting accuracy were performed. During the study period, 949 fractures were included and coded by 26 surgeons. For the intrarater analysis, overall agreements were κ = 0.67 (95% confidence interval [CI]: 0.64-0.70) and PA 69%. For interrater assessment, κ = 0.67 (95% CI: 0.62-0.72) and PA 69%. The accuracy of surgeons' blinded recoding was κ = 0.68 (95% CI: 0.65- 0.71) and PA 68%. Fracture type, frequency of the fracture, and segment fractured significantly influenced accuracy whereas the coder's experience did not. Both the reliability and accuracy of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for long-bone fractures ranged from substantial to excellent. Variations in coding accuracy seem to be related more to the fracture itself than the surgeon. Diagnostic study, level I.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. Learning-based classifiers achieve state-of-the-art accuracies, but have been criticized for computational complexity that grows linearly with the number of classes. Nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than that of the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves state-of-the-art accuracies with significantly lower computation cost and memory requirement.
Li, Yachun; Charalampaki, Patra; Liu, Yong; Yang, Guang-Zhong; Giannarou, Stamatia
2018-06-13
Probe-based confocal laser endomicroscopy (pCLE) enables in vivo, in situ tissue characterisation without changes in the surgical setting and simplifies the oncological surgical workflow. The potential of this technique in identifying residual cancer tissue and improving resection rates of brain tumours has been recently verified in pilot studies. The interpretation of endomicroscopic information is challenging, particularly for surgeons who do not themselves routinely review histopathology. Also, the diagnosis can be examiner-dependent, leading to considerable inter-observer variability. Therefore, automatic tissue characterisation with pCLE would support the surgeon in establishing diagnosis as well as guide robot-assisted intervention procedures. The aim of this work is to propose a deep learning-based framework for brain tissue characterisation for context aware diagnosis support in neurosurgical oncology. An efficient representation of the context information of pCLE data is presented by exploring state-of-the-art CNN models with different tuning configurations. A novel video classification framework based on the combination of convolutional layers with long-range temporal recursion has been proposed to estimate the probability of each tumour class. The video classification accuracy is compared for different network architectures and data representation and video segmentation methods. We demonstrate the application of the proposed deep learning framework to classify Glioblastoma and Meningioma brain tumours based on endomicroscopic data. Results show significant improvement of our proposed image classification framework over state-of-the-art feature-based methods. The use of video data further improves the classification performance, achieving accuracy equal to 99.49%. This work demonstrates that deep learning can provide an efficient representation of pCLE data and accurately classify Glioblastoma and Meningioma tumours. 
The performance evaluation analysis shows the potential clinical value of the technique.
NASA Astrophysics Data System (ADS)
Zhu, Jun; Chen, Lijun; Ma, Lantao; Li, Dejian; Jiang, Wei; Pan, Lihong; Shen, Huiting; Jia, Hongmin; Hsiang, Chingyun; Cheng, Guojie; Ling, Li; Chen, Shijie; Wang, Jun; Liao, Wenkui; Zhang, Gary
2014-04-01
Defect review is a time-consuming job, and human error makes the results inconsistent. Defects located in don't-care areas, such as dark areas, do not hurt yield and need not be reviewed. However, defects in critical areas, such as clear areas, can impact yield dramatically and need more attention during review. As integrated circuit dimensions decrease, the number of mask defects found during an inspection is always in the range of thousands or even more. Traditional manual or simple classification approaches are unable to meet efficiency and accuracy requirements. This paper focuses on an automatic defect management and classification solution using the image output of Lasertec inspection equipment and Anchor pattern centric image processing technology. This system can handle large numbers of defects with quick and accurate defect classification results. Our experiments include Die to Die and Single Die modes. The classification accuracy can reach 87.4% and 93.3%. No critical or printable defects are missing in our test cases. The missed classifications are 0.25% and 0.24% in Die to Die mode and Single Die mode, respectively. This miss rate is encouraging and acceptable for application on a production line. The results can be output and reloaded into the inspection machine for further review. This step helps users validate uncertain defects with clear, magnified images when the captured images do not provide enough information to make a judgment. This system effectively reduces expensive inline defect review time. As a fully inline automated defect management solution, the system could be compatible with the current inspection approach, be integrated with optical simulation and even a scoring function, and guide wafer-level defect inspection.
Satellite inventory of Minnesota forest resources
NASA Technical Reports Server (NTRS)
Bauer, Marvin E.; Burk, Thomas E.; Ek, Alan R.; Coppin, Pol R.; Lime, Stephen D.; Walsh, Terese A.; Walters, David K.; Befort, William; Heinzen, David F.
1993-01-01
The methods and results of using Landsat Thematic Mapper (TM) data to classify and estimate the acreage of forest covertypes in northeastern Minnesota are described. Portions of six TM scenes covering five counties with a total area of 14,679 square miles were classified into six forest and five nonforest classes. The approach involved the integration of cluster sampling, image processing, and estimation. Using cluster sampling, 343 plots, each 88 acres in size, were photo interpreted and field mapped as a source of reference data for classifier training and calibration of the TM data classifications. Classification accuracies of up to 75 percent were achieved; most misclassification was between similar or related classes. An inverse method of calibration, based on the error rates obtained from the classifications of the cluster plots, was used to adjust the classification class proportions for classification errors. The resulting area estimates for total forest land in the five-county area were within 3 percent of the estimate made independently by the USDA Forest Service. Area estimates for conifer and hardwood forest types were within 0.8 and 6.0 percent respectively, of the Forest Service estimates. A trial of a second method of estimating the same classes as the Forest Service resulted in standard errors of 0.002 to 0.015. A study of the use of multidate TM data for change detection showed that forest canopy depletion, canopy increment, and no change could be identified with greater than 90 percent accuracy. The project results have been the basis for the Minnesota Department of Natural Resources and the Forest Service to define and begin to implement an annual system of forest inventory which utilizes Landsat TM data to detect changes in forest cover.
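One common form of inverse calibration adjusts mapped class proportions using the probability of each true class given each mapped class, estimated from reference plots. The sketch below assumes that form; the abstract does not spell out the estimator, and the function name, plot counts and map proportions are hypothetical.

```python
def inverse_calibration(confusion, map_proportions):
    """Adjust mapped class proportions for classification error.

    confusion[j][i] is the number of reference plots mapped as class j
    whose true class is i. Each row is normalised to estimate
    P(true = i | mapped = j), then weighted by the mapped proportion of j.
    """
    n_classes = len(confusion)
    adjusted = [0.0] * n_classes
    for j, row in enumerate(confusion):
        row_total = sum(row)
        for i, count in enumerate(row):
            adjusted[i] += map_proportions[j] * count / row_total
    return adjusted


# Hypothetical two-class example (forest, nonforest).
plots = [[80, 20],   # mapped forest: 80 truly forest, 20 truly nonforest
         [10, 90]]   # mapped nonforest: 10 truly forest, 90 truly nonforest
mapped = [0.6, 0.4]  # share of the map assigned to each class
estimate = inverse_calibration(plots, mapped)
print(estimate)
```

In this toy case the forest share drops from the mapped 0.60 to an adjusted 0.52, because more mapped-forest plots turn out to be nonforest than the reverse once weighted by area.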
NASA Astrophysics Data System (ADS)
Tao, C.-S.; Chen, S.-W.; Li, Y.-Z.; Xiao, S.-P.
2017-09-01
Land cover classification is an important application for polarimetric synthetic aperture radar (PolSAR) data utilization. Roll-invariant polarimetric features such as H / Ani / α / Span are commonly adopted in PolSAR land cover classification. However, the target orientation diversity effect makes PolSAR image understanding and interpretation difficult. Using only the roll-invariant polarimetric features may introduce ambiguity in the interpretation of targets' scattering mechanisms and limit the subsequent classification accuracy. To address this problem, this work first focuses on hidden polarimetric feature mining in the rotation domain along the radar line of sight using the recently reported uniform polarimetric matrix rotation theory and the visualization and characterization tool of polarimetric coherence pattern. The former rotates the acquired polarimetric matrix along the radar line of sight and fully describes the rotation characteristics of each entry of the matrix. Sets of new polarimetric features are derived to describe the hidden scattering information of the target in the rotation domain. The latter extends the traditional polarimetric coherence at a given rotation angle to the rotation domain for complete interpretation. A visualization and characterization tool is established to derive new polarimetric features for hidden information exploration. Then, a classification scheme is developed combining both the selected new hidden polarimetric features in the rotation domain and the commonly used roll-invariant polarimetric features with a support vector machine (SVM) classifier. Comparison experiments based on AIRSAR and multi-temporal UAVSAR data demonstrate that, compared with the conventional classification scheme which only uses the roll-invariant polarimetric features, the proposed classification scheme achieves both higher classification accuracy and better robustness. 
For AIRSAR data, the overall classification accuracy with the proposed classification scheme is 94.91 %, while that with the conventional classification scheme is 93.70 %. Moreover, for multi-temporal UAVSAR data, the averaged overall classification accuracy with the proposed classification scheme is up to 97.08 %, which is much higher than the 87.79 % from the conventional classification scheme. Furthermore, for multitemporal PolSAR data, the proposed classification scheme can achieve better robustness. The comparison studies also clearly demonstrate that mining and utilization of hidden polarimetric features and information in the rotation domain can gain the added benefits for PolSAR land cover classification and provide a new vision for PolSAR image interpretation and application.
Kulkarni, Shruti R; Rajendran, Bipin
2018-07-01
We demonstrate supervised learning in Spiking Neural Networks (SNNs) for the problem of handwritten digit recognition using the spike triggered Normalized Approximate Descent (NormAD) algorithm. Our network that employs neurons operating at sparse biological spike rates below 300Hz achieves a classification accuracy of 98.17% on the MNIST test database with four times fewer parameters compared to the state-of-the-art. We present several insights from extensive numerical experiments regarding optimization of learning parameters and network configuration to improve its accuracy. We also describe a number of strategies to optimize the SNN for implementation in memory and energy constrained hardware, including approximations in computing the neuronal dynamics and reduced precision in storing the synaptic weights. Experiments reveal that even with 3-bit synaptic weights, the classification accuracy of the designed SNN does not degrade beyond 1% as compared to the floating-point baseline. Further, the proposed SNN, which is trained based on the precise spike timing information outperforms an equivalent non-spiking artificial neural network (ANN) trained using back propagation, especially at low bit precision. Thus, our study shows the potential for realizing efficient neuromorphic systems that use spike based information encoding and learning for real-world applications. Copyright © 2018 Elsevier Ltd. All rights reserved.
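Storing synaptic weights at reduced precision can be illustrated with uniform quantization to a fixed number of levels. The abstract does not say which quantization scheme was used, so the uniform min-max scheme, the function name and the sample weights below are assumptions for illustration only.

```python
def quantize_uniform(weights, bits=3):
    """Snap each weight to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    w_min, w_max = min(weights), max(weights)
    if w_max == w_min:
        return list(weights)  # a single level already represents everything
    step = (w_max - w_min) / (levels - 1)
    return [w_min + round((w - w_min) / step) * step for w in weights]


# Hypothetical synaptic weights.
raw_weights = [-1.0, -0.37, 0.0, 0.42, 0.9, 1.0]
q_weights = quantize_uniform(raw_weights, bits=3)
worst_error = max(abs(q - w) for q, w in zip(q_weights, raw_weights))
print(q_weights, worst_error)
```

With 3 bits there are only eight representable values, and the rounding error per weight is bounded by half the spacing between levels.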
Hu, Shan; Xu, Chao; Guan, Weiqiao; Tang, Yong; Liu, Yana
2014-01-01
Osteosarcoma is the most common malignant bone tumor among children and adolescents. In this study, image texture analysis was performed to extract texture features from bone CR images to evaluate the recognition rate of osteosarcoma. To obtain the optimal set of features, Sym4 and Db4 wavelet transforms and gray-level co-occurrence matrices (GLCM) were applied to the image, with statistical methods being used to optimize feature selection. To evaluate the performance of these methods, a support vector machine algorithm was used. The experimental results demonstrated that the Sym4 wavelet had a higher classification accuracy (93.44%) than the Db4 wavelet with respect to osteosarcoma occurrence in the epiphysis, whereas the Db4 wavelet had a higher classification accuracy (96.25%) for osteosarcoma occurrence in the diaphysis. Results including accuracy, sensitivity, specificity and ROC curves obtained using the wavelets were all higher than those obtained using the features derived from the GLCM method. It is concluded that a set of texture features can be extracted from the wavelets and used in computer-aided osteosarcoma diagnosis systems. In addition, this study confirms that multi-resolution analysis is a useful tool for texture feature extraction during bone CR image processing.
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. 
Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
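The interpolation idea described above can be sketched as a kernel-weighted average of the correct/incorrect outcomes of nearby test samples. This is only an illustration of the Gaussian variant under assumed details: the bandwidth value, the sample coordinates, and the name `predict_accuracy` are hypothetical, and the coordinates may be read as either spatial or spectral feature values.

```python
import math


def predict_accuracy(pixel, samples, bandwidth=1.0):
    """Gaussian-weighted mean of 0/1 correctness values from test samples.

    pixel   -- coordinate tuple (spatial or spectral domain)
    samples -- list of (coordinate tuple, 1 if classified correctly else 0)
    """
    numerator = 0.0
    denominator = 0.0
    for coords, correct in samples:
        dist_sq = sum((a - b) ** 2 for a, b in zip(pixel, coords))
        weight = math.exp(-dist_sq / (2.0 * bandwidth ** 2))
        numerator += weight * correct
        denominator += weight
    return numerator / denominator if denominator > 0 else float("nan")


# Hypothetical test sample: one correct pixel, one misclassified pixel.
samples = [((0.0, 0.0), 1), ((10.0, 10.0), 0)]
near_correct = predict_accuracy((0.0, 0.0), samples)
midway = predict_accuracy((5.0, 5.0), samples)
near_error = predict_accuracy((10.0, 10.0), samples)
```

Interpolating each class separately, as the study recommends, would amount to calling such a function with only the test samples of the pixel's mapped class.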
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech-based data in the classification of Parkinson disease (PD) has been shown in recent years to provide an effective, non-invasive mode of classification. Thus, there has been increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classification is reducing noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effective, the ability to invoke instance selection has seldom been examined. In this study, a PD classification algorithm that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm was proposed and examined. First, the MENN algorithm is applied to select optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to generate trained samples from the collected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. The proposed method was examined using a recently deposited public dataset and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the highest degree of improved classification accuracy (29.44%) compared with the other algorithm that was examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method could improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
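The instance-selection step can be illustrated with a much simpler relative of MENN: repeated nearest-neighbor editing, in which any training sample whose label disagrees with the majority of its k nearest neighbors is dropped until the set is stable. This sketch is not the authors' MENN procedure (which the abstract does not detail); the function names, k = 3, and the one-dimensional toy data are assumptions.

```python
from collections import Counter


def knn_label(x, data, k=3):
    """Majority label among the k samples in data closest to x."""
    nearest = sorted(
        data, key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s[0]))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def repeated_edit(data, k=3):
    """Drop samples that disagree with their neighbors; repeat until stable."""
    data = list(data)
    while len(data) > k:
        kept = [s for i, s in enumerate(data)
                if knn_label(s[0], data[:i] + data[i + 1:], k) == s[1]]
        if len(kept) == len(data):
            break
        data = kept
    return data


# Hypothetical 1-D features; ((0.12,), "b") plays the role of a noisy sample.
training = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"), ((0.12,), "b"),
            ((1.0,), "b"), ((1.1,), "b"), ((1.2,), "b")]
cleaned = repeated_edit(training)
```

The cleaned set would then be passed to whatever ensemble learner is being trained.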
2014-01-01
Background Behavioral interventions such as psychotherapy are leading, evidence-based practices for a variety of problems (e.g., substance abuse), but the evaluation of provider fidelity to behavioral interventions is limited by the need for human judgment. The current study evaluated the accuracy of statistical text classification in replicating human-based judgments of provider fidelity in one specific psychotherapy—motivational interviewing (MI). Method Participants (n = 148) came from five previously conducted randomized trials and were either primary care patients at a safety-net hospital or university students. To be eligible for the original studies, participants met criteria for either problematic drug or alcohol use. All participants received a type of brief motivational interview, an evidence-based intervention for alcohol and substance use disorders. The Motivational Interviewing Skills Code is a standard measure of MI provider fidelity based on human ratings that was used to evaluate all therapy sessions. A text classification approach called a labeled topic model was used to learn associations between human-based fidelity ratings and MI session transcripts. It was then used to generate codes for new sessions. The primary comparison was the accuracy of model-based codes with human-based codes. Results Receiver operating characteristic (ROC) analyses of model-based codes showed reasonably strong sensitivity and specificity with those from human raters (range of area under ROC curve (AUC) scores: 0.62 – 0.81; average AUC: 0.72). Agreement with human raters was evaluated based on talk turns as well as code tallies for an entire session. Generated codes had higher reliability with human codes for session tallies and also varied strongly by individual code. Conclusion To scale up the evaluation of behavioral interventions, technological solutions will be required. 
The current study demonstrated preliminary, encouraging findings regarding the utility of statistical text classification in bridging this methodological gap. PMID:24758152
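The AUC values quoted above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the scores are invented for illustration and the function name `auc` is hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for talk turns with and without a given code.
with_code = [0.9, 0.8, 0.4]
without_code = [0.7, 0.3, 0.2]
area = auc(with_code, without_code)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported range of 0.62 to 0.81 in context.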
Franson, J.C.; Hohman, W.L.; Moore, J.L.; Smith, M.R.
1996-01-01
We used 363 blood samples collected from wild canvasback ducks (Aythya valisineria) at Catahoula Lake, Louisiana, U.S.A. to evaluate the effect of sample storage time on the efficacy of erythrocytic protoporphyrin as an indicator of lead exposure. The protoporphyrin concentration of each sample was determined by hematofluorometry within 5 min of blood collection and after refrigeration at 4 °C for 24 and 48 h. All samples were analyzed for lead by atomic absorption spectrophotometry. Based on a blood lead concentration of ≥0.2 ppm wet weight as positive evidence for lead exposure, the protoporphyrin technique resulted in overall error rates of 29%, 20%, and 19% and false negative error rates of 47%, 29% and 25% when hematofluorometric determinations were made on blood at 5 min, 24 h, and 48 h, respectively. False positive error rates were less than 10% for all three measurement times. The accuracy of the 24-h erythrocytic protoporphyrin classification of blood samples as positive or negative for lead exposure was significantly greater than the 5-min classification, but no improvement in accuracy was gained when samples were tested at 48 h. The false negative errors were probably due, at least in part, to the lag time between lead exposure and the increase of blood protoporphyrin concentrations. False negatives resulted in an underestimation of the true number of canvasbacks exposed to lead, indicating that hematofluorometry provides a conservative estimate of lead exposure.
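To make the three error rates concrete, the sketch below computes them from the four cells of a 2 x 2 table of screening result against reference result. The definitions used (false negatives as a share of truly exposed birds, false positives as a share of unexposed birds) are the conventional ones and are assumed here, since the abstract does not spell them out; the counts are invented and are not the study's data.

```python
def error_rates(tp, fp, tn, fn):
    """Overall, false negative, and false positive error rates.

    tp/fn -- exposed birds flagged / missed by the screening test
    tn/fp -- unexposed birds cleared / wrongly flagged
    """
    total = tp + fp + tn + fn
    return {
        "overall_error": (fp + fn) / total,
        "false_negative_rate": fn / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts, for illustration only.
rates = error_rates(tp=60, fp=10, tn=90, fn=40)
```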
Simulation of seagrass bed mapping by satellite images based on the radiative transfer model
NASA Astrophysics Data System (ADS)
Sagawa, Tatsuyuki; Komatsu, Teruhisa
2015-06-01
Seagrass and seaweed beds play important roles in coastal marine ecosystems. They are food sources and habitats for many marine organisms, and influence the physical, chemical, and biological environment. They are sensitive to human impacts such as reclamation and pollution. Therefore, their management and preservation are necessary for a healthy coastal environment. Satellite remote sensing is a useful tool for mapping and monitoring seagrass beds. The efficiency of seagrass mapping, seagrass bed classification in particular, has been evaluated by mapping accuracy using an error matrix. However, mapping accuracies are influenced by coastal environments such as seawater transparency, bathymetry, and substrate type. Coastal management requires sufficient accuracy and an understanding of mapping limitations for monitoring coastal habitats including seagrass beds. Previous studies are mainly based on case studies in specific regions and seasons. Extensive data are required to generalise assessments of classification accuracy from case studies, which has proven difficult. This study aims to build a simulator based on a radiative transfer model to produce modelled satellite images and assess the visual detectability of seagrass beds under different transparencies and seagrass coverages, as well as to examine mapping limitations and classification accuracy. Our simulations led to the development of a model of water transparency and the mapping of depth limits and indicated the possibility for seagrass density mapping under certain ideal conditions. The results show that modelling satellite images is useful in evaluating the accuracy of classification and that establishing seagrass bed monitoring by remote sensing is a reliable tool.
Corcoran, Jennifer M.; Knight, Joseph F.; Gallant, Alisa L.
2013-01-01
Wetland mapping at the landscape scale using remotely sensed data requires both affordable data and an efficient, accurate classification method. Random forest classification offers several advantages over traditional land cover classification techniques, including a bootstrapping technique to generate robust estimations of outliers in the training data, as well as the capability of measuring classification confidence. Though the random forest classifier can generate complex decision trees with a multitude of input data and still not run a high risk of overfitting, there is a great need to reduce computational and operational costs by including only key input data sets without sacrificing a significant level of accuracy. Our main questions for this study site in Northern Minnesota were: (1) how do classification accuracy and confidence of mapping wetlands compare using different remote sensing platforms and sets of input data; (2) what are the key input variables for accurate differentiation of upland, water, and wetlands, including wetland type; and (3) which datasets and seasonal imagery yield the best accuracy for wetland classification. Our results show the key input variables include terrain (elevation and curvature) and soils descriptors (hydric), along with an assortment of remotely sensed data collected in the spring (satellite visible, near infrared, and thermal bands; satellite normalized vegetation index and Tasseled Cap greenness and wetness; and horizontal-horizontal (HH) and horizontal-vertical (HV) polarization using L-band satellite radar). We undertook this exploratory analysis to inform decisions by natural resource managers charged with monitoring wetland ecosystems and to aid in designing a system for consistent operational mapping of wetlands across landscapes similar to those found in Northern Minnesota.
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications, and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. Used together, these tools provide an efficient way to obtain information about improvements in classification accuracy and help in refining the training data set. We illustrate the framework by investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) resampling is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.
1982-01-01
An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.
NASA Technical Reports Server (NTRS)
Lillesand, T. M.; Werth, L. F. (Principal Investigator)
1980-01-01
A 25% improvement in average classification accuracy was realized by processing double-date vs. single-date data. Under the spectrally and spatially complex site conditions characterizing the geographical area used, further improvement in wetland classification accuracy is apparently precluded by the spectral and spatial resolution restrictions of the LANDSAT MSS. Full scene analysis of scanning densitometer data extracted from scale infrared photography failed to permit discrimination of many wetland and nonwetland cover types. When classification of photographic data was limited to wetland areas only, much more detailed and accurate classification could be made. The integration of conventional image interpretation (to simply delineate wetland boundaries) and machine assisted classification (to discriminate among cover types present within the wetland areas) appears to warrant further research to study the feasibility and cost of extending this methodology over a large area using LANDSAT and/or small scale photography.
The dynamics of human-induced land cover change in miombo ecosystems of southern Africa
NASA Astrophysics Data System (ADS)
Jaiteh, Malanding Sambou
Understanding human-induced land cover change in the miombo requires consistent, geographically referenced data on temporal land cover characteristics as well as on the biophysical and socioeconomic drivers of land use, the major cause of land cover change. The overall goal of this research is to examine the applications of high-resolution satellite remote sensing data in studying the dynamics of human-induced land cover change in the miombo. Specific objectives are to: (1) evaluate the applications of computer-assisted classification of Landsat Thematic Mapper (TM) data for land cover mapping in the miombo and (2) analyze spatial and temporal patterns of landscape change locations in the miombo. A Stepwise Thematic Classification (STC) procedure, a hybrid supervised-unsupervised classification, was developed and tested using Landsat TM data. Classification accuracy results were compared to those from supervised and unsupervised classification. The STC provided the highest classification accuracy, i.e., 83.9% correspondence between classified and reference data, compared to 44.2% and 34.5% for unsupervised and supervised classification, respectively. Improvements in the classification process can be attributed to thematic stratification of the image data into spectrally homogeneous (thematic) groups and step-by-step classification of the groups using supervised or unsupervised classification techniques. Supervised classification failed to classify 18% of the scene, evidence that the training data used did not adequately represent all of the variability in the data. Application of the procedure in drier miombo produced an overall classification accuracy of 63%. This is much lower than that of wetter miombo. The results clearly demonstrate that digital classification of Landsat TM data can be successfully implemented in the miombo without intensive fieldwork.
Spatial characteristics of land cover change in agricultural and forested landscapes in central Malawi were analyzed for the period 1984 to 1995 using spatial pattern analysis methods. Shifting cultivation areas (agriculture in forested landscape) experienced the highest rate of woodland cover fragmentation, with the mean patch size of closed woodland cover decreasing from 20 ha to 7.5 ha. Permanent bare areas (cropland and settlement) in intensive agricultural matrix landscapes increased 52%, largely through the conversion of fallow areas. The protected National Park area remained fairly unchanged, although closed woodland area increased by 4%, mainly from regeneration of open woodland. This study provided evidence that changes in spatial characteristics in the miombo differ with landscape. Land use change (i.e. conversion to cropland) is the primary driving force behind changes in landscape spatial patterns. Also, results revealed that exclusion of intense human use (i.e. cultivation and woodcutting) through regulations and/or fencing increased both closed woodland area (through regeneration of open woodland) and overall connectivity in the landscape. Spatial characteristics of land cover change were analyzed at locations in Malawi (wetter miombo) and Zimbabwe (drier miombo). Results indicate land cover dynamics differ both between and within case study sites. In communal areas in the Kasungu scene, land cover change is dominated by woodland fragmentation to open vegetation. Change in private commercial lands was dominantly expansion of bare (settlement and cropland) areas, primarily at the expense of open vegetation (fallow land).
Coban, Huseyin Oguz; Koc, Ayhan; Eker, Mehmet
2010-01-01
Previous studies have successfully detected changes in gently sloping forested areas with low-diversity, homogeneous vegetation cover using medium-resolution satellite data such as Landsat. The aim of the present study is to examine the capacity of multi-temporal Landsat data to identify changes in forested areas with mixed vegetation that are generally located on steep slopes or non-uniform topography. Landsat Thematic Mapper (TM) and Landsat Enhanced Thematic Mapper Plus (ETM+) data for the years 1987-2000 were used to detect changes within a 19,500 ha forested area in the Western Black Sea region of Turkey. The data comply with the forest cover type maps previously created for forest management plans of the research area. The methods used to detect changes were post-classification comparison, image differencing, image ratioing and NDVI (Normalized Difference Vegetation Index) differencing. Following the supervised classification process, error matrices were used to evaluate the accuracy of the classified images. The overall accuracy was calculated as 87.59% for the 1987 image and 91.81% for the 2000 image. Overall kappa statistics were 0.8543 and 0.9038 for 1987 and 2000, respectively. The changes identified via the post-classification comparison method were compared with those from the other change detection methods. Maximum coherence was found to be 74.95% at the 4/3 band ratio. The NDVI difference and 3rd band difference methods achieved the same coherence with slight variations. The results suggest that Landsat satellite data accurately convey the temporal changes that occur in steeply sloping forested areas with a mixed structure, providing a limited amount of detail but with a high level of accuracy. Moreover, it was concluded that the post-classification comparison method can meet the needs of forestry activities better than other methods, as it provides information about the direction of these changes.
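Overall accuracy and the kappa statistic reported above are both derived from the error (confusion) matrix. The sketch below shows the standard calculation on an invented two-class matrix; the numbers are not from the study, and the function name is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square error matrix.

    matrix[i][j] = number of samples of reference class i mapped as class j.
    """
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical error matrix (forest vs. non-forest), for illustration only.
error_matrix = [[45, 5],
                [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(error_matrix)
```

Kappa is lower than overall accuracy because it discounts the agreement that would occur by chance given the class proportions.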
NASA Astrophysics Data System (ADS)
Cardille, J. A.; Lee, J.
2017-12-01
With the opening of the Landsat archive, there is a dramatically increased potential for creating high-quality time series of land use/land-cover (LULC) classifications derived from remote sensing. Although LULC time series are appealing, their creation is typically challenging in two fundamental ways. First, there is a need to create maximally correct LULC maps for consideration at each time step; and second, there is a need to have the elements of the time series be consistent with each other, without pixels that flip improbably between covers due only to unavoidable, stray classification errors. We have developed the Bayesian Updating of Land Cover - Unsupervised (BULC-U) algorithm to address these challenges simultaneously, and introduce and apply it here for two related but distinct purposes. First, with minimal human intervention, we produced an internally consistent, high-accuracy LULC time series in rapidly changing Mato Grosso, Brazil for a time interval (1986-2000) in which cropland area more than doubled. The spatial and temporal resolution of the 59 LULC snapshots allows users to witness the establishment of towns and farms at the expense of forest. The new time series could be used by policy-makers and analysts to unravel important considerations for conservation and management, including the timing and location of past development, the rate and nature of changes in forest connectivity, the connection with road infrastructure, and more. The second application of BULC-U is to sharpen the well-known GlobCover 2009 classification from 300m to 30m, while improving accuracy measures for every class. The greatly improved resolution and accuracy permits a better representation of the true LULC proportions, the use of this map in models, and quantification of the potential impacts of changes. 
Given that there may easily be thousands and potentially millions of images available to harvest for an LULC time series, it is imperative to build useful algorithms requiring minimal human intervention. Through image segmentation and classification, BULC-U allows us to use both the spectral and spatial characteristics of imagery to sharpen classifications and create time series. It is hoped that this study may allow us and other users of this new method to consider time series across ever larger areas.
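The Bayesian updating idea behind such a time series can be sketched per pixel: a belief over land cover classes is multiplied by the likelihood of each new observed label and renormalized, so a single stray classification shifts the belief without flipping it. This is a generic illustration of Bayes' rule and not the BULC-U implementation; the class names, likelihood table, and observation sequence are all hypothetical.

```python
def bayes_update(prior, observed, likelihood):
    """posterior[c] is proportional to prior[c] * P(observed label | true class c)."""
    unnormalized = {c: p * likelihood[c][observed] for c, p in prior.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical P(observed label | true class), e.g. from an error matrix.
likelihood = {
    "forest": {"forest": 0.8, "crop": 0.2},
    "crop": {"forest": 0.3, "crop": 0.7},
}

belief = {"forest": 0.5, "crop": 0.5}
history = []
for observed_label in ["forest", "crop", "forest", "forest"]:
    belief = bayes_update(belief, observed_label, likelihood)
    history.append(belief["forest"])
```

In this toy sequence the isolated "crop" observation lowers the forest probability for one step, and the following observations restore it.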
NASA Astrophysics Data System (ADS)
Susanti, Yuliana; Zukhronah, Etik; Pratiwi, Hasih; Respatiwulan; Sri Sulistijowati, H.
2017-11-01
To achieve food resilience in Indonesia, food diversification through exploring the potential of local food is required. Corn is one of the alternative staple foods of Javanese society. For that reason, corn production needs to be improved by considering the influencing factors. CHAID and CRT are data mining methods that can be used to classify the influencing variables. The present study seeks to obtain information on the potential local food availability of corn in regencies and cities on Java Island. CHAID analysis yields four classifications with an accuracy of 78.8%, while CRT analysis yields seven classifications with an accuracy of 79.6%.
Deep multi-scale convolutional neural network for hyperspectral image classification
NASA Astrophysics Data System (ADS)
Zhang, Feng-zhe; Yang, Xia
2018-04-01
In this paper, we propose a multi-scale convolutional neural network for the hyperspectral image classification task. First, in contrast to conventional convolution, we use multi-scale convolutions, which possess larger receptive fields, to extract spectral features of the hyperspectral image. We design a deep neural network with a multi-scale convolution layer that contains 3 different convolution kernel sizes. Second, to avoid overfitting of the deep neural network, dropout, which randomly deactivates neurons, is used and slightly improves the classification accuracy. In addition, recent deep learning techniques such as ReLU are used. We conducted experiments on the University of Pavia and Salinas datasets and obtained better classification accuracy than other methods.
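A multi-scale convolution layer of the kind described can be sketched as several one-dimensional filters of different lengths applied to the same pixel spectrum, each followed by ReLU. The kernel sizes (3, 5, 7), the averaging weights, and the toy spectrum below are assumptions for illustration; the abstract gives neither the sizes nor the learned weights.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 'same'-length 1-D correlation for an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]


def relu(values):
    """Rectified linear unit applied elementwise."""
    return [max(0.0, v) for v in values]


def multi_scale_features(spectrum, kernels):
    """One ReLU-activated feature map per kernel size."""
    return [relu(conv1d_same(spectrum, k)) for k in kernels]


# Hypothetical spectrum (8 bands) and averaging kernels of three sizes.
spectrum = [1.0] * 8
kernels = [[1.0 / size] * size for size in (3, 5, 7)]
features = multi_scale_features(spectrum, kernels)
```

Longer kernels combine more neighboring bands per output value, which is what gives the larger receptive field mentioned in the abstract.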
Rey, Sergio J.; Stephens, Philip A.; Laura, Jason R.
2017-01-01
Large data contexts present a number of challenges to optimal choropleth map classifiers. Application of optimal classifiers to a sample of the attribute space is one proposed solution. The properties of alternative sampling-based classification methods are examined through a series of Monte Carlo simulations. Spatial autocorrelation, the number of desired classes, and the form of sampling are shown to have significant impacts on the accuracy of map classifications. Tradeoffs between improved speed of the sampling approaches and loss of accuracy are also considered. The results suggest the possibility of guiding the choice of classification scheme as a function of the properties of large data sets.
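The sampling approach can be sketched as: draw a random sample of the attribute values, compute class breaks on the sample only, then apply those breaks to every observation. In this sketch quantile breaks stand in for an optimal classifier (such as Fisher-Jenks), purely to keep the example short; the sample size, seed, and data are hypothetical.

```python
import bisect
import random
import statistics


def sample_breaks(values, k, sample_size, seed=0):
    """Compute k-class quantile break points from a simple random sample."""
    sample = random.Random(seed).sample(values, sample_size)
    return statistics.quantiles(sample, n=k)  # k - 1 cut points


def classify(values, breaks):
    """Assign each value the index of the class interval it falls in."""
    return [bisect.bisect_right(breaks, v) for v in values]


# Hypothetical attribute values for 1000 map units.
values = [float(v) for v in range(1000)]
breaks = sample_breaks(values, k=5, sample_size=100)
classes = classify(values, breaks)
```

The speed gain comes from computing breaks on 100 values instead of 1000; the accuracy loss is the difference between these breaks and those computed on the full data.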
Silva, Luís; Vaz, João Rocha; Castro, Maria António; Serranho, Pedro; Cabri, Jan; Pezarat-Correia, Pedro
2015-08-01
The quantification of non-linear characteristics of electromyography (EMG) must contain information that allows discrimination of neuromuscular strategies during dynamic skills. There is a lack of studies about muscle coordination under motor constraints during dynamic contractions. In golf, both handicap (Hc) and low back pain (LBP) are the main factors associated with the occurrence of injuries. The aim of this study was to analyze the accuracy of support vector machines (SVM) in EMG-based classification to discriminate Hc (low and high handicap) and LBP (with and without LBP) in the main phases of the golf swing. For this purpose, recurrence quantification analysis (RQA) features of the trunk and the lower limb muscles were used to feed an SVM classifier. Recurrence rate (RR) and the ratio between determinism (DET) and RR showed a high discriminant power. The Hc accuracies for the swing, backswing, and downswing were 94.4±2.7%, 97.1±2.3%, and 95.3±2.6%, respectively. For LBP, the accuracy was 96.9±3.8% for the swing and 99.7±0.4% for the backswing. External oblique (EO), biceps femoris (BF), semitendinosus (ST) and rectus femoris (RF) showed high accuracy depending on the laterality within the phase. RQA features and SVM showed a high muscle discriminant capacity within swing phases by Hc and by LBP. Golfers with low back pain showed different neuromuscular coordination strategies when compared with asymptomatic golfers. Copyright © 2015 Elsevier Ltd. All rights reserved.
A new self-report inventory of dyslexia for students: criterion and construct validity.
Tamboer, Peter; Vorst, Harrie C M
2015-02-01
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. Altogether, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Fewer than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
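Positive and negative predictive values depend on how common the condition is in the population as well as on the classifier's sensitivity and specificity. The sketch below applies Bayes' rule to show that dependence; the sensitivity, specificity, and prevalence figures are invented for illustration and are not the study's estimates.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical screening characteristics, for illustration only.
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.10)
```

With a low prevalence, NPV tends to be high and PPV noticeably lower even for an accurate classifier, a pattern consistent with a PPV below the NPV as reported above.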
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.
Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...
Using New Models to Analyze Complex Regularities of the World: Commentary on Musso et al. (2013)
ERIC Educational Resources Information Center
Nokelainen, Petri; Silander, Tomi
2014-01-01
This commentary on the recent article by Musso et al. (2013) discusses issues related to model fitting, comparison of classification accuracy of generative and discriminative models, and two (or more) cultures of data modeling. We start by questioning the extremely high classification accuracy with empirical data from a complex domain. There is…
ERIC Educational Resources Information Center
Ball, Carrie R.; O'Connor, Edward
2016-01-01
This study examined the predictive validity and classification accuracy of two commonly used universal screening measures relative to a statewide achievement test. Results indicated that second-grade performance on oral reading fluency and the Measures of Academic Progress (MAP), together with special education status, explained 68% of the…
Subject-Adaptive Real-Time Sleep Stage Classification Based on Conditional Random Field
Luo, Gang; Min, Wanli
2007-01-01
Sleep staging is the pattern recognition task of classifying sleep recordings into sleep stages. This task is one of the most important steps in sleep analysis. It is crucial for the diagnosis and treatment of various sleep disorders, and also relates closely to brain-machine interfaces. We report an automatic, online sleep stager using electroencephalogram (EEG) signal based on a recently-developed statistical pattern recognition method, conditional random field, and novel potential functions that have explicit physical meanings. Using sleep recordings from human subjects, we show that the average classification accuracy of our sleep stager almost approaches the theoretical limit and is about 8% higher than that of existing systems. Moreover, for a new subject s_new with limited training data D_new, we perform subject adaptation to improve classification accuracy. Our idea is to use the knowledge learned from old subjects to obtain from D_new a regulated estimate of CRF’s parameters. Using sleep recordings from human subjects, we show that even without any D_new, our sleep stager can achieve an average classification accuracy of 70% on s_new. This accuracy increases with the size of D_new and eventually becomes close to the theoretical limit. PMID:18693884
NASA Astrophysics Data System (ADS)
Zhang, Zhiming; de Wulf, Robert R.; van Coillie, Frieke M. B.; Verbeke, Lieven P. C.; de Clercq, Eva M.; Ou, Xiaokun
2011-01-01
Mapping of vegetation using remote sensing in mountainous areas is considerably hampered by topographic effects on the spectral response pattern. A variety of topographic normalization techniques have been proposed to correct these illumination effects due to topography. The purpose of this study was to compare six different topographic normalization methods (Cosine correction, Minnaert correction, C-correction, Sun-canopy-sensor correction, two-stage topographic normalization, and slope matching technique) for their effectiveness in enhancing vegetation classification in mountainous environments. Since most of the vegetation classes in the rugged terrain of the Lancang Watershed (China) did not feature a normal distribution, artificial neural networks (ANNs) were employed as a classifier. Comparing the ANN classifications, none of the topographic correction methods could significantly improve the overall accuracy of ETM+ image classification. Nevertheless, at the class level, the accuracy of pine forest could be increased by using topographically corrected images. In contrast, oak forest and mixed forest accuracies were significantly decreased by using corrected images. The results also showed that none of the topographic normalization strategies could satisfactorily correct for the topographic effects in severely shadowed areas.
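As an illustration of two of the methods named above, the following is a minimal sketch of the cosine correction and the C-correction in their commonly published textbook forms. The abstract does not give the authors' implementation, so the function names, argument names, and the sample value of `c` are assumptions made for this example only.

```python
import math


def cos_incidence(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the local solar incidence angle on a tilted surface."""
    s = math.radians(slope_deg)
    z = math.radians(sun_zenith_deg)
    d = math.radians(sun_azimuth_deg - aspect_deg)
    return math.cos(s) * math.cos(z) + math.sin(s) * math.sin(z) * math.cos(d)


def cosine_correction(radiance, cos_i, sun_zenith_deg):
    """Cosine correction: scale by cos(solar zenith) / cos(incidence)."""
    return radiance * math.cos(math.radians(sun_zenith_deg)) / cos_i


def c_correction(radiance, cos_i, sun_zenith_deg, c):
    """C-correction: same ratio, with an additive constant c in both terms.

    c is usually taken as intercept / slope of a linear regression of
    radiance on cos_i; here it is simply passed in (assumed value).
    """
    cz = math.cos(math.radians(sun_zenith_deg))
    return radiance * (cz + c) / (cos_i + c)


# Hypothetical pixel: 30 degree slope facing the sun, sun at 60 degree zenith.
ci = cos_incidence(slope_deg=30, aspect_deg=135,
                   sun_zenith_deg=60, sun_azimuth_deg=135)
flat_equivalent = cosine_correction(100.0, ci, sun_zenith_deg=60)
damped = c_correction(100.0, ci, sun_zenith_deg=60, c=0.4)
```

In the cosine form the divisor `cos_i` approaches zero on weakly lit slopes, so the correction factor grows very large there; the constant `c` in the C-correction damps that effect.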
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, microarray technology has taken on an important role in the diagnosis of cancer. Using data mining techniques, microarray data classification can improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by a small number of samples but a very high dimension. Researchers therefore face the challenge of classifying microarray data with high performance in both accuracy and running time. This research proposes Principal Component Analysis (PCA) as a dimension reduction method together with a Support Vector Machine (SVM) optimized by kernel functions as the classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross validation, and was then evaluated and analyzed in terms of both accuracy and running time. The results showed that the scheme can obtain 100% accuracy for the Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.
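The 5-fold cross validation protocol mentioned above can be sketched as follows. This is only an outline of the evaluation loop under stated assumptions: a simple nearest-centroid classifier stands in for the PCA plus SVM scheme of the abstract, the toy data are invented, and all function names are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nearest_centroid_fit(X, y):
    """Stand-in classifier (not an SVM): one mean vector per class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Return the label whose centroid is closest in squared distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))


def cross_val_accuracy(X, y, k=5):
    """Mean held-out accuracy over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Fit on training rows only; a dimension reduction step such as
        # PCA would also be fit here, inside the loop.
        model = nearest_centroid_fit([X[i] for i in train_idx],
                                     [y[i] for i in train_idx])
        hits = sum(nearest_centroid_predict(model, X[i]) == y[i]
                   for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Invented toy data: two well-separated classes, interleaved row by row.
X = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0],
     [11, 10], [1, 1], [11, 11], [0.5, 0.5], [10.5, 10.5]]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
mean_accuracy = cross_val_accuracy(X, y, k=5)
```

Each sample is held out exactly once, and the reported figure is the mean of the per-fold accuracies. The folds here are contiguous, so real data would normally be shuffled or stratified first.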