Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens, hyperspectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such hyperspectral data poses a challenge to current techniques for analyzing the data. Conventional classification methods may not be useful without dimension reduction pre-processing, so dimension reduction has become a significant part of hyperspectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using wavelet decomposition can be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture polynomial trends, while the Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets is compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction yields more separable classes and better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, the classification performance of Daubechies wavelets is found to be better than that of the Haar wavelet, although Daubechies wavelets take more time than the Haar wavelet. The experimental results demonstrate that the classification system consistently provides over 84% classification accuracy.
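As an illustration of the reduction step described above, the following sketch applies a discrete wavelet transform to each pixel's spectral signature and keeps only the coarse approximation coefficients before classification. It uses the PyWavelets and scikit-learn libraries on randomly generated stand-in spectra; the band count, decomposition level, and SVM classifier are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: per-pixel spectral dimensionality reduction with wavelets,
# then classification. Band count, level, and classifier are illustrative.
import numpy as np
import pywt
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_pixels, n_bands = 500, 200              # hypothetical flattened hyperspectral cube
X = rng.normal(size=(n_pixels, n_bands))  # stand-in spectral signatures
y = rng.integers(0, 3, size=n_pixels)     # stand-in class labels

def reduce_spectra(spectra, wavelet, level=3):
    """Keep only the coarse approximation coefficients of each spectrum."""
    return np.array([pywt.wavedec(s, wavelet, level=level)[0] for s in spectra])

for wavelet in ("haar", "db4"):           # Haar vs. a Daubechies wavelet
    Xr = reduce_spectra(X, wavelet)
    Xtr, Xte, ytr, yte = train_test_split(Xr, y, test_size=0.3, random_state=0)
    acc = accuracy_score(yte, SVC().fit(Xtr, ytr).predict(Xte))
    print(f"{wavelet}: {Xr.shape[1]} retained coefficients, accuracy {acc:.2f}")
```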
Hosseinpour-Feizi, Hojjat; Soleimanpour, Jafar; Sales, Jafar Ganjpour; Arzroumchilar, Ali
2011-01-01
Purpose The aim of this study was to investigate the interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis, and to compare the results of surgery performed based on classification of the scoliosis according to each of these classification systems. Methods The study was conducted in Shohada Hospital in Tabriz, Iran, between 2009 and 2010. First, a reliability assessment was undertaken to assess interobserver agreement of the Lenke and King classifications for adolescent idiopathic scoliosis. Second, postoperative efficacy and safety of surgery performed based on the Lenke and King classifications were compared. Kappa coefficients of agreement were calculated to assess the agreement. Outcomes were compared using bivariate tests and repeated measures analysis of variance. Results A low to moderate interobserver agreement was observed for the King classification; the Lenke classification yielded mostly high agreement coefficients. The outcome of surgery was not found to be substantially different between the two systems. Conclusion Based on the results, the Lenke classification method seems advantageous. This takes into consideration the Lenke classification’s priority in providing details of curvatures in different anatomical surfaces to explain precise intensity of scoliosis, that it has higher interobserver agreement scores, and also that it leads to noninferior postoperative results compared with the King classification method. PMID:22267934
Vector quantizer designs for joint compression and terrain categorization of multispectral imagery
NASA Technical Reports Server (NTRS)
Gorman, John D.; Lyons, Daniel F.
1994-01-01
Two vector quantizer designs for compression of multispectral imagery and their impact on terrain categorization performance are evaluated. The mean-squared error (MSE) and classification performance of the two quantizers are compared, and it is shown that a simple two-stage design minimizing MSE subject to a constraint on classification performance has a significantly better classification performance than a standard MSE-based tree-structured vector quantizer followed by maximum likelihood classification. This improvement in classification performance is obtained with minimal loss in MSE performance. The results show that it is advantageous to tailor compression algorithm designs to the required data exploitation tasks. Applications of joint compression/classification include compression for the archival or transmission of Landsat imagery that is later used for land utility surveys and/or radiometric analysis.
Billings, M; Amin Hadavand, M; Alevizos, I
2018-03-01
The introduction of new classification criteria for Sjögren's syndrome, known as the 2016 American College of Rheumatology/European League against Rheumatism Classification Criteria (ACR-EULAR), created a need for the evaluation of its performance in an external cohort. The purpose of this study was to compare the performance of the 2016 ACR-EULAR classification set with the widely used American-European Consensus Group Classification criteria (AECG) in the cohort at the National Institutes of Health, USA, and to compare the performance of the sets in classifying both primary and secondary Sjögren's syndrome (pSS and sSS). The study cohort at the NIH (N = 1,303) was enrolled for clinical suspicion of SS. Participants were classified as SS, pSS, and sSS according to both classification sets. Performance of the 2016 ACR-EULAR and AECG sets was compared by holding each set as the gold standard for the other. Statistical analysis of test diagnostics and agreement between the two sets was undertaken. By the AECG set, 701 participants were classified as having SS (627 pSS, 74 sSS), and 714 were classified with SS (647 pSS, 67 sSS) by the 2016 ACR-EULAR set. Sensitivity and specificity of the two sets were comparable in classifying SS, pSS, and sSS. There was high agreement between the two sets for classifying SS (κ = 0.79), pSS (κ = 0.81), and sSS (κ = 0.87). The specificity of the 2016 ACR-EULAR set was significantly higher for classifying sSS than pSS, while the sensitivity was similar for the two disease groups. However, this pattern was also exhibited by the AECG set. There was high agreement between the two classification sets, with comparable performance diagnostics. There was no evidence of superior performance by the new 2016 ACR-EULAR set over the AECG set, and the two sets were found to be equivalent. Findings from our cohort indicate that the 2016 ACR-EULAR classification could be extended to classification of sSS. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
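For readers who want to reproduce this style of agreement analysis, the sketch below computes sensitivity, specificity, and Cohen's kappa between two binary classification sets, treating one as the reference. The labels are synthetic placeholders; the cohort data are not reproduced.

```python
# Sketch: agreement diagnostics between two binary classification sets,
# treating one as the reference. Labels here are synthetic placeholders.
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

rng = np.random.default_rng(1)
aecg = rng.integers(0, 2, size=1303)                   # reference classification set
acr_eular = np.where(rng.random(1303) < 0.9,           # second set, mostly agreeing
                     aecg, 1 - aecg)

tn, fp, fn, tp = confusion_matrix(aecg, acr_eular).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
kappa = cohen_kappa_score(aecg, acr_eular)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} kappa={kappa:.2f}")
```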
Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy
2014-01-01
Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods in classification problems in numerous studies. In the present study, the results of implementing a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained by combining these technologies for the development of classification models. Such a combination creates an opportunity to exploit the strengths of each algorithm and to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on the optimum arrays of features selected by the genetic algorithm. The performance of the proposed model was compared with thirteen well-known classification models on seven datasets. Furthermore, statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance of the proposed model was benchmarked against the best results reported for state-of-the-art classifiers in terms of classification accuracy on the same datasets. The findings of this comprehensive comparative study revealed that the performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive with the existing state-of-the-art classification models. PMID:25419659
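A minimal sketch of the first stage only (feature ranking with Fisher's discriminant ratio feeding a plain k-NN) is shown below; the genetic algorithm search and the modified k-NN and backpropagation networks used in the study are not reproduced, and the public dataset is a stand-in.

```python
# Sketch of the first stage only: rank features by Fisher's discriminant ratio
# and classify with k-NN on the top-ranked subset. The GA search and the
# modified k-NN / backpropagation steps of the paper are not reproduced here.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

def fisher_ratio(X, y):
    """(mu1 - mu0)^2 / (var1 + var0) per feature, for a two-class problem."""
    X0, X1 = X[y == 0], X[y == 1]
    return (X0.mean(0) - X1.mean(0)) ** 2 / (X0.var(0) + X1.var(0) + 1e-12)

ranking = np.argsort(fisher_ratio(X, y))[::-1]        # best features first
for k in (5, 10, X.shape[1]):
    acc = cross_val_score(KNeighborsClassifier(), X[:, ranking[:k]], y, cv=5).mean()
    print(f"top {k:2d} features: CV accuracy {acc:.3f}")
```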
Feature Selection for Chemical Sensor Arrays Using Mutual Information
Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.
2014-01-01
We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058
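The filter approach described above can be sketched with scikit-learn's mutual-information scorer: rank features by mutual information with the class label, keep the top k, and classify with an SVM. The dataset and the value of k are placeholders for the sensor-array data used in the study.

```python
# Sketch: filter-style feature selection by mutual information with the class
# label, then SVM classification on the selected features. Dataset and k are
# placeholders for the chemical sensor-array data used in the paper.
from sklearn.datasets import load_wine
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
pipe = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=5),   # keep the 5 most informative features
    SVC(kernel="rbf"),
)
print("CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean().round(3))
```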
Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.
2014-01-01
Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592
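A rough sketch of the comparison is given below: a single classification tree, bagging, a random forest, boosting, and logistic regression evaluated by cross-validated accuracy. A public toy dataset stands in for the clinical heart-failure data, and default hyperparameters are assumed.

```python
# Sketch comparing a single classification tree, bagging, random forest,
# boosting, and logistic regression by cross-validated accuracy; the clinical
# heart-failure data are replaced with a toy dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
models = {
    "classification tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=5000),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:20s} CV accuracy {acc:.3f}")
```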
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa
2018-07-01
Automatic text classification techniques are useful for classifying plaintext medical documents. This study aims to automatically predict the cause of death from free-text forensic autopsy reports by comparing various schemes for feature extraction, term weighting or feature value representation, text classification, and feature reduction. For the experiments, autopsy reports belonging to eight different causes of death were collected, preprocessed, and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. Six different text classification techniques were applied to these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures, i.e. overall accuracy, macro precision, macro F-measure, and macro recall. From the experiments, it was found that unigram features obtained the highest performance compared to bigram, trigram, and hybrid-gram features. Furthermore, among the feature representation schemes, term frequency and term frequency with inverse document frequency obtained similar and better results compared with binary frequency and normalized term frequency with inverse document frequency. In addition, the chi-square feature reduction approach outperformed the Pearson correlation and information gain approaches. Finally, among the text classification algorithms, the support vector machine classifier outperformed random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifiers. Our results and comparisons hold practical importance and serve as references for future work. Moreover, the reported comparisons can act as a state-of-the-art baseline against which future proposals for automated text classification can be compared. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
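One configuration from this comparison (word-unigram TF-IDF features, chi-square feature reduction, and a linear SVM scored with macro-averaged metrics) can be sketched as follows; a downloadable public news corpus stands in for the autopsy reports, and the number of selected features is an arbitrary assumption.

```python
# Sketch of one configuration from the comparison: word-unigram TF-IDF
# features, chi-square feature reduction, and a linear SVM, scored with
# macro-averaged metrics. Autopsy reports are replaced by a public corpus.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])
Xtr, Xte, ytr, yte = train_test_split(data.data, data.target, random_state=0)

pipe = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 1)),       # unigram features, TF-IDF weighting
    SelectKBest(chi2, k=2000),                 # chi-square feature reduction
    LinearSVC(),
)
pipe.fit(Xtr, ytr)
print(classification_report(yte, pipe.predict(Xte), digits=3))  # macro P/R/F1
```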
Labudda, Kirsten; von Rothkirch, Nadine; Pawlikowski, Mirko; Laier, Christian; Brand, Matthias
2010-06-01
To investigate whether patients with alcohol-related Korsakoff syndrome (KS) have emotion-specific or general deficits in multicategoric classification performance. Earlier studies have shown reduced performance in classifying stimuli according to their emotional valence in patients with KS. However, it is unclear whether such classification deficits are of an emotion-specific nature or whether they can also occur when nonemotional classifications are demanded. In this study, we examined 35 patients with alcoholic KS and 35 healthy participants with the Emotional Picture Task (EPT) to assess valence classification performance, the Semantic Classification Task (SCT) to assess nonemotional categorizations, and an extensive neuropsychologic test battery. KS patients exhibited lower classification performance in both tasks compared with the healthy participants. EPT and SCT performance were related to each other, and both correlated with general knowledge; EPT performance additionally correlated with executive functions. Our results indicate a common underlying mechanism for the patients' reductions in emotional and nonemotional classification performance. These deficits are most probably based on problems in retrieving object and category knowledge and, partially, on executive functioning.
ERIC Educational Resources Information Center
Fan, Xitao; Wang, Lin
The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper, the performance of seven feature selection methods, as well as two feature rank aggregation methods, was compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification, drawn from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while the Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was only average, although such methods are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not significantly outperform the conventional feature ranking methods. Among the conventional methods, some performed slightly better than others, although the choice of a suitable technique depends on the computational complexity and accuracy requirements of the user.
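A minimal sketch of Borda-count rank aggregation, one of the aggregation schemes mentioned above, is given below: each selector contributes a ranking of the 49 features, and features are re-ranked by their summed Borda points. The individual rankings here are random placeholders rather than outputs of the actual selectors.

```python
# Sketch of Borda-count rank aggregation: each selector contributes a ranking
# of the 49 features; features are re-ranked by their summed Borda points.
# The individual rankings below are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_selectors = 49, 7
# rankings[i] is a permutation of feature indices, best feature first
rankings = [rng.permutation(n_features) for _ in range(n_selectors)]

points = np.zeros(n_features)
for ranking in rankings:
    for position, feature in enumerate(ranking):
        points[feature] += n_features - position   # best rank earns the most points

aggregated = np.argsort(points)[::-1]              # final consensus ordering
print("top 5 aggregated features:", aggregated[:5])
```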
Comparing ecoregional classifications for natural areas management in the Klamath Region, USA
Sarr, Daniel A.; Duff, Andrew; Dinger, Eric C.; Shafer, Sarah L.; Wing, Michael; Seavy, Nathaniel E.; Alexander, John D.
2015-01-01
We compared three existing ecoregional classification schemes (Bailey, Omernik, and World Wildlife Fund) with two derived schemes (Omernik Revised and Climate Zones) to explore their effectiveness in explaining species distributions and to better understand natural resource geography in the Klamath Region, USA. We analyzed presence/absence data derived from digital distribution maps for trees, amphibians, large mammals, small mammals, migrant birds, and resident birds using three statistical analyses of classification accuracy (Analysis of Similarity, Canonical Analysis of Principal Coordinates, and Classification Strength). The classifications were roughly comparable in classification accuracy, with Omernik Revised showing the best overall performance. Trees showed the strongest fidelity to the classifications, and large mammals showed the weakest fidelity. We discuss the implications for regional biogeography and describe how intermediate resolution ecoregional classifications may be appropriate for use as natural areas management domains.
GPU based cloud system for high-performance arrhythmia detection with parallel k-NN algorithm.
Tae Joon Jun; Hyun Ji Park; Hyuk Yoo; Young-Hak Kim; Daeyoung Kim
2016-08-01
In this paper, we propose a GPU-based cloud system for high-performance arrhythmia detection. The Pan-Tompkins algorithm is used for QRS detection, and we optimized the beat classification algorithm with K-Nearest Neighbor (K-NN). To support high-performance beat classification on the system, we parallelized the beat classification algorithm with CUDA to execute it on virtualized GPU devices in the cloud system. The MIT-BIH Arrhythmia database is used for validation of the algorithm. The system achieved a detection rate of about 93.5%, which is comparable to previous studies, while our algorithm shows 2.5 times faster execution time compared to the CPU-only detection algorithm.
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but it is a time-consuming and tedious task for radiologists or clinical supervisors. The accuracy of tumor-stage detection and classification performed by radiologists depends on their experience alone, so computer-aided technology is very important for improving diagnostic accuracy. In this study, to improve the performance of tumor detection, we investigated a comparative approach across different segmentation techniques and selected the best one by comparing their segmentation scores. Further, to improve the classification accuracy, a genetic algorithm is employed for the automatic classification of tumor stage. The classification decision is supported by extracting relevant features and area calculation. The experimental results of the proposed technique are evaluated and validated for performance and quality analysis on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and Dice similarity index coefficient. The experimental results achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93, demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues from brain MR images. The experimental results also obtained an average Dice similarity index coefficient of 93.79%, which indicates better overlap between the automatically extracted tumor regions and the tumor regions manually extracted by radiologists.
NASA Astrophysics Data System (ADS)
Suiter, Ashley Elizabeth
Multi-spectral imagery provides a robust and low-cost dataset for assessing wetland extent and quality over broad regions and is frequently used for wetland inventories. However, in forested wetlands, hydrology is obscured by the tree canopy, making it difficult to detect with multi-spectral imagery alone. Because of this, classification of forested wetlands often includes greater errors than that of other wetland types. Elevation and terrain derivatives have been shown to be useful for modelling wetland hydrology, but few studies have addressed the use of LiDAR intensity data for detecting hydrology in forested wetlands. Due to the tendency of the LiDAR signal to be attenuated by water, this research proposed the fusion of LiDAR intensity data with LiDAR elevation, terrain data, and aerial imagery for the detection of forested wetland hydrology. We examined the utility of LiDAR intensity data and determined whether the fusion of LiDAR-derived data with multispectral imagery increased the accuracy of forested wetland classification compared with a classification performed with multi-spectral imagery only. Four classifications were performed: Classification A -- All Imagery, Classification B -- All LiDAR, Classification C -- LiDAR without Intensity, and Classification D -- Fusion of All Data. These classifications were performed using random forest, and each resulted in a 3-foot resolution thematic raster of forested upland and forested wetland locations in Vermilion County, Illinois. The accuracies of these classifications were compared using the Kappa Coefficient of Agreement. Importance statistics produced within the random forest classifier were evaluated in order to understand the contribution of individual datasets. Classification D, which used the fusion of LiDAR and multi-spectral imagery as input variables, had moderate to strong agreement between the reference data and the classification results. It was found that Classification B, performed using all the LiDAR data and its derivatives (intensity, elevation, slope, aspect, curvatures, and Topographic Wetness Index), was the most accurate classification, with a Kappa of 78.04%, indicating moderate to strong agreement. However, Classification C, performed with the LiDAR derivatives but without intensity data, had less agreement than would be expected by chance, indicating that intensity contributed significantly to the accuracy of Classification B.
A new approach to enhance the performance of decision tree for classifying gene expression data.
Hassan, Md; Kotagiri, Ramamohanarao
2013-12-20
Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. The decision tree is one of the popular machine learning approaches to address such classification problems. However, existing decision tree algorithms use a single gene feature at each node to split the data into child nodes and hence might suffer from poor performance, especially when classifying gene expression datasets. By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree, we use the area under the Receiver Operating Characteristic curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to other existing decision trees in the literature. We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.
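The core idea at a single node can be sketched as follows: several gene features are combined with a linear function (here obtained from LDA weights, an illustrative choice rather than the paper's method) and the resulting composite feature is scored with the area under the ROC curve. The full tree-building procedure is not reproduced, and a public dataset stands in for gene expression data.

```python
# Sketch of the idea at a single node: combine several "gene" features with a
# linear function and score the composite feature by AUC, compared against the
# single-feature alternatives. The full tree construction is not reproduced.
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

X, y = load_breast_cancer(return_X_y=True)
genes = [0, 7, 20]                                   # hypothetical gene subset

for g in genes:                                      # single-feature splits at this node
    print(f"gene {g}: AUC {roc_auc_score(y, X[:, g]):.3f}")

# derived composite feature: linear combination of the same genes
lda = LinearDiscriminantAnalysis().fit(X[:, genes], y)
composite = X[:, genes] @ lda.coef_.ravel()
print(f"composite:  AUC {roc_auc_score(y, composite):.3f}")
```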
Supervised DNA Barcodes species classification: analysis, comparisons and results
2014-01-01
Background Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcodes and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcodes: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work, the efficacy of supervised machine learning methods for classifying species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with ad-hoc and well-established DNA Barcode classification methods. Results A software tool that converts DNA Barcode FASTA sequences to the Weka format is released, to accommodate different input formats and to allow execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes on average outperform the other classifiers considered, although they do not provide a human-interpretable classification model. Rule-based methods have slightly inferior classification performance, but deliver the species-specific positions and nucleotide assignments. On synthetic data the supervised machine learning methods obtain superior classification performance with respect to the traditional DNA Barcode classification methods. On empirical data their classification performance is comparable to that of the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for successfully handling the DNA Barcode species classification problem, obtaining excellent performance. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
Singaporewalla, R M; Hwee, J; Lang, T U; Desai, V
2017-07-01
Clinico-pathological correlation of thyroid nodules is not routinely performed as until recently there was no objective classification system for reporting thyroid nodules on ultrasound. We compared the Thyroid Imaging Reporting and Data System (TIRADS) of classifying thyroid nodules on ultrasound with the findings on fine-needle aspiration cytology (FNAC) reported using the Bethesda System. A retrospective analysis of 100 consecutive cases over 1 year (Jan-Dec 2015) was performed comparing single-surgeon-performed bedside thyroid nodule ultrasound findings based on the TIRADS classification to the FNAC report based on the Bethesda Classification. TIRADS 1 (normal thyroid gland) and biopsy-proven malignancy referred by other clinicians were excluded. Benign-appearing nodules were reported as TIRADS 2 and 3. Indeterminate or suspected follicular lesions were reported as TIRADS 4, and malignant-appearing nodules were classified as TIRADS 5 during surgeon-performed bedside ultrasound. All the nodules were subjected to ultrasound-guided FNAC, and TIRADS findings were compared to Bethesda FNAC Classification. Of the 100 cases, 74 were considered benign or probably benign, 20 were suspicious for malignancy, and 6 were indeterminate on ultrasound. Overall concordance rate with FNAC was 83% with sensitivity and specificity of 70.6 and 90.4%, respectively. The negative predictive value was 93.8%. It is essential for clinicians performing bedside ultrasound thyroid and guided FNAC to document their sonographic impression of the nodule in an objective fashion using the TIRADS classification and correlate with the gold standard cytology to improve their learning curve and audit their results.
Prediction of customer behaviour analysis using classification algorithms
NASA Astrophysics Data System (ADS)
Raju, Siva Subramanian; Dhandayudam, Prabha
2018-04-01
Customer Relationship Management plays a crucial role in analyzing customer behavior patterns and their value to an enterprise. Customer data can be analyzed efficiently using various data mining techniques, with the goal of developing business strategies and enhancing the business. In this paper, three classification models (NB, J48, and MLPNN) are studied and evaluated for our experimental purpose. The performance of the three classifiers is compared using three different measures (accuracy, sensitivity, specificity), and the experimental results show that the J48 algorithm has better accuracy compared to the NB and MLPNN algorithms.
Best Merge Region Growing with Integrated Probabilistic Classification for Hyperspectral Imagery
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new method for spectral-spatial classification of hyperspectral images is proposed. The method is based on the integration of probabilistic classification within the hierarchical best merge region growing algorithm. For this purpose, a preliminary probabilistic support vector machines classification is performed. Then, a hierarchical step-wise optimization algorithm is applied by iteratively merging regions with the smallest Dissimilarity Criterion (DC). The main novelty of this method consists in defining a DC between regions as a function of region statistical and geometrical features along with classification probabilities. Experimental results are presented on a 200-band AVIRIS image of the Northwestern Indiana's vegetation area and compared with those obtained by recently proposed spectral-spatial classification techniques. The proposed method improves classification accuracies when compared to other classification approaches.
NASA Astrophysics Data System (ADS)
Zagouras, Athanassios; Argiriou, Athanassios A.; Flocas, Helena A.; Economou, George; Fotopoulos, Spiros
2012-11-01
Classification of weather maps at various isobaric levels has been used for many years as a methodological tool in several problems related to meteorology, climatology, atmospheric pollution, and other fields. Initially, the classification was performed manually. The criteria used by the person performing the classification are features of isobars or isopleths of geopotential height, depending on the type of maps to be classified. Although manual classifications integrate the perceptual experience and other unquantifiable qualities of the meteorology specialists involved, they are typically subjective and time consuming. Furthermore, during the last years different approaches to automated atmospheric circulation classification have been proposed, which provide automated, so-called objective classifications. In this paper, a new method of atmospheric circulation classification of isobaric maps is presented. The method is based on graph theory. It starts with an intelligent prototype selection using an over-partitioning mode of the fuzzy c-means (FCM) algorithm, proceeds to a graph formulation for the entire dataset, and produces the clusters based on the contemporary dominant sets clustering method. Graph theory is a novel mathematical approach that allows a more efficient representation of spatially correlated data compared to the classical Euclidean space representations used in conventional classification methods. The method has been applied to the classification of 850 hPa atmospheric circulation over the Eastern Mediterranean. The evaluation of the automated methods is performed by statistical indexes; results indicate that the classification is adequately comparable with other state-of-the-art automated map classification methods for a variable number of clusters.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates transportation and vehicular mode classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms, including decision trees, K-nearest neighbor, and support vector machine, to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy, with the support vector machine providing the best classification accuracy while consuming the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes.
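The comparison axes used above (accuracy and prediction time) can be sketched for a decision tree, k-NN, and SVM as follows; the sensor-derived features are simulated, so the numbers only illustrate how such a comparison is run.

```python
# Sketch of two of the comparison axes used in the paper: accuracy and
# prediction time for a decision tree, k-NN, and SVM. Sensor features are
# simulated stand-ins for accelerometer/magnetometer/gyroscope statistics.
import time
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 20))                    # stand-in sensor features
y = rng.integers(0, 5, size=3000)                  # stand-in transportation modes
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

for name, clf in [("decision tree", DecisionTreeClassifier()),
                  ("k-NN", KNeighborsClassifier()),
                  ("SVM", SVC())]:
    clf.fit(Xtr, ytr)
    start = time.perf_counter()
    acc = clf.score(Xte, yte)
    elapsed = time.perf_counter() - start
    print(f"{name:13s} accuracy {acc:.2f}  prediction time {elapsed * 1e3:.1f} ms")
```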
Wulsin, D. F.; Gupta, J. R.; Mani, R.; Blanco, J. A.; Litt, B.
2011-01-01
Clinical electroencephalography (EEG) records vast amounts of complex human data yet is still reviewed primarily by human readers. Deep Belief Nets (DBNs) are a relatively new type of multi-layer neural network commonly tested on two-dimensional image data but rarely applied to time-series data such as EEG. We apply DBNs in a semi-supervised paradigm to model EEG waveforms for classification and anomaly detection. DBN performance was comparable to standard classifiers on our EEG dataset, and classification time was found to be 1.7 to 103.7 times faster than the other high-performing classifiers. We demonstrate how the unsupervised step of DBN learning produces an autoencoder that can naturally be used in anomaly measurement. We compare the use of raw, unprocessed data (a rarity in automated physiological waveform analysis) to hand-chosen features and find that raw data produce comparable classification and better anomaly measurement performance. These results indicate that DBNs and raw data inputs may be more effective for online automated EEG waveform recognition than other common techniques. PMID:21525569
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
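A small sketch of the contrast studied here is given below: a minimum-distance (nearest class centroid) classifier against a Gaussian maximum-likelihood classifier, with quadratic discriminant analysis standing in for the latter and a toy dataset standing in for the remote sensing spectra.

```python
# Sketch contrasting a minimum-distance (nearest class centroid) classifier
# with a Gaussian maximum-likelihood classifier on the same features; remote
# sensing spectra are replaced by a toy dataset, and QDA stands in for the
# Gaussian maximum-likelihood classifier.
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid

X, y = load_wine(return_X_y=True)
classifiers = [("minimum distance", NearestCentroid()),            # distance to class means
               ("maximum likelihood", QuadraticDiscriminantAnalysis())]  # Gaussian ML
for name, clf in classifiers:
    print(f"{name:18s} CV accuracy {cross_val_score(clf, X, y, cv=5).mean():.3f}")
```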
Engelken, Florian; Wassilew, Georgi I; Köhlitz, Torsten; Brockhaus, Sebastian; Hamm, Bernd; Perka, Carsten; Diederichs, und Gerd
2014-01-01
The purpose of this study was to quantify the performance of the Goutallier classification for assessing fatty degeneration of the gluteus muscles from magnetic resonance (MR) images and to compare its performance to a newly proposed system. Eighty-four hips with clinical signs of gluteal insufficiency and 50 hips from asymptomatic controls were analyzed using a standard classification system (Goutallier) and a new scoring system (Quartile). Interobserver reliability and intraobserver repeatability were determined, and accuracy was assessed by comparing readers' scores with quantitative estimates of the proportion of intramuscular fat based on MR signal intensities (gold standard). The existing Goutallier classification system and the new Quartile system performed equally well in assessing fatty degeneration of the gluteus muscles, both showing excellent levels of interrater and intrarater agreement. While the Goutallier classification system has the advantage of being widely known, the benefit of the Quartile system is that it is based on more clearly defined grades of fatty degeneration. Copyright © 2014 Elsevier Inc. All rights reserved.
Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin
2015-08-01
Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of the Particle Swarm Optimization combined with C4.5 decision tree (PSO+C4.5) classifier by applying a Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on one microarray dataset and five different medical datasets obtained from the UCI machine learning databases. Moreover, the results of the PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine with a Radial Basis Function kernel, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes, and Weighted K-Nearest Neighbor). A repeated five-fold cross-validation method was used to assess the performance of the classifiers. Experimental results show that our proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.
Solti, Imre; Cooke, Colin R; Xia, Fei; Wurfel, Mark M
2009-11-01
This paper compares the performance of keyword and machine learning-based chest x-ray report classification for Acute Lung Injury (ALI). ALI mortality is approximately 30 percent. High mortality is, in part, a consequence of delayed manual chest x-ray classification. An automated system could reduce the time to recognize ALI and lead to reductions in mortality. For our study, 96 and 857 chest x-ray reports in two corpora were labeled by domain experts for ALI. We developed a keyword and a Maximum Entropy-based classification system. Word unigram and character n-grams provided the features for the machine learning system. The Maximum Entropy algorithm with character 6-gram achieved the highest performance (Recall=0.91, Precision=0.90 and F-measure=0.91) on the 857-report corpus. This study has shown that for the classification of ALI chest x-ray reports, the machine learning approach is superior to the keyword based system and achieves comparable results to highest performing physician annotators.
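The two approaches being compared can be sketched on toy stand-in "reports" as follows: a simple keyword rule versus a maximum-entropy (logistic regression) classifier over character 6-grams. The example strings and the keyword are hypothetical; the actual chest x-ray corpora are not reproduced.

```python
# Sketch of the two approaches compared above, on toy stand-in "reports":
# a keyword rule versus a maximum-entropy (logistic regression) classifier
# over character 6-grams. Real chest x-ray reports are not reproduced here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reports = ["bilateral infiltrates with diffuse opacities",      # hypothetical ALI-like text
           "diffuse bilateral airspace opacities noted",
           "clear lungs, no acute disease",
           "no infiltrate, heart size normal"]
labels = [1, 1, 0, 0]                                           # 1 = ALI, 0 = not ALI

keyword_pred = [int("bilateral" in r) for r in reports]         # simple keyword system

maxent = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(6, 6)),    # character 6-grams
    LogisticRegression(),                                       # maximum-entropy model
).fit(reports, labels)
print("keyword:", keyword_pred, " maxent:", maxent.predict(reports).tolist())
```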
Ferrat, Emilie; Paillaud, Elena; Caillet, Philippe; Laurent, Marie; Tournigand, Christophe; Lagrange, Jean-Léon; Droz, Jean-Pierre; Balducci, Lodovico; Audureau, Etienne; Canouï-Poitrine, Florence; Bastuji-Garin, Sylvie
2017-03-01
Purpose Frailty classifications of older patients with cancer have been developed to assist physicians in selecting cancer treatments and geriatric interventions. They have not been compared, and their performance in predicting outcomes has not been assessed. Our objectives were to assess agreement among four classifications and to compare their predictive performance in a large cohort of in- and outpatients with various cancers. Patients and Methods We prospectively included 1,021 patients age 70 years or older who had solid or hematologic malignancies and underwent a geriatric assessment in one of two French teaching hospitals between 2007 and 2012. Among them, 763 were assessed using four classifications: Balducci, International Society of Geriatric Oncology (SIOG) 1, SIOG2, and a latent class typology. Agreement was assessed using the κ statistic. Outcomes were 1-year mortality and 6-month unscheduled admissions. Results All four classifications had good discrimination for 1-year mortality (C-index ≥ 0.70); discrimination was best with SIOG1. For 6-month unscheduled admissions, discrimination was good with all four classifications (C-index ≥ 0.70). For classification into three (fit, vulnerable, or frail) or two categories (fit v vulnerable or frail and fit or vulnerable v frail), agreement among the four classifications ranged from very poor (κ ≤ 0.20) to good (0.60 < κ ≤ 0.80). Agreement was best between SIOG1 and the latent class typology and between SIOG1 and Balducci. Conclusion These four frailty classifications have good prognostic performance among older in- and outpatients with various cancers. They may prove useful in decision making about cancer treatments and geriatric interventions and/or in stratifying older patients with cancer in clinical trials.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states, with data collected from 43 participants. The system employs autoregressive (AR) modeling as the feature extraction algorithm and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi-supervised learning method that combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low level, which prevents the network from overfitting and enables it to learn low-level as well as high-level structures. For comparison, artificial neural network (ANN), Bayesian neural network (BNN), and original deep belief network (DBN) classifiers are used. The classification results show that using the AR feature extractor and DBN classifier, the classification performance improves to a sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94, compared to the ANN (sensitivity of 80.8%, specificity of 77.8%, accuracy of 79.3%, AUROC of 0.83) and BNN classifiers (sensitivity of 84.3%, specificity of 83%, accuracy of 83.6%, AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further, with a sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1%, with an AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over the ANN, BNN, and DBN classifiers, respectively.
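The feature-extraction stage described above can be sketched by estimating autoregressive coefficients for each EEG epoch via the Yule-Walker equations; the sparse-DBN classifier itself is not reproduced, the signal below is synthetic, and the AR order of 8 is an illustrative assumption.

```python
# Sketch of the feature-extraction stage only: autoregressive (AR) coefficients
# estimated with the Yule-Walker equations for one synthetic EEG epoch.
import numpy as np
from scipy.linalg import toeplitz

def ar_features(x, order=8):
    """Return AR(order) coefficients of a 1-D signal via the Yule-Walker equations."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # biased autocorrelation estimates r[0], ..., r[order]
    r = np.array([x[:n - k] @ x[k:] for k in range(order + 1)]) / n
    return np.linalg.solve(toeplitz(r[:-1]), r[1:])   # coefficients a_1 ... a_order

rng = np.random.default_rng(0)
epoch = rng.normal(size=2000)        # synthetic stand-in for one EEG epoch
print(ar_features(epoch, order=8))   # one AR feature vector per epoch
```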
Grant, Angeline; Njiru, James; Okoth, Edgar; Awino, Imelda; Briend, André; Murage, Samuel; Abdirahman, Saida; Myatt, Mark
2018-01-01
A novel approach for improving community case-detection of acute malnutrition involves mothers/caregivers screening their children for acute malnutrition using a mid-upper arm circumference (MUAC) insertion tape. The objective of this study was to test three simple MUAC classification devices to determine whether they improved the sensitivity of mothers/caregivers at detecting acute malnutrition. This was a prospective, non-randomised, partially blinded clinical diagnostic trial describing and comparing the performance of three "Click-MUAC" devices and a MUAC insertion tape. The study took place in twenty-one health facilities providing integrated management of acute malnutrition (IMAM) services in Isiolo County, Kenya. Mothers/caregivers classified their child (n = 1,040), aged 6-59 months, using the "Click-MUAC" devices and a MUAC insertion tape. These classifications were compared to a "gold standard" classification (the mean of three measurements taken by a research assistant using the MUAC insertion tape). The sensitivity of mother/caregiver classifications was high for all devices (>93% for severe acute malnutrition (SAM), defined by MUAC < 115 mm, and >90% for global acute malnutrition (GAM), defined by MUAC < 125 mm). Mother/caregiver sensitivity for SAM and GAM classification was higher using the MUAC insertion tape (100% sensitivity for SAM and 99% sensitivity for GAM) than using the "Click-MUAC" devices. Youden's J for SAM classification and sensitivity for GAM classification were significantly higher for the MUAC insertion tape (99% and 99%, respectively). Specificity was high for all devices (>96%), with no significant difference between the "Click-MUAC" devices and the MUAC insertion tape. The results of this study indicate that, although the "Click-MUAC" devices performed well, the MUAC insertion tape performed best. The results for sensitivity are higher than those found in previous studies. The high sensitivity for both SAM and GAM classification by mothers/caregivers with the MUAC insertion tape could be due to the use of an improved MUAC tape design with a number of new design features. The one-on-one demonstration provided to mothers/caregivers on the use of the devices may also have helped improve sensitivity. The results of this study provide evidence that mothers/caregivers can perform sensitive and specific classifications of their child's nutritional status using MUAC. Clinical trials registration number: NCT02833740.
Huang, Huifang; Liu, Jie; Zhu, Qiang; Wang, Ruiping; Hu, Guangshu
2014-06-05
Left bundle branch block (LBBB) and right bundle branch block (RBBB) not only mask electrocardiogram (ECG) changes that reflect diseases but also indicate important underlying pathology. The timely detection of LBBB and RBBB is critical in the treatment of cardiac diseases. Inter-patient heartbeat classification is based on independent training and testing sets to construct and evaluate a heartbeat classification system. Therefore, a heartbeat classification system with a high performance evaluation possesses a strong predictive capability for unknown data. The aim of this study was to propose a method for inter-patient classification of heartbeats to accurately detect LBBB and RBBB from the normal beat (NORM). This study proposed a heartbeat classification method through a combination of three different types of classifiers: a minimum distance classifier constructed between NORM and LBBB; a weighted linear discriminant classifier between NORM and RBBB based on Bayesian decision making using posterior probabilities; and a linear support vector machine (SVM) between LBBB and RBBB. Each classifier was used with matching features to obtain better classification performance. The final types of the test heartbeats were determined using a majority voting strategy through the combination of class labels from the three classifiers. The optimal parameters for the classifiers were selected using cross-validation on the training set. The effects of different lead configurations on the classification results were assessed, and the performance of these three classifiers was compared for the detection of each pair of heartbeat types. The study results showed that a two-lead configuration exhibited better classification results compared with a single-lead configuration. The construction of a classifier with good performance between each pair of heartbeat types significantly improved the heartbeat classification performance. The results showed a sensitivity of 91.4% and a positive predictive value of 37.3% for LBBB and a sensitivity of 92.8% and a positive predictive value of 88.8% for RBBB. A multi-classifier ensemble method was proposed based on inter-patient data and demonstrated a satisfactory classification performance. This approach has the potential for application in clinical practice to distinguish LBBB and RBBB from NORM of unknown patients.
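The ensemble idea can be sketched as one classifier per pair of classes combined by majority vote; the classifier types loosely mirror the paper's choices (minimum distance, linear discriminant, linear SVM), while the beat features are simulated and the weighting and feature-matching details of the paper are omitted.

```python
# Sketch of the ensemble idea: one classifier per pair of classes, combined by
# majority vote over all three pairwise predictions. ECG beat features are
# simulated; the paper's weighting and feature-matching details are omitted.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import NearestCentroid
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
NORM, LBBB, RBBB = 0, 1, 2
X = np.vstack([rng.normal(loc=c, size=(100, 10)) for c in (0.0, 1.0, 2.0)])  # stand-in beats
y = np.repeat([NORM, LBBB, RBBB], 100)

pairs = {(NORM, LBBB): NearestCentroid(),             # minimum distance classifier
         (NORM, RBBB): LinearDiscriminantAnalysis(),  # linear discriminant classifier
         (LBBB, RBBB): LinearSVC()}                   # linear SVM

votes = []
for (a, b), clf in pairs.items():
    mask = np.isin(y, [a, b])
    clf.fit(X[mask], y[mask])
    votes.append(clf.predict(X))                      # each pairwise model votes on every beat
votes = np.vstack(votes)
pred = np.array([np.bincount(col).argmax() for col in votes.T])  # majority vote
print("training accuracy:", (pred == y).mean().round(3))
```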
Algamal, Z Y; Lee, M H
2017-01-01
A high-dimensional quantitative structure-activity relationship (QSAR) classification model typically contains a large number of irrelevant and redundant descriptors. In this paper, a new descriptor selection design for QSAR classification model estimation is proposed by adding a new weight inside the L1-norm penalty. The experimental results of classifying the anti-hepatitis C virus activity of thiourea derivatives demonstrate that the proposed descriptor selection method in the QSAR classification model performs effectively and competitively compared with other existing penalized methods in terms of classification performance on both the training and the testing datasets. Moreover, it is noteworthy that the results obtained from the stability test and the applicability domain indicate a robust QSAR classification model. It is evident from the results that the developed QSAR classification model could conceivably be employed for further high-dimensional QSAR classification studies.
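As a hedged illustration of the weighted-L1 idea (not the authors' exact weighting scheme), the sketch below uses the standard adaptive-lasso trick with scikit-learn: an initial ridge fit supplies per-descriptor weights that rescale the L1 penalty, and descriptors with non-zero coefficients are retained.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def weighted_l1_selection(X, y, C=1.0, eps=1e-6):
        # Step 1: a preliminary L2 (ridge) fit gives rough coefficient magnitudes.
        ridge = LogisticRegression(penalty="l2", max_iter=5000).fit(X, y)
        w = np.abs(ridge.coef_).ravel() + eps

        # Step 2: rescaling each descriptor by its weight makes the ordinary L1
        # penalty act like sum_j |beta_j| / w_j (adaptive-lasso reparameterisation).
        lasso = LogisticRegression(penalty="l1", solver="liblinear", C=C,
                                   max_iter=5000).fit(X * w, y)

        selected = np.flatnonzero(lasso.coef_.ravel())   # indices of retained descriptors
        return selected, lasso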
NASA Astrophysics Data System (ADS)
Niazmardi, S.; Safari, A.; Homayouni, S.
2017-09-01
Crop mapping through classification of Satellite Image Time-Series (SITS) data can provide very valuable information for several agricultural applications, such as crop monitoring, yield estimation, and crop inventory. However, SITS data classification is not straightforward, because different images of a SITS dataset carry different levels of information relevant to the classification problem. Moreover, SITS data are four-dimensional and cannot be classified using conventional classification algorithms. To address these issues, in this paper we present a classification strategy based on Multiple Kernel Learning (MKL) algorithms for SITS data classification. In this strategy, different kernels are first constructed from the different images of the SITS data and are then combined into a composite kernel using MKL algorithms. The composite kernel, once constructed, can be used for classification of the data with kernel-based classification algorithms. We compared the computational time and the classification performance of the proposed classification strategy using different MKL algorithms for the purpose of crop mapping. The considered MKL algorithms are MKL-Sum, SimpleMKL, LPMKL, and Group-Lasso MKL. The experimental tests of the proposed strategy on two SITS datasets, acquired by SPOT satellite sensors, showed that this strategy provides better performance than the standard classification algorithm. The results also showed that the optimization method of the MKL algorithm used affects both the computational time and the classification accuracy of this strategy.
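The composite-kernel step can be illustrated with the short sketch below (scikit-learn assumed): one RBF kernel is built per acquisition date and the kernels are combined with fixed, here uniform, weights before being passed to a precomputed-kernel SVM. The MKL algorithms named above additionally learn the kernel weights, which this sketch does not attempt.

    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.svm import SVC

    def composite_kernel(imgs_a, imgs_b, weights=None, gamma=0.1):
        """imgs_a, imgs_b: lists of (n_samples, n_features) arrays, one per image date."""
        if weights is None:
            weights = np.full(len(imgs_a), 1.0 / len(imgs_a))   # uniform combination
        return sum(w * rbf_kernel(Xa, Xb, gamma=gamma)
                   for w, Xa, Xb in zip(weights, imgs_a, imgs_b))

    # Hypothetical usage with per-date training/test feature lists:
    # K_train = composite_kernel(train_imgs, train_imgs)
    # K_test = composite_kernel(test_imgs, train_imgs)
    # clf = SVC(kernel="precomputed").fit(K_train, y_train)
    # y_pred = clf.predict(K_test)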
Gender classification under extended operating conditions
NASA Astrophysics Data System (ADS)
Rude, Howard N.; Rizki, Mateen
2014-06-01
Gender classification is a critical component of a robust image security system. Many techniques exist to perform gender classification using facial features. In contrast, this paper explores gender classification using body features extracted from clothed subjects. Several of the most effective types of features for gender classification identified in literature were implemented and applied to the newly developed Seasonal Weather And Gender (SWAG) dataset. SWAG contains video clips of approximately 2000 samples of human subjects captured over a period of several months. The subjects are wearing casual business attire and outer garments appropriate for the specific weather conditions observed in the Midwest. The results from a series of experiments are presented that compare the classification accuracy of systems that incorporate various types and combinations of features applied to multiple looks at subjects at different image resolutions to determine a baseline performance for gender classification.
Active learning for clinical text classification: is it better than random sampling?
Figueroa, Rosa L; Zeng-Treitler, Qing; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P
2012-01-01
This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty.
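A minimal sketch of distance-based (DIST-style) querying, assuming scikit-learn and a binary labelling task: at each round the unlabelled documents closest to the current SVM hyperplane are sent to the oracle. Batch size, round count, and the seed set are illustrative choices, not the study's settings.

    import numpy as np
    from sklearn.svm import LinearSVC

    def distance_based_active_learning(X_pool, y_pool, seed_idx, rounds=10, batch=20):
        labelled = list(seed_idx)
        clf = None
        for _ in range(rounds):
            clf = LinearSVC().fit(X_pool[labelled], y_pool[labelled])
            margin = np.abs(clf.decision_function(X_pool))     # distance to the hyperplane
            already = set(labelled)
            queries = [i for i in np.argsort(margin) if i not in already][:batch]
            labelled.extend(queries)                            # oracle supplies y_pool[queries]
        return clf, labelled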
Active learning for clinical text classification: is it better than random sampling?
Figueroa, Rosa L; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P
2012-01-01
Objective This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Design Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Measurements Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. Results The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. Conclusion For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty. PMID:22707743
Genome-Wide Comparative Gene Family Classification
Frech, Christian; Chen, Nansheng
2010-01-01
Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221
Physical Human Activity Recognition Using Wearable Sensors.
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-12-11
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
Physical Human Activity Recognition Using Wearable Sensors
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-01-01
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450
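For the supervised part of such a comparison, a compact scikit-learn sketch might look like the following; X and y are assumed to hold windowed inertial features and activity labels, and the GMM/HMM branches are omitted for brevity.

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    models = {
        "k-NN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(kernel="rbf", C=10, gamma="scale"),
        "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    }

    def compare_classifiers(X, y, cv=5):
        # Scaling lives inside the pipeline so cross-validation stays leak-free.
        return {name: cross_val_score(make_pipeline(StandardScaler(), model), X, y, cv=cv).mean()
                for name, model in models.items()}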
An alternative respiratory sounds classification system utilizing artificial neural networks.
Oweis, Rami J; Abdulhay, Enas W; Khayal, Amer; Awad, Areen
2015-01-01
Computerized lung sound analysis involves recording lung sounds via an electronic device, followed by computer analysis and classification based on specific signal characteristics such as non-linearity and non-stationarity caused by air turbulence. An automatic analysis is necessary to avoid dependence on expert skills. This work revolves around exploiting autocorrelation in the feature extraction stage. All process stages were implemented in MATLAB. The classification process was performed comparatively using both the artificial neural network (ANN) and the adaptive neuro-fuzzy inference system (ANFIS) toolboxes. The methods were applied to 10 different respiratory sounds for classification. The ANN was superior to the ANFIS system and returned better performance parameters: its accuracy, specificity, and sensitivity were 98.6%, 100%, and 97.8%, respectively. These figures exceed those of many recent approaches. The proposed method is an efficient and fast tool for the intended purpose, as reflected in the performance parameters, specifically accuracy, specificity, and sensitivity. Furthermore, using the autocorrelation function for feature extraction in such applications enhances performance and avoids undesired computational complexity compared with other techniques.
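A rough Python sketch of the autocorrelation feature idea; the lag count, network size, and the scikit-learn MLP are illustrative stand-ins for the MATLAB toolboxes used in the paper.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def autocorr_features(segment, n_lags=50):
        """Normalised autocorrelation at lags 1..n_lags of one lung sound segment."""
        x = np.asarray(segment, dtype=float)
        x = x - x.mean()
        acf = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags 0 .. N-1
        return acf[1:n_lags + 1] / acf[0]                    # normalise by the lag-0 energy

    # Hypothetical usage: segments is a list of 1-D sound arrays, labels the classes.
    # X = np.vstack([autocorr_features(s) for s in segments])
    # clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000).fit(X, labels)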
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problems in speech analysis today: tracing gender from acoustic data such as pitch, median frequency, and related measures. Machine learning gives promising results for classification problems across research domains, and several performance metrics exist to evaluate algorithms in this area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics for gender classification from acoustic data. The goal is to identify gender using five algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), assessed on the basis of eight different metrics. The main criterion in evaluating any algorithm is its performance: the misclassification rate must be low, which means the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
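A possible scikit-learn sketch of this five-way comparison, reporting only a subset of the metrics (the full set of eight is not reproduced); X is assumed to hold the acoustic features and y the binary gender labels encoded as 0/1.

    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_validate
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    candidates = {
        "LDA": LinearDiscriminantAnalysis(),
        "KNN": KNeighborsClassifier(),
        "CART": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
        "SVM": SVC(),
    }

    def evaluate_models(X, y, cv=10):
        scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]   # subset of the eight metrics
        report = {}
        for name, model in candidates.items():
            out = cross_validate(model, X, y, cv=cv, scoring=scoring)
            report[name] = {s: out["test_" + s].mean() for s in scoring}
        return report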
NASA Technical Reports Server (NTRS)
Benediktsson, J. A.; Swain, P. H.; Ersoy, O. K.
1993-01-01
Application of neural networks to classification of remote sensing data is discussed. Conventional two-layer backpropagation is found to give good results in classification of remote sensing data but is not efficient in training. A more efficient variant, based on conjugate-gradient optimization, is used for classification of multisource remote sensing and geographic data and very-high-dimensional data. The conjugate-gradient neural networks give excellent performance in classification of multisource data, but do not compare as well with statistical methods in classification of very-high-dimensional data.
NASA Technical Reports Server (NTRS)
Tarabalka, Y.; Tilton, J. C.; Benediktsson, J. A.; Chanussot, J.
2012-01-01
The Hierarchical SEGmentation (HSEG) algorithm, which combines region object finding with region object clustering, has given good performances for multi- and hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. Two classification-based approaches for automatic marker selection are adapted and compared for this purpose. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. Three different implementations of the M-HSEG method are proposed and their performances in terms of classification accuracies are compared. The experimental results, presented for three hyperspectral airborne images, demonstrate that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for remote sensing image analysis.
Sevel, Landrew S; Boissoneault, Jeff; Letzen, Janelle E; Robinson, Michael E; Staud, Roland
2018-05-30
Chronic fatigue syndrome (CFS) is a disorder associated with fatigue, pain, and structural/functional abnormalities seen during magnetic resonance brain imaging (MRI). Therefore, we evaluated the performance of structural MRI (sMRI) abnormalities in the classification of CFS patients versus healthy controls and compared it to machine learning (ML) classification based upon self-report (SR). Participants included 18 CFS patients and 15 healthy controls (HC). All subjects underwent T1-weighted sMRI and provided visual analogue-scale ratings of fatigue, pain intensity, anxiety, depression, anger, and sleep quality. sMRI data were segmented using FreeSurfer and 61 regions based on functional and structural abnormalities previously reported in patients with CFS. Classification was performed in RapidMiner using a linear support vector machine and bootstrap optimism correction. We compared ML classifiers based on (1) 61 a priori sMRI regional estimates and (2) SR ratings. The sMRI model achieved 79.58% classification accuracy. The SR (accuracy = 95.95%) outperformed both sMRI models. Estimates from multiple brain areas related to cognition, emotion, and memory contributed strongly to group classification. This is the first ML-based group classification of CFS. Our findings suggest that sMRI abnormalities are useful for discriminating CFS patients from HC, but SR ratings remain most effective in classification tasks.
NASA Astrophysics Data System (ADS)
Khan, Asif; Ryoo, Chang-Kyung; Kim, Heung Soo
2017-04-01
This paper presents a comparative study of different classification algorithms for the classification of various types of inter-ply delaminations in smart composite laminates. Improved layerwise theory is used to model delamination at different interfaces along the thickness and longitudinal directions of the smart composite laminate. The input-output data obtained through surface bonded piezoelectric sensor and actuator is analyzed by the system identification algorithm to get the system parameters. The identified parameters for the healthy and delaminated structure are supplied as input data to the classification algorithms. The classification algorithms considered in this study are ZeroR, Classification via regression, Naïve Bayes, Multilayer Perceptron, Sequential Minimal Optimization, Multiclass-Classifier, and Decision tree (J48). The open source software of Waikato Environment for Knowledge Analysis (WEKA) is used to evaluate the classification performance of the classifiers mentioned above via 75-25 holdout and leave-one-sample-out cross-validation regarding classification accuracy, precision, recall, kappa statistic and ROC Area.
Developing Local Oral Reading Fluency Cut Scores for Predicting High-Stakes Test Performance
ERIC Educational Resources Information Center
Grapin, Sally L.; Kranzler, John H.; Waldron, Nancy; Joyce-Beaulieu, Diana; Algina, James
2017-01-01
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum-based measure (R-CBM) in predicting third grade state test performance. It also compared the long-term classification accuracy of local and publisher-recommended R-CBM cut scores. Participants were 266 students who were divided into a calibration…
A review of classification algorithms for EEG-based brain-computer interfaces.
Lotte, F; Congedo, M; Lécuyer, A; Lamarche, F; Arnaldi, B
2007-06-01
In this paper we review classification algorithms used to design brain-computer interface (BCI) systems based on electroencephalography (EEG). We briefly present the commonly employed algorithms and describe their critical properties. Based on the literature, we compare them in terms of performance and provide guidelines to choose the suitable classification algorithm(s) for a specific BCI.
2014-01-01
Background The inter-patient classification schema and the Association for the Advancement of Medical Instrumentation (AAMI) standards are important to the construction and evaluation of automated heartbeat classification systems. The majority of previously proposed methods that take the above two aspects into consideration use the same features and classification method to classify different classes of heartbeats. The performance of the classification system is often unsatisfactory with respect to the ventricular ectopic beat (VEB) and supraventricular ectopic beat (SVEB). Methods Based on the different characteristics of VEB and SVEB, a novel hierarchical heartbeat classification system was constructed. This was done in order to improve the classification performance of these two classes of heartbeats by using different features and classification methods. First, random projection and support vector machine (SVM) ensemble were used to detect VEB. Then, the ratio of the RR interval was compared to a predetermined threshold to detect SVEB. The optimal parameters for the classification models were selected on the training set and used in the independent testing set to assess the final performance of the classification system. Meanwhile, the effect of different lead configurations on the classification results was evaluated. Results Results showed that the performance of this classification system was notably superior to that of other methods. The VEB detection sensitivity was 93.9% with a positive predictive value of 90.9%, and the SVEB detection sensitivity was 91.1% with a positive predictive value of 42.2%. In addition, this classification process was relatively fast. Conclusions A hierarchical heartbeat classification system was proposed based on the inter-patient data division to detect VEB and SVEB. It demonstrated better classification performance than existing methods. It can be regarded as a promising system for detecting VEB and SVEB of unknown patients in clinical practice. PMID:24981916
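A minimal sketch of the two-stage hierarchy described above, using scikit-learn: an ensemble of SVMs trained on different random projections votes on VEB, and beats not flagged as VEB are screened for SVEB with a simple RR-interval ratio threshold. The projection size, ensemble size, and threshold here are illustrative, not the paper's tuned values.

    import numpy as np
    from sklearn.random_projection import GaussianRandomProjection
    from sklearn.svm import SVC

    def train_veb_ensemble(X, y_is_veb, n_members=5, n_components=30):
        members = []
        for seed in range(n_members):
            proj = GaussianRandomProjection(n_components=n_components, random_state=seed)
            Xp = proj.fit_transform(X)
            members.append((proj, SVC().fit(Xp, y_is_veb)))
        return members

    def classify_beat(members, beat_features, rr_ratio, rr_threshold=0.85):
        votes = sum(clf.predict(proj.transform(beat_features[None, :]))[0]
                    for proj, clf in members)
        if votes > len(members) / 2:          # majority of the ensemble says VEB
            return "VEB"
        return "SVEB" if rr_ratio < rr_threshold else "NORM"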
Spectral-Spatial Classification of Hyperspectral Images Using Hierarchical Optimization
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Tilton, James C.
2011-01-01
A new spectral-spatial method for hyperspectral data classification is proposed. For a given hyperspectral image, probabilistic pixelwise classification is first applied. Then, a hierarchical step-wise optimization algorithm is performed by iteratively merging the neighboring regions with the smallest Dissimilarity Criterion (DC) and recomputing class labels for the new regions. The DC is computed by comparing the mean vectors, class labels, and numbers of pixels of the two regions under consideration. The algorithm converges when all pixels have been involved in the region-merging procedure. Experimental results are presented on two remote sensing hyperspectral images acquired by the AVIRIS and ROSIS sensors. The proposed approach improves classification accuracies and provides maps with more homogeneous regions, when compared to previously proposed classification techniques.
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Qin, Yu; Zhan, Yu; Wang, Changming; Zhang, Jiacai; Yao, Li; Guo, Xiaojuan; Wu, Xia; Hu, Bin
2016-08-01
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72 %, which is substantially better than that achieved with any single ERP component feature (55.07 % for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47, 4.06, and 16.90 % higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
Brain-computer interfacing under distraction: an evaluation study
NASA Astrophysics Data System (ADS)
Brandl, Stephanie; Frølich, Laura; Höhne, Johannes; Müller, Klaus-Robert; Samek, Wojciech
2016-10-01
Objective. While motor-imagery based brain-computer interfaces (BCIs) have been studied over many years by now, most of these studies have taken place in controlled lab settings. Bringing BCI technology into everyday life is still one of the main challenges in this field of research. Approach. This paper systematically investigates BCI performance under 6 types of distractions that mimic out-of-lab environments. Main results. We report results of 16 participants and show that the performance of the standard common spatial patterns (CSP) + regularized linear discriminant analysis classification pipeline drops significantly in this ‘simulated’ out-of-lab setting. We then investigate three methods for improving the performance: (1) artifact removal, (2) ensemble classification, and (3) a 2-step classification approach. While artifact removal does not enhance the BCI performance significantly, both ensemble classification and the 2-step classification combined with CSP significantly improve the performance compared to the standard procedure. Significance. Systematically analyzing out-of-lab scenarios is crucial when bringing BCI into everyday life. Algorithms must be adapted to overcome nonstationary environments in order to tackle real-world challenges.
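The baseline pipeline mentioned above (CSP followed by regularized LDA) can be written in a few lines if MNE-Python and scikit-learn are assumed; epochs_data is a hypothetical array of shape (n_trials, n_channels, n_times) and y the motor-imagery labels.

    from mne.decoding import CSP
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import Pipeline

    csp_lda = Pipeline([
        ("csp", CSP(n_components=6, log=True)),                                  # spatial filters + log-variance features
        ("lda", LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")),    # shrinkage-regularized LDA
    ])

    # Hypothetical usage:
    # scores = cross_val_score(csp_lda, epochs_data, y, cv=5)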
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification, together with voxel selection schemes, in terms of classification accuracy and computation time. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of the fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of the voxel selection and classification methods were compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly, whereas in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) achieved better accuracy in less time. Conclusions/Significance The present work provides the first empirical result on linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, in which part of the principal components are kept as features, is a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification, together with voxel selection schemes, in terms of classification accuracy and computation time. Six different voxel selection methods were employed to decide which voxels of the fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of the voxel selection and classification methods were compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly, whereas in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) achieved better accuracy in less time. The present work provides the first empirical result on linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, in which part of the principal components are kept as features, is a better choice.
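A condensed scikit-learn sketch of the comparison, with only one of the six voxel selection schemes shown (univariate ANOVA ranking) and a PCA step added for the RBF variant; voxel counts and component numbers are illustrative.

    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    pipelines = {
        "linear SVM, many voxels": make_pipeline(
            SelectKBest(f_classif, k=2000), SVC(kernel="linear")),
        "RBF SVM, few voxels + PCA": make_pipeline(
            SelectKBest(f_classif, k=200), PCA(n_components=20), SVC(kernel="rbf")),
    }

    def compare_pipelines(X, y, cv=5):
        return {name: cross_val_score(pipe, X, y, cv=cv).mean()
                for name, pipe in pipelines.items()}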
Trakoolwilaiwan, Thanawin; Behboodi, Bahareh; Lee, Jaeseok; Kim, Kyungsoo; Choi, Ji-Woong
2018-01-01
The aim of this work is to develop an effective brain-computer interface (BCI) method based on functional near-infrared spectroscopy (fNIRS). In order to improve the performance of the BCI system in terms of accuracy, the ability to discriminate features from input signals and proper classification are desired. Previous studies have mainly extracted features from the signal manually, but proper features need to be selected carefully. To avoid performance degradation caused by manual feature selection, we applied convolutional neural networks (CNNs) as the automatic feature extractor and classifier for fNIRS-based BCI. In this study, the hemodynamic responses evoked by performing rest, right-, and left-hand motor execution tasks were measured on eight healthy subjects to compare performances. Our CNN-based method provided improvements in classification accuracy over conventional methods employing the most commonly used features of mean, peak, slope, variance, kurtosis, and skewness, classified by support vector machine (SVM) and artificial neural network (ANN). Specifically, up to 6.49% and 3.33% improvement in classification accuracy was achieved by CNN compared with SVM and ANN, respectively.
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Cooperative Learning for Distributed In-Network Traffic Classification
NASA Astrophysics Data System (ADS)
Joseph, S. B.; Loo, H. R.; Ismail, I.; Andromeda, T.; Marsono, M. N.
2017-04-01
Inspired by the concept of autonomic distributed/decentralized network management schemes, we consider the issue of information exchange among distributed network nodes to improve network performance and promote scalability for in-network monitoring. In this paper, we propose a cooperative learning algorithm for the propagation and synchronization of network information among autonomic distributed network nodes for online traffic classification. The results show that network nodes with sharing capability perform better, with a higher average accuracy of 89.21% (sharing data) and 88.37% (sharing clusters), compared to 88.06% for nodes without cooperative learning capability. The overall performance indicates that cooperative learning is promising for distributed in-network traffic classification.
ERIC Educational Resources Information Center
Funk, Kerri L.; Tseng, M. S.
Two groups of 32 educable mentally retarded children (ages 7 to 14 years) were compared as to their arithmetic and classification performances attributable to the presence or absence of a 4 1/2 week exposure to classification tasks. The randomized block pretest-posttest design was used. The experimental group and the control group were matched on…
Non-parametric analysis of LANDSAT maps using neural nets and parallel computers
NASA Technical Reports Server (NTRS)
Salu, Yehuda; Tilton, James
1991-01-01
Nearest neighbor approaches and a new neural network, the Binary Diamond, are used for the classification of images of ground pixels obtained by LANDSAT satellite. The performances are evaluated by comparing classifications of a scene in the vicinity of Washington DC. The problem of optimal selection of categories is addressed as a step in the classification process.
Application of Sensor Fusion to Improve Uav Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects including the ones that are based on using UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality which results in increasing the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs prove that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on-board in the UAV missions and performing image fusion can help achieving higher quality images and accordingly higher accuracy classification results.
Classification of Odours for Mobile Robots Using an Ensemble of Linear Classifiers
NASA Astrophysics Data System (ADS)
Trincavelli, Marco; Coradeschi, Silvia; Loutfi, Amy
2009-05-01
This paper investigates the classification of odours using an electronic nose mounted on a mobile robot. The samples are collected as the robot explores the environment. Under such conditions, the sensor response differs from typical three phase sampling processes. In this paper, we focus particularly on the classification problem and how it is influenced by the movement of the robot. To cope with these influences, an algorithm consisting of an ensemble of classifiers is presented. Experimental results show that this algorithm increases classification performance compared to other traditional classification methods.
NASA Astrophysics Data System (ADS)
Fleig, Anne K.; Tallaksen, Lena M.; Hisdal, Hege; Stahl, Kerstin; Hannah, David M.
Classifications of weather and circulation patterns are often applied in research seeking to relate atmospheric state to surface environmental phenomena. However, numerous procedures have been applied to define the patterns, thus limiting comparability between studies. The COST733 Action “Harmonisation and Applications of Weather Type Classifications for European regions” tests 73 different weather type classifications (WTCs) and their associated weather types (WTs) and compares the WTCs’ utility for various applications. The objective of this study is to evaluate the potential of these WTCs for analysis of regional hydrological drought development in north-western Europe. Hydrological drought is defined in terms of a Regional Drought Area Index (RDAI), which is based on deficits derived from daily river flow series. RDAI series (1964-2001) were calculated for four homogeneous regions in Great Britain and two in Denmark. For each region, WTs associated with hydrological drought development were identified based on antecedent and concurrent WT-frequencies for major drought events. The utility of the different WTCs for the study of hydrological drought development was evaluated, and the influence of WTC attributes, i.e. input variables, number of defined WTs and general classification concept, on WTC performance was assessed. The objective Grosswetterlagen (OGWL), the objective Second-Generation Lamb Weather Type Classification (LWT2) with 18 WTs and two implementations of the objective Wetterlagenklassifikation (WLK; with 40 and 28 WTs) outperformed all other WTCs. In general, WTCs with more WTs (⩾27) were found to perform better than WTCs with fewer (⩽18) WTs. The influence of input variables was not consistent across the different classification procedures, and the performance of a WTC was determined primarily by the classification procedure itself. Overall, classification procedures following the relatively simple general classification concept of predefining WTs based on thresholds performed better than those based on more sophisticated classification concepts such as deriving WTs by cluster analysis or artificial neural networks. In particular, PCA-based WTCs with 9 WTs and automated WTCs with a high number of predefined WTs (subjectively and threshold based) performed well. It is suggested that the explicit consideration of the air flow characteristics of meridionality, zonality and cyclonicity in the definition of WTs is a useful feature for a WTC when analysing regional hydrological drought development.
Performance of resonant radar target identification algorithms using intra-class weighting functions
NASA Astrophysics Data System (ADS)
Mustafa, A.
The use of calibrated resonant-region radar cross section (RCS) measurements of targets for the classification of large aircraft is discussed. Errors in the RCS estimate of full scale aircraft flying over an ocean, introduced by the ionospheric variability and the sea conditions were studied. The Weighted Target Representative (WTR) classification algorithm was developed, implemented, tested and compared with the nearest neighbor (NN) algorithm. The WTR-algorithm has a low sensitivity to the uncertainty in the aspect angle of the unknown target returns. In addition, this algorithm was based on the development of a new catalog of representative data which reduces the storage requirements and increases the computational efficiency of the classification system compared to the NN-algorithm. Experiments were designed to study and evaluate the characteristics of the WTR- and the NN-algorithms, investigate the classifiability of targets and study the relative behavior of the number of misclassifications as a function of the target backscatter features. The classification results and statistics were shown in the form of performance curves, performance tables and confusion tables.
A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.
Sarrouti, Mourad; Ouatik El Alaoui, Said
2017-05-18
Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which consists of assigning a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of the four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary. In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features to machine-learning algorithms. Finally, the class label is predicted using the trained classifiers. Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significantly improved performance compared with four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as the feature provider for a support vector machine (SVM) leads to the highest accuracy, 89.40%. The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.
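A toy sketch of the pattern-based featurization, with a handful of hand-written lexical patterns (illustrative only, far smaller than the paper's pattern set) turned into binary features for a linear SVM in scikit-learn.

    import re
    import numpy as np
    from sklearn.svm import LinearSVC

    PATTERNS = [
        r"^(is|are|can|does|do|did)\b",       # yes/no questions
        r"^(what|which|who|where|when)\b",    # factoid questions
        r"\blist\b",                          # list questions
        r"^(how|why)\b|\bdescribe\b",         # summary questions
    ]

    def featurize(questions):
        return np.array([[1 if re.search(p, q.lower()) else 0 for p in PATTERNS]
                         for q in questions])

    # Hypothetical usage with question strings and category labels:
    # clf = LinearSVC().fit(featurize(train_questions), train_labels)
    # predicted = clf.predict(featurize(test_questions))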
Hyperspectral feature mapping classification based on mathematical morphology
NASA Astrophysics Data System (ADS)
Liu, Chang; Li, Junwei; Wang, Guangping; Wu, Jingli
2016-03-01
This paper proposes a hyperspectral feature mapping classification algorithm based on mathematical morphology. Without a priori information such as a spectral library, the spectral and spatial information can be used to perform hyperspectral feature mapping classification. Mathematical morphological erosion and dilation operations are performed to extract endmembers, and the spectral feature mapping algorithm is then used to carry out hyperspectral image classification. A hyperspectral image collected by AVIRIS is used to evaluate the proposed algorithm, which is compared with the minimum Euclidean distance mapping, minimum Mahalanobis distance mapping, SAM, and binary encoding mapping algorithms. The experimental results show that the proposed algorithm performs better than the other algorithms under the same conditions and achieves higher classification accuracy.
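For reference, the SAM baseline named above reduces to the spectral-angle rule sketched below (NumPy assumed); the morphological endmember extraction and the proposed feature mapping algorithm itself are not reproduced here.

    import numpy as np

    def sam_classify(pixels, endmembers):
        """pixels: (n_pixels, n_bands); endmembers: (n_classes, n_bands)."""
        p = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
        e = endmembers / np.linalg.norm(endmembers, axis=1, keepdims=True)
        angles = np.arccos(np.clip(p @ e.T, -1.0, 1.0))   # spectral angle to each endmember
        return angles.argmin(axis=1)                      # label of the closest endmember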
Hierarchical structure for audio-video based semantic classification of sports video sequences
NASA Astrophysics Data System (ADS)
Kolekar, M. H.; Sengupta, S.
2005-07-01
A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and yet unexplored. We have successfully solved cricket video classification problem using a six level hierarchical structure. The first level performs event detection based on audio energy and Zero Crossing Rate (ZCR) of short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
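The first-level audio features (short-time energy and zero crossing rate) can be computed as in the sketch below; frame length and hop size are illustrative, not the paper's settings.

    import numpy as np

    def short_time_energy_zcr(signal, frame_len=1024, hop=512):
        """Return an (n_frames, 2) array of [energy, ZCR] per frame."""
        feats = []
        for start in range(0, len(signal) - frame_len + 1, hop):
            frame = np.asarray(signal[start:start + frame_len], dtype=float)
            energy = np.sum(frame ** 2) / frame_len
            zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)   # fraction of sign changes
            feats.append((energy, zcr))
        return np.array(feats)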
NASA Astrophysics Data System (ADS)
Qu, Haicheng; Liang, Xuejian; Liang, Shichao; Liu, Wanjun
2018-01-01
Many methods of hyperspectral image classification have been proposed recently, and the convolutional neural network (CNN) achieves outstanding performance. However, spectral-spatial classification of CNN requires an excessively large model, tremendous computations, and complex network, and CNN is generally unable to use the noisy bands caused by water-vapor absorption. A dimensionality-varied CNN (DV-CNN) is proposed to address these issues. There are four stages in DV-CNN and the dimensionalities of spectral-spatial feature maps vary with the stages. DV-CNN can reduce the computation and simplify the structure of the network. All feature maps are processed by more kernels in higher stages to extract more precise features. DV-CNN also improves the classification accuracy and enhances the robustness to water-vapor absorption bands. The experiments are performed on data sets of Indian Pines and Pavia University scene. The classification performance of DV-CNN is compared with state-of-the-art methods, which contain the variations of CNN, traditional, and other deep learning methods. The experiment of performance analysis about DV-CNN itself is also carried out. The experimental results demonstrate that DV-CNN outperforms state-of-the-art methods for spectral-spatial classification and it is also robust to water-vapor absorption bands. Moreover, reasonable parameters selection is effective to improve classification accuracy.
Posture and performance: sitting vs. standing for security screening.
Drury, C G; Hsiao, Y L; Joseph, C; Joshi, S; Lapp, J; Pennathur, P R
2008-03-01
A classification of the literature on the effects of workplace posture on performance of different mental tasks showed few consistent patterns. A parallel classification of the complementary effect of performance on postural variables gave similar results. Because of a lack of data for signal detection tasks, an experiment was performed using 12 experienced security operators performing an X-ray baggage-screening task with three different workplace arrangements. The current workplace, sitting on a high chair viewing a screen placed on top of the X-ray machine, was compared to a standing workplace and a conventional desk-sitting workplace. No performance effects of workplace posture were found, although the experiment was able to measure performance effects of learning and body part discomfort effects of workplace posture. There are implications for the classification of posture and performance and for the justification of ergonomics improvements based on performance increases.
NASA Astrophysics Data System (ADS)
Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa
2018-03-01
In the last decade, analysis of remotely sensed imagery has become one of the most common and widely used procedures in environmental studies, and supervised image classification techniques play a central role. Hence, using a high-resolution Worldview-3 image over a mixed urbanized landscape in Iran, three less commonly applied image classification methods, Bagged CART, a stochastic gradient boosting model, and a neural network with feature extraction, were tested and compared with two prevalent methods: random forest and a support vector machine with a linear kernel. To do so, each method was run ten times, and three validation techniques were used to estimate the accuracy statistics: cross-validation, independent validation, and validation with the whole training data. Moreover, the statistical significance of the differences between the classification methods was assessed using ANOVA and the Tukey test. In general, the results showed that random forest, with a marginal difference over Bagged CART and the stochastic gradient boosting model, is the best-performing method, whereas based on independent validation there was no significant difference between the performances of the classification methods. It should finally be noted that the neural network with feature extraction and the linear support vector machine had better processing speed than the others.
ERIC Educational Resources Information Center
Lee, Eunjoo; Park, Hyejin; Nam, Mihwa; Whyte, James
2011-01-01
The purpose of the study was to identify Nursing Interventions Classification (NIC) interventions performed by Korean school nurses. The Korean data were then compared to U.S. data from other studies in order to identify differences and similarities between Korean and U.S. school nurse practice. Of the 542 available NIC interventions, 180 were…
ERIC Educational Resources Information Center
Chen, Pei-Hua; Chang, Hua-Hua; Wu, Haiyan
2012-01-01
Two sampling-and-classification-based procedures were developed for automated test assembly: the Cell Only and the Cell and Cube methods. A simulation study based on a 540-item bank was conducted to compare the performance of the procedures with the performance of a mixed-integer programming (MIP) method for assembling multiple parallel test…
NASA Astrophysics Data System (ADS)
Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi
2012-03-01
We aim to use a new texton-based texture classification method for the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Different from conventional computer-aided diagnosis (CAD) methods for pulmonary emphysema classification, in this paper the texton dictionary is first learned by applying sparse representation (SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct histograms as texture representations. Finally, classification is performed using a nearest neighbor classifier with a histogram dissimilarity measure as the distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate, and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is higher than that of the state-of-the-art method based on basic rotation-invariant local binary pattern histograms and of the texture classification method based on texton learning by k-means, which performs nearly the best among other approaches in the literature.
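A simplified sketch of the texton-histogram pipeline: here the dictionary is learned with k-means (the variant the authors compare against, not their sparse-representation coding), each ROI is encoded as a normalised histogram of nearest textons, and classification uses a nearest neighbour rule under a chi-square histogram distance.

    import numpy as np
    from sklearn.cluster import KMeans

    def learn_texton_dictionary(train_patches, n_textons=40):
        return KMeans(n_clusters=n_textons, n_init=10, random_state=0).fit(train_patches)

    def roi_histogram(patches, dictionary):
        labels = dictionary.predict(patches)
        hist = np.bincount(labels, minlength=dictionary.n_clusters).astype(float)
        return hist / hist.sum()

    def chi_square(a, b, eps=1e-12):
        return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))

    def classify_roi(hist, train_hists, train_labels):
        distances = [chi_square(hist, h) for h in train_hists]
        return train_labels[int(np.argmin(distances))]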
Zbroch, Tomasz; Knapp, Paweł Grzegorz; Knapp, Piotr Andrzej
2007-09-01
Increasing knowledge concerning carcinogenesis within the cervical epithelium has forced continuous modifications of the cytology classification of cervical smears. Eventually, new descriptions of submicroscopic cytomorphological abnormalities enabled the implementation of the Bethesda System, which was meant to take the place of the former Papanicolaou classification, although both are temporarily sometimes used simultaneously. The aim of this study was to compare the results of these two classification systems with respect to diagnostic accuracy, verified by further tests of the diagnostic algorithm for cervical lesion evaluation. The study was conducted in a group of women selected from the general population, the criteria being place of residence and cervical cancer age risk group, in consecutive periods of mass screening in the Podlaski region. The diagnostic tests performed were based on the commonly used algorithm and on identical laboratory and methodological conditions. The assessment revealed comparable diagnostic accuracy of both classifications, verified by histological examination, although with markedly higher specificity for dysplastic lesions, a decreased number of HSIL results, and increased diagnosis of LSILs. A higher number of colposcopies and biopsies performed was an additional consequence of the TBS classification. Results based on the Bethesda System made it possible to find the sources and causes of abnormalities with much greater precision, which enabled treatment of the causative agent. Although the two evaluated cytology classification systems did not differ greatly, the comparison showed the higher potential of TBS and better, more effective communication between the cytology laboratory and the gynecologist, making implementation of the Bethesda System in daily cytology screening work reasonable.
Machine Learning for Biological Trajectory Classification Applications
NASA Technical Reports Server (NTRS)
Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros
2002-01-01
Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms are compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.
ERIC Educational Resources Information Center
Nedwek, Brian P.; Neal, John E.
This study developed a classification scheme to critically compare performance assessment projects at higher education universities in North America and Europe. Performance indicators and assessment initiatives were compared using nine basic dimensions: (1) locus of control, (2) degree of governmental involvement, (3) focus of performance…
Automated Decision Tree Classification of Corneal Shape
Twa, Michael D.; Parthasarathy, Srinivasan; Roberts, Cynthia; Mahmoud, Ashraf M.; Raasch, Thomas W.; Bullimore, Mark A.
2011-01-01
Purpose The volume and complexity of data produced during videokeratography examinations present a challenge of interpretation. As a consequence, results are often analyzed qualitatively by subjective pattern recognition or reduced to comparisons of summary indices. We describe the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way. We then compared this method with other known classification methods. Methods The corneal surface was modeled with a seventh-order Zernike polynomial for 132 normal eyes of 92 subjects and 112 eyes of 71 subjects diagnosed with keratoconus. A decision tree classifier was induced using the C4.5 algorithm, and its classification performance was compared with the modified Rabinowitz–McDonnell index, Schwiegerling’s Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index using recommended classification thresholds for each method. We also evaluated the area under the receiver operator characteristic (ROC) curve for each classification method. Results Our decision tree classifier performed equal to or better than the other classifiers tested: accuracy was 92% and the area under the ROC curve was 0.97. Our decision tree classifier reduced the information needed to distinguish between normal and keratoconus eyes using four of 36 Zernike polynomial coefficients. The four surface features selected as classification attributes by the decision tree method were inferior elevation, greater sagittal depth, oblique toricity, and trefoil. Conclusions Automated decision tree classification of corneal shape through Zernike polynomials is an accurate quantitative method of classification that is interpretable and can be generated from any instrument platform capable of raw elevation data output. This method of pattern classification is extendable to other classification problems. PMID:16357645
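A compact sketch of the induction-and-evaluation step is given below, with scikit-learn's CART implementation standing in for C4.5; the Zernike coefficient matrix `zernike` (n_eyes x 36) and label vector `y` are assumed to be precomputed, and the hyperparameters are illustrative.

```python
# Sketch: decision-tree classification of corneal shape from Zernike coefficients.
# `zernike` is an (n_eyes, 36) array of 7th-order coefficients and `y` holds
# 0 = normal, 1 = keratoconus; scikit-learn's CART stands in for C4.5.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, roc_auc_score

tree = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5, random_state=0)
proba = cross_val_predict(tree, zernike, y, cv=10, method='predict_proba')[:, 1]
print('accuracy:', accuracy_score(y, proba > 0.5))
print('ROC AUC :', roc_auc_score(y, proba))

# Inspect which few coefficients carry most of the discriminative information.
tree.fit(zernike, y)
top = np.argsort(tree.feature_importances_)[::-1][:4]
print('most informative Zernike terms (column indices):', top)
```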
Chiarelli, Antonio Maria; Croce, Pierpaolo; Merla, Arcangelo; Zappasodi, Filippo
2018-06-01
Brain-computer interface (BCI) refers to procedures that link the central nervous system to a device. BCI was historically performed using electroencephalography (EEG). In recent years, encouraging results were obtained by combining EEG with other neuroimaging technologies, such as functional near infrared spectroscopy (fNIRS). A crucial step of BCI is brain state classification from recorded signal features. Deep artificial neural networks (DNNs) recently reached unprecedented complex classification outcomes. These performances were achieved through increased computational power, efficient learning algorithms, valuable activation functions, and restricted or back-fed neuron connections. Expecting significant overall BCI performance, we investigated the capabilities of combining EEG and fNIRS recordings with state-of-the-art deep learning procedures. We performed a guided left and right hand motor imagery task on 15 subjects with a fixed classification response time of 1 s and overall experiment length of 10 min. Left versus right classification accuracy of a DNN in the multi-modal recording modality was estimated and compared to standalone EEG and fNIRS and to other classifiers. At the group level we obtained a significant increase in performance when considering multi-modal recordings and the DNN classifier, with a synergistic effect. BCI performance can be significantly improved by employing multi-modal recordings that provide electrical and hemodynamic brain activity information, in combination with advanced non-linear deep learning classification procedures.
NASA Astrophysics Data System (ADS)
Chiarelli, Antonio Maria; Croce, Pierpaolo; Merla, Arcangelo; Zappasodi, Filippo
2018-06-01
Objective. Brain–computer interface (BCI) refers to procedures that link the central nervous system to a device. BCI was historically performed using electroencephalography (EEG). In recent years, encouraging results were obtained by combining EEG with other neuroimaging technologies, such as functional near infrared spectroscopy (fNIRS). A crucial step of BCI is brain state classification from recorded signal features. Deep artificial neural networks (DNNs) recently reached unprecedented complex classification outcomes. These performances were achieved through increased computational power, efficient learning algorithms, valuable activation functions, and restricted or back-fed neuron connections. Expecting significant overall BCI performance, we investigated the capabilities of combining EEG and fNIRS recordings with state-of-the-art deep learning procedures. Approach. We performed a guided left and right hand motor imagery task on 15 subjects with a fixed classification response time of 1 s and overall experiment length of 10 min. Left versus right classification accuracy of a DNN in the multi-modal recording modality was estimated and compared to standalone EEG and fNIRS and to other classifiers. Main results. At the group level we obtained a significant increase in performance when considering multi-modal recordings and the DNN classifier, with a synergistic effect. Significance. BCI performance can be significantly improved by employing multi-modal recordings that provide electrical and hemodynamic brain activity information, in combination with advanced non-linear deep learning classification procedures.
Zhang, Jianhua; Li, Sunan; Wang, Rubin
2017-01-01
In this paper, we deal with the Mental Workload (MWL) classification problem based on measured physiological data. First, we discussed the optimal depth (i.e., the number of hidden layers) and parameter optimization algorithms for Convolutional Neural Networks (CNN). The base CNNs designed were tested according to five classification performance indices, namely Accuracy, Precision, F-measure, G-mean, and required training time. Then we developed an Ensemble Convolutional Neural Network (ECNN) to enhance the accuracy and robustness of the individual CNN model. For the ECNN design, three model aggregation approaches (weighted averaging, majority voting and stacking) were examined and a resampling strategy was used to enhance the diversity of the individual CNN models. The results of the MWL classification performance comparison indicated that the proposed ECNN framework can effectively improve MWL classification performance and features entirely automatic feature extraction and MWL classification, compared with traditional machine learning methods.
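The three aggregation schemes mentioned above can be sketched independently of the base CNNs. In the hypothetical snippet below, `val_probs` and `test_probs` hold each trained member's class-probability outputs and `val_y` the validation labels; none of this reflects the authors' exact implementation.

```python
# Sketch: aggregating an ensemble of base classifiers by weighted averaging,
# majority voting, and stacking. `val_probs`/`test_probs` have shape
# (n_members, n_samples, n_classes); `val_y` are validation labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_average(probs, weights):
    w = np.asarray(weights, dtype=float)[:, None, None]
    return (w * probs).sum(axis=0).argmax(axis=1)

def majority_vote(probs):
    votes = probs.argmax(axis=2)                         # (n_members, n_samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

def stacking(val_probs, val_y, test_probs):
    # Meta-learner trained on the members' validation outputs.
    stack_train = np.concatenate(list(val_probs), axis=1)
    stack_test = np.concatenate(list(test_probs), axis=1)
    meta = LogisticRegression(max_iter=1000).fit(stack_train, val_y)
    return meta.predict(stack_test)

# Example: weight each member by its validation accuracy.
acc = [(p.argmax(axis=1) == val_y).mean() for p in val_probs]
pred_avg = weighted_average(test_probs, acc)
pred_vote = majority_vote(test_probs)
pred_stack = stacking(val_probs, val_y, test_probs)
```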
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classify patients according to the presence or absence of a disease or health outcome. A drawback of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements in sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than the gains in performance observed in the data mining literature. PMID:22254181
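A hedged illustration of the comparison, using scikit-learn's AdaBoost (one standard boosting algorithm) with shallow classification trees; `X` and `y` are placeholder covariate and outcome arrays, not the registry data used in the study.

```python
# Sketch: comparing a single classification tree with boosted trees for a
# binary 30-day mortality outcome. `X` holds patient covariates, `y` the
# outcome (1 = died); parameters are illustrative.
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

def sens_spec(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return tp / (tp + fn), tn / (tn + fp)

single = DecisionTreeClassifier(max_depth=3)
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=200)

for name, model in [('single tree', single), ('boosted trees', boosted)]:
    pred = cross_val_predict(model, X, y, cv=10)
    sens, spec = sens_spec(y, pred)
    print(f'{name}: sensitivity={sens:.3f} specificity={spec:.3f}')
```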
Evaluating data mining algorithms using molecular dynamics trajectories.
Tatsis, Vasileios A; Tjortjis, Christos; Tzirakis, Panagiotis
2013-01-01
Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the μs time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations of a potential enzyme-mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using classification errors to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.
Al Ajmi, Eiman; Forghani, Behzad; Reinhold, Caroline; Bayat, Maryam; Forghani, Reza
2018-06-01
There is a rich amount of quantitative information in spectral datasets generated from dual-energy CT (DECT). In this study, we compare the performance of texture analysis performed on multi-energy datasets to that of virtual monochromatic images (VMIs) at 65 keV only, using classification of the two most common benign parotid neoplasms as a testing paradigm. Forty-two patients with pathologically proven Warthin tumour (n = 25) or pleomorphic adenoma (n = 17) were evaluated. Texture analysis was performed on VMIs ranging from 40 to 140 keV in 5-keV increments (multi-energy analysis) or 65-keV VMIs only, which is typically considered equivalent to single-energy CT. Random forest (RF) models were constructed for outcome prediction using separate randomly selected training and testing sets or the entire patient set. Using multi-energy texture analysis, tumour classification in the independent testing set had accuracy, sensitivity, specificity, positive predictive value, and negative predictive value of 92%, 86%, 100%, 100%, and 83%, compared to 75%, 57%, 100%, 100%, and 63%, respectively, for single-energy analysis. Multi-energy texture analysis demonstrates superior performance compared to single-energy texture analysis of VMIs at 65 keV for classification of benign parotid tumours. • We present and validate a paradigm for texture analysis of DECT scans. • Multi-energy dataset texture analysis is superior to single-energy dataset texture analysis. • DECT texture analysis has high accuracy for diagnosis of benign parotid tumours. • DECT texture analysis with machine learning can enhance non-invasive diagnostic tumour evaluation.
Kumar, Shiu; Mamun, Kabir; Sharma, Alok
2017-12-01
Classification of electroencephalography (EEG) signals for motor imagery based brain computer interface (MI-BCI) is an exigent task and common spatial pattern (CSP) has been extensively explored for this purpose. In this work, we focused on developing a new framework for classification of EEG signals for MI-BCI. We propose a single band CSP framework for MI-BCI that utilizes the concept of tangent space mapping (TSM) in the manifold of covariance matrices. The proposed method is named CSP-TSM. Spatial filtering is performed on the bandpass filtered MI EEG signal. Riemannian tangent space is utilized for extracting features from the spatially filtered signal. The TSM features are then fused with the CSP variance based features and feature selection is performed using Lasso. Linear discriminant analysis (LDA) is then applied to the selected features and finally classification is done using a support vector machine (SVM) classifier. The proposed framework gives improved performance for MI EEG signal classification in comparison with several competing methods. Experiments conducted show that the proposed framework reduces the overall classification error rate for MI-BCI by 3.16%, 5.10% and 1.70% (for BCI Competition III dataset IVa, BCI Competition IV Dataset I and BCI Competition IV Dataset IIb, respectively) compared to the conventional CSP method under the same experimental settings. The proposed CSP-TSM method produces promising results when compared with several competing methods in this paper. In addition, the computational complexity is lower than that of the TSM method. Our proposed CSP-TSM framework can potentially be used for developing improved MI-BCI systems.
Semi-supervised vibration-based classification and condition monitoring of compressors
NASA Astrophysics Data System (ADS)
Potočnik, Primož; Govekar, Edvard
2017-09-01
Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.
Ecosystem classifications based on summer and winter conditions.
Andrew, Margaret E; Nelson, Trisalyn A; Wulder, Michael A; Hobart, George W; Coops, Nicholas C; Farmer, Carson J Q
2013-04-01
Ecosystem classifications map an area into relatively homogenous units for environmental research, monitoring, and management. However, their effectiveness is rarely tested. Here, three classifications are (1) defined and characterized for Canada along summertime productivity (moderate-resolution imaging spectrometer fraction of absorbed photosynthetically active radiation) and wintertime snow conditions (special sensor microwave/imager snow water equivalent), independently and in combination, and (2) comparatively evaluated to determine the ability of each classification to represent the spatial and environmental patterns of alternative schemes, including the Canadian ecozone framework. All classifications depicted similar patterns across Canada, but detailed class distributions differed. Class spatial characteristics varied with environmental conditions within classifications, but were comparable between classifications. There was moderate correspondence between classifications. The strongest association was between productivity classes and ecozones. The classification along both productivity and snow balanced these two sets of variables, yielding intermediate levels of association in all pairwise comparisons. Despite relatively low spatial agreement between classifications, they successfully captured patterns of the environmental conditions underlying alternate schemes (e.g., snow classes explained variation in productivity and vice versa). The performance of ecosystem classifications and the relevance of their input variables depend on the environmental patterns and processes used for applications and evaluation. Productivity or snow regimes, as constructed here, may be desirable when summarizing patterns controlled by summer- or wintertime conditions, respectively, or of climate change responses. General purpose ecosystem classifications should include both sets of drivers. Classifications should be carefully, quantitatively, and comparatively evaluated relative to a particular application prior to their implementation as monitoring and assessment frameworks.
A machine learning approach for viral genome classification.
Remita, Mohamed Amine; Halioui, Ahmed; Malick Diouara, Abou Abdallah; Daigle, Bruno; Kiani, Golrokh; Diallo, Abdoulaye Baniré
2017-04-11
Advances in cloning and sequencing technology are yielding a massive number of viral genomes. The classification and annotation of these genomes constitute important assets in the discovery of genomic variability, taxonomic characteristics and disease mechanisms. Existing classification methods are often designed for a specific, well-studied family of viruses. Thus, viral comparative genomic studies could benefit from more generic, fast and accurate tools for classifying and typing newly sequenced strains of diverse virus families. Here, we introduce a virus classification platform, CASTOR, based on machine learning methods. CASTOR is inspired by a well-known technique in molecular biology: restriction fragment length polymorphism (RFLP). It simulates, in silico, the restriction digestion of genomic material by different enzymes into fragments. It uses two metrics to construct feature vectors for machine learning algorithms in the classification step. We benchmark CASTOR for the classification of distinct datasets of human papillomaviruses (HPV), hepatitis B viruses (HBV) and human immunodeficiency viruses type 1 (HIV-1). Results reveal true positive rates of 99%, 99% and 98% for HPV Alpha species, HBV genotyping and HIV-1 M subtyping, respectively. Furthermore, CASTOR shows competitive performance compared to well-known HIV-1 specific classifiers (REGA and COMET) on whole genomes and pol fragments. The performance of CASTOR, its genericity and robustness could permit novel and accurate large-scale virus studies. The CASTOR web platform provides open-access, collaborative and reproducible machine learning classifiers. CASTOR can be accessed at http://castor.bioinfo.uqam.ca .
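The RFLP-inspired feature construction can be approximated in a few lines. The snippet below is a simplified sketch, not CASTOR itself: the recognition sites, the fragment-length binning, and the random forest classifier are illustrative choices, and `train_genomes`, `train_types`, `test_genomes` are assumed inputs.

```python
# Sketch of the idea: simulate restriction digestion in silico and use
# fragment-length statistics as features for a classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

ENZYMES = {'EcoRI': 'GAATTC', 'HindIII': 'AAGCTT', 'BamHI': 'GGATCC'}

def digest(genome, site):
    """Return fragment lengths produced by cutting at every occurrence of `site`."""
    cuts, start = [], 0
    while True:
        pos = genome.find(site, start)
        if pos == -1:
            break
        cuts.append(pos)
        start = pos + 1
    bounds = [0] + cuts + [len(genome)]
    return np.diff(bounds)

def features(genome, n_bins=10, max_len=5000):
    """Histogram of fragment lengths per enzyme, concatenated into one vector."""
    vec = []
    for site in ENZYMES.values():
        frags = digest(genome.upper(), site)
        hist, _ = np.histogram(frags, bins=n_bins, range=(0, max_len))
        vec.extend(hist / max(len(frags), 1))
    return np.array(vec)

# `train_genomes`, `train_types`, `test_genomes` are assumed to be provided.
X_train = np.array([features(g) for g in train_genomes])
X_test = np.array([features(g) for g in test_genomes])
clf = RandomForestClassifier(n_estimators=300).fit(X_train, train_types)
pred = clf.predict(X_test)
```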
Prahm, Cosima; Eckstein, Korbinian; Ortiz-Catalan, Max; Dorffner, Georg; Kaniusas, Eugenijus; Aszmann, Oskar C
2016-08-31
Controlling a myoelectric prosthesis for upper limbs is increasingly challenging for the user as more electrodes and joints become available. Motion classification based on pattern recognition with a multi-electrode array allows multiple joints to be controlled simultaneously. Previous pattern recognition studies are difficult to compare, because individual research groups use their own data sets. To resolve this shortcoming and to facilitate comparisons, open access data sets were analysed using components of BioPatRec and Netlab pattern recognition models. Performances of the artificial neural networks, linear models, and training program components were compared. Evaluation took place within the BioPatRec environment, a Matlab-based open source platform that provides feature extraction, processing and motion classification algorithms for prosthetic control. The algorithms were applied to myoelectric signals for individual and simultaneous classification of movements, with the aim of finding the best performing algorithm and network model. Evaluation criteria included classification accuracy and training time. Results in both the linear and the artificial neural network models demonstrated that Netlab's implementation using scaled conjugate training algorithm reached significantly higher accuracies than BioPatRec. It is concluded that the best movement classification performance would be achieved through integrating Netlab training algorithms in the BioPatRec environment so that future prosthesis training can be shortened and control made more reliable. Netlab was therefore included into the newest release of BioPatRec (v4.0).
Classification of stillbirths is an ongoing dilemma.
Nappi, Luigi; Trezza, Federica; Bufo, Pantaleo; Riezzo, Irene; Turillazzi, Emanuela; Borghi, Chiara; Bonaccorsi, Gloria; Scutiero, Gennaro; Fineschi, Vittorio; Greco, Pantaleo
2016-10-01
To compare different classification systems in a cohort of stillbirths undergoing a comprehensive workup, and to establish whether a particular classification system is the most suitable and useful for determining the cause of death, yielding the lowest percentage of unexplained deaths. Cases of stillbirth at gestational age 22-41 weeks occurring at the Department of Gynecology and Obstetrics of Foggia University during a 4 year period were collected. The World Health Organization (WHO) definition of stillbirth was used. All data collection was based on the recommendations of an Italian diagnostic workup for stillbirth. Two expert obstetricians reviewed all cases and classified causes according to five classification systems. The Relevant Condition at Death (ReCoDe) and Causes Of Death and Associated Conditions (CODAC) classification systems performed best in retaining information. The ReCoDe system provided the lowest rate of unexplained stillbirth (14%) compared to de Galan-Roosen (16%), CODAC (16%), Tulip (18%) and Wigglesworth (62%). Classification of stillbirth is influenced by the multiplicity of possible causes and factors related to fetal death. Fetal autopsy, placental histology and cytogenetic analysis are strongly recommended to obtain a complete diagnostic evaluation. The commonly employed classification systems performed differently in our experience, the most satisfactory being ReCoDe. Given the rate of "unexplained" cases, none can be considered optimal and further efforts are necessary to work out a clinically useful system.
Zhang, Junming; Wu, Yan
2018-03-28
Many systems have been developed for automatic sleep stage classification, but nearly all of them are based on handcrafted features. Because the feature space is large, feature selection is usually required. Meanwhile, designing handcrafted features is a difficult and time-consuming task because it requires the domain knowledge of experienced experts. Results vary when different sets of features are chosen to identify sleep stages. Additionally, there may be important features for sleep stage classification of which we are unaware. Therefore, a new sleep stage classification system, based on the complex-valued convolutional neural network (CCNN), is proposed in this study. Unlike existing sleep stage methods, our method can automatically extract features from raw electroencephalography data and then classify the sleep stage based on the learned features. Additionally, we also prove that the decision boundaries for the real and imaginary parts of a complex-valued convolutional neuron intersect orthogonally. The classification performance of handcrafted features is compared with that of features learned via the CCNN. Experimental results show that the proposed method is comparable to existing methods. The CCNN obtains better classification performance and a considerably faster convergence speed than a convolutional neural network. Experimental results also show that the proposed method is a useful decision-support tool for automatic sleep stage classification.
NASA Astrophysics Data System (ADS)
Sasaki, Kenya; Mitani, Yoshihiro; Fujita, Yusuke; Hamamoto, Yoshihiko; Sakaida, Isao
2017-02-01
In this paper, we propose using higher-order local autocorrelation (HLAC) features to classify liver cirrhosis in regions of interest (ROIs) extracted from B-mode ultrasound images. In a previous study, we tried to classify liver cirrhosis using a Gabor filter based approach. However, our preliminary experimental results showed that the classification performance of the Gabor feature was poor. To classify liver cirrhosis accurately, we therefore examined the use of HLAC features. The experimental results show the effectiveness of HLAC features compared with the Gabor feature. Furthermore, when a binary image produced by an adaptive thresholding method is used, the classification performance of the HLAC features improves further.
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes.
Yates, Katherine L; Mellin, Camille; Caley, M Julian; Radford, Ben T; Meeuwig, Jessica J
2016-01-01
Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability.
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes
Yates, Katherine L.; Mellin, Camille; Caley, M. Julian; Radford, Ben T.; Meeuwig, Jessica J.
2016-01-01
Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability. PMID:27333202
Feature selection and classification of multiparametric medical images using bagging and SVM
NASA Astrophysics Data System (ADS)
Fan, Yong; Resnick, Susan M.; Davatzikos, Christos
2008-03-01
This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.
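A simplified sketch of the bagging idea follows: base SVMs are trained on bootstrap samples and the regularization parameter is chosen by the AUC estimated on the left-out (out-of-bag) samples. The arrays `X`, `y`, `X_new` and the parameter grid are assumptions, not the authors' exact procedure.

```python
# Sketch: bagged SVM base classifiers with the SVM parameter chosen by the AUC
# estimated on out-of-bag bootstrap samples. `X`, `y` are numpy arrays of
# selected regional features and binary labels; `X_new` is unseen data.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, n_boot = len(y), 25

def oob_auc(C):
    """Mean AUC of bootstrap-trained SVMs on their own out-of-bag samples."""
    scores = []
    for _ in range(n_boot):
        boot = rng.integers(0, n, n)                 # bootstrap indices
        oob = np.setdiff1d(np.arange(n), boot)       # left-out samples
        if len(np.unique(y[oob])) < 2:
            continue
        clf = SVC(C=C, kernel='rbf', probability=True).fit(X[boot], y[boot])
        scores.append(roc_auc_score(y[oob], clf.predict_proba(X[oob])[:, 1]))
    return np.mean(scores)

best_C = max([0.1, 1.0, 10.0], key=oob_auc)

# Final bagged ensemble: average the probability outputs of the base SVMs.
ensemble = []
for _ in range(n_boot):
    boot = rng.integers(0, n, n)
    ensemble.append(SVC(C=best_C, kernel='rbf', probability=True).fit(X[boot], y[boot]))
prob_new = np.mean([m.predict_proba(X_new)[:, 1] for m in ensemble], axis=0)
```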
Climate Classification is an Important Factor in Assessing Hospital Performance Metrics
NASA Astrophysics Data System (ADS)
Boland, M. R.; Parhi, P.; Gentine, P.; Tatonetti, N. P.
2017-12-01
Context/Purpose: Climate is a known modulator of disease, but its impact on hospital performance metrics remains unstudied. Methods: We assess the relationship between Köppen-Geiger climate classification and hospital performance metrics, specifically 30-day mortality as reported in Hospital Compare, collected for the period July 2013 through June 2014 (7/1/2013 - 06/30/2014). A hospital-level multivariate linear regression analysis was performed while controlling for known socioeconomic factors to explore the relationship between all-cause mortality and climate. Hospital performance scores were obtained from 4,524 hospitals belonging to 15 distinct Köppen-Geiger climates and 2,373 unique counties. Results: Model results revealed that hospital performance metrics for mortality showed significant climate dependence (p<0.001) after adjusting for socioeconomic factors. Interpretation: Currently, hospitals are reimbursed by governmental agencies using 30-day mortality rates along with 30-day readmission rates. These metrics allow government agencies to rank hospitals according to their 'performance'. Various socioeconomic factors are taken into consideration when determining an individual hospital's performance; however, no climate-based adjustment is made within the existing framework. Our results indicate that climate-based variability in 30-day mortality rates exists even after adjustment for socioeconomic confounders. Incorporating standardized high-level climate classification systems (such as Köppen-Geiger) into future metrics would be useful. Conclusion: Climate is a significant factor in evaluating hospital 30-day mortality rates. These results demonstrate that climate classification is an important factor when comparing hospital performance across the United States.
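The hospital-level adjusted regression can be sketched as follows; the dataframe and its column names (`mortality_30d`, `koppen_class`, `median_income`, `poverty_rate`) are hypothetical placeholders for the Hospital Compare and socioeconomic variables actually used.

```python
# Sketch: hospital-level linear regression of 30-day mortality on climate class
# while adjusting for socioeconomic covariates. `hospitals` is an assumed
# pandas DataFrame with one row per hospital; column names are placeholders.
import statsmodels.formula.api as smf

model = smf.ols(
    'mortality_30d ~ C(koppen_class) + median_income + poverty_rate',
    data=hospitals,
).fit()
print(model.summary())   # the C(koppen_class) terms test the adjusted climate effect
```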
Dynamic classification of fetal heart rates by hierarchical Dirichlet process mixture models.
Yu, Kezi; Quirk, J Gerald; Djurić, Petar M
2017-01-01
In this paper, we propose an application of non-parametric Bayesian (NPB) models for classification of fetal heart rate (FHR) recordings. More specifically, we propose models that are used to differentiate between FHR recordings that are from fetuses with or without adverse outcomes. In our work, we rely on models based on hierarchical Dirichlet processes (HDP) and the Chinese restaurant process with finite capacity (CRFC). Two mixture models were inferred from real recordings, one representing healthy and the other non-healthy fetuses. The models were then used to classify new recordings and provide the probability of the fetus being healthy. First, we compared the classification performance of the HDP models with that of support vector machines on real data and concluded that the HDP models achieved better performance. Then we demonstrated the use of mixture models based on CRFC for dynamic classification of FHR recordings in a real-time setting.
Dynamic classification of fetal heart rates by hierarchical Dirichlet process mixture models
Yu, Kezi; Quirk, J. Gerald
2017-01-01
In this paper, we propose an application of non-parametric Bayesian (NPB) models for classification of fetal heart rate (FHR) recordings. More specifically, we propose models that are used to differentiate between FHR recordings that are from fetuses with or without adverse outcomes. In our work, we rely on models based on hierarchical Dirichlet processes (HDP) and the Chinese restaurant process with finite capacity (CRFC). Two mixture models were inferred from real recordings, one representing healthy and the other non-healthy fetuses. The models were then used to classify new recordings and provide the probability of the fetus being healthy. First, we compared the classification performance of the HDP models with that of support vector machines on real data and concluded that the HDP models achieved better performance. Then we demonstrated the use of mixture models based on CRFC for dynamic classification of FHR recordings in a real-time setting. PMID:28953927
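The classify-by-likelihood idea can be sketched with scikit-learn's truncated Dirichlet-process Gaussian mixture standing in for the HDP/CRFC models; feature extraction from the FHR traces is assumed to have been done already, and `feat_healthy`, `feat_adverse`, `feat_new` are illustrative names.

```python
# Sketch: fit one Dirichlet-process mixture per outcome group and classify a new
# recording by comparing likelihoods. The feature matrices are assumed inputs.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

def fit_dp_mixture(features, max_components=20):
    return BayesianGaussianMixture(
        n_components=max_components,
        weight_concentration_prior_type='dirichlet_process',
        covariance_type='full', max_iter=500, random_state=0,
    ).fit(features)

m_healthy = fit_dp_mixture(feat_healthy)
m_adverse = fit_dp_mixture(feat_adverse)

# Average log-likelihood of the new recording's feature vectors under each model,
# turned into a (pseudo-)probability of the fetus being healthy.
ll_h = m_healthy.score_samples(feat_new).mean()
ll_a = m_adverse.score_samples(feat_new).mean()
p_healthy = 1.0 / (1.0 + np.exp(ll_a - ll_h))
```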
Protein Sequence Classification with Improved Extreme Learning Machine Algorithms
2014-01-01
Precisely classifying a protein sequence from a large database of biological protein sequences plays an important role in developing competitive pharmacological products. Conventional methods, which compare the unseen sequence with all the identified protein sequences and return the category index of the protein with the highest similarity score, are usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single-hidden-layer feedforward networks (SLFNs). The recent, efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimally pruned ELM (OP-ELM) is first employed for protein sequence classification in this paper. To further enhance performance, an ensemble-based SLFN structure is constructed in which multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensemble members. For each ensemble member, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms. PMID:24795876
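Because the abstract builds on SLFNs trained as extreme learning machines, a bare-bones ELM (random hidden layer, least-squares output weights) plus a majority-voting ensemble can be sketched as follows; the protein-sequence feature encoding and all parameters are assumptions, not the paper's implementation.

```python
# Sketch: a basic extreme learning machine and a majority-voting ensemble of ELMs.
# `X_train`, `y_train`, `X_test` are assumed numeric feature matrices / labels
# (e.g. n-gram composition vectors of the protein sequences); labels are assumed
# to be encoded as small non-negative integers.
import numpy as np

class BasicELM:
    def __init__(self, n_hidden=200, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def fit(self, X, y):
        self.classes_, y_idx = np.unique(y, return_inverse=True)
        T = np.eye(len(self.classes_))[y_idx]                 # one-hot targets
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)                      # random hidden layer
        self.beta = np.linalg.pinv(H) @ T                     # least-squares output
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return self.classes_[(H @ self.beta).argmax(axis=1)]

# Ensemble of ELMs differing only in their random hidden weights.
members = [BasicELM(n_hidden=200, seed=s).fit(X_train, y_train) for s in range(9)]
votes = np.array([m.predict(X_test) for m in members])
pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```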
Figueroa, Rosa L; Flores, Christopher A
2016-08-01
Obesity is a chronic disease with an increasing impact on the world's population. In this work, we present a method of identifying obesity automatically using text mining techniques and information related to body weight measures and obesity comorbidities. We used a dataset of 3015 de-identified medical records that contain labels for two classification problems. The first classification problem distinguishes between obesity, overweight, normal weight, and underweight. The second classification problem differentiates between obesity types: super obesity, morbid obesity, severe obesity and moderate obesity. We used a Bag of Words approach to represent the records together with unigram and bigram representations of the features. We implemented two approaches: a hierarchical method and a nonhierarchical one. We used Support Vector Machine and Naïve Bayes together with ten-fold cross validation to evaluate and compare performances. Our results indicate that the hierarchical approach does not work as well as the nonhierarchical one. In general, our results show that Support Vector Machine obtains better performances than Naïve Bayes for both classification problems. We also observed that bigram representation improves performance compared with unigram representation.
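A minimal sketch of the non-hierarchical bag-of-words setup, assuming `texts` and `labels` are the de-identified records and their classes; the vectorizer settings and the macro-F1 metric are illustrative choices rather than the authors' configuration.

```python
# Sketch: bag-of-words (unigrams + bigrams) classification of obesity status from
# medical-record text, comparing SVM and Naive Bayes with 10-fold cross-validation.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

for name, clf in [('Linear SVM', LinearSVC()), ('Naive Bayes', MultinomialNB())]:
    pipe = make_pipeline(CountVectorizer(ngram_range=(1, 2), min_df=2), clf)
    scores = cross_val_score(pipe, texts, labels, cv=10, scoring='f1_macro')
    print(f'{name}: macro-F1 = {scores.mean():.3f}')
```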
Deshpande, Gopikrishna; Wang, Peng; Rangaprakash, D; Wilamowski, Bogdan
2015-12-01
Automated recognition and classification of brain diseases are of tremendous value to society. Attention deficit hyperactivity disorder (ADHD) is a diverse spectrum disorder whose diagnosis is based on behavior and hence will benefit from classification utilizing objective neuroimaging measures. Toward this end, an international competition was conducted for classifying ADHD using functional magnetic resonance imaging data acquired from multiple sites worldwide. Here, we consider the data from this competition as an example to illustrate the utility of fully connected cascade (FCC) artificial neural network (ANN) architecture for performing classification. We employed various directional and nondirectional brain connectivity-based methods to extract discriminative features which gave better classification accuracy compared to raw data. Our accuracy for distinguishing ADHD from healthy subjects was close to 90% and between the ADHD subtypes was close to 95%. Further, we show that, if properly used, FCC ANN performs very well compared to other classifiers such as support vector machines in terms of accuracy, irrespective of the feature used. Finally, the most discriminative connectivity features provided insights about the pathophysiology of ADHD and showed reduced and altered connectivity involving the left orbitofrontal cortex and various cerebellar regions in ADHD.
Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...
ECG signal analysis through hidden Markov models.
Andreão, Rodrigo V; Dorizzi, Bernadette; Boudy, Jérôme
2006-08-01
This paper presents an original hidden Markov model (HMM) approach for online beat segmentation and classification of electrocardiograms. The HMM framework was chosen for its ability to perform beat detection, segmentation and classification, which makes it highly suitable for the electrocardiogram (ECG) problem. Our approach addresses a broad range of topics, some of them never studied before in other HMM-related work: waveform modeling, multichannel beat segmentation and classification, and unsupervised adaptation to the patient's ECG. The performance was evaluated on the two-channel QT database in terms of waveform segmentation precision, beat detection and classification. Our waveform segmentation results compare favorably to other systems in the literature. We also obtained high beat detection performance, with a sensitivity of 99.79% and a positive predictivity of 99.96%, using a test set of 59 recordings. Moreover, premature ventricular contraction beats were detected using an original classification strategy. The results obtained validate our approach for real world application.
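A toy sketch of HMM-based ECG segmentation is shown below, with hmmlearn's Gaussian HMM standing in for the paper's framework; the single-channel array `ecg`, the four-state topology, and the crude QRS heuristic are all assumptions.

```python
# Sketch: an HMM over ECG waveform states (e.g. P wave, QRS, T wave, baseline).
# `ecg` is assumed to be a 1-D numpy array of samples from one channel.
import numpy as np
from hmmlearn import hmm

obs = ecg.reshape(-1, 1)                        # observations: amplitude only
model = hmm.GaussianHMM(n_components=4, covariance_type='diag', n_iter=50)
model.fit(obs)                                  # unsupervised Baum-Welch training
states = model.predict(obs)                     # Viterbi path = per-sample segmentation

# Crude beat detector: count entries into the state with the largest variance,
# which in practice tends to capture the high-energy QRS complex.
qrs_state = np.argmax(model.covars_.reshape(model.n_components, -1).max(axis=1))
beats = np.flatnonzero((states[1:] == qrs_state) & (states[:-1] != qrs_state)) + 1
```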
Poirazi, Panayiota; Neocleous, Costas; Pattichis, Costantinos S; Schizas, Christos N
2004-05-01
A three-layer neural network (NN) with novel adaptive architecture has been developed. The hidden layer of the network consists of slabs of single neuron models, where neurons within a slab (but not between slabs) have the same type of activation function. The network activation functions in all three layers have adaptable parameters. The network was trained using a biologically inspired, guided-annealing learning rule on a variety of medical data. Good training/testing classification performance was obtained on all data sets tested. The performance achieved was comparable to that of SVM classifiers. It was shown that the adaptive network architecture, inspired from the modular organization often encountered in the mammalian cerebral cortex, can benefit classification performance.
Koenecke, Christian; Göhring, Gudrun; de Wreede, Liesbeth C.; van Biezen, Anja; Scheid, Christof; Volin, Liisa; Maertens, Johan; Finke, Jürgen; Schaap, Nicolaas; Robin, Marie; Passweg, Jakob; Cornelissen, Jan; Beelen, Dietrich; Heuser, Michael; de Witte, Theo; Kröger, Nicolaus
2015-01-01
The aim of this study was to determine the impact of the revised 5-group International Prognostic Scoring System cytogenetic classification on outcome after allogeneic stem cell transplantation in patients with myelodysplastic syndromes or secondary acute myeloid leukemia who were reported to the European Society for Blood and Marrow Transplantation database. A total of 903 patients had sufficient cytogenetic information available at stem cell transplantation to be classified according to the 5-group classification. Poor and very poor risk according to this classification was an independent predictor of shorter relapse-free survival (hazard ratio 1.40 and 2.14), overall survival (hazard ratio 1.38 and 2.14), and significantly higher cumulative incidence of relapse (hazard ratio 1.64 and 2.76), compared to patients with very good, good or intermediate risk. When comparing the predictive performance of a series of Cox models both for relapse-free survival and for overall survival, a model with simplified 5-group cytogenetics (merging very good, good and intermediate cytogenetics) performed best. Furthermore, monosomal karyotype is an additional negative predictor for outcome within patients of the poor, but not the very poor risk group of the 5-group classification. The revised International Prognostic Scoring System cytogenetic classification allows patients with myelodysplastic syndromes to be separated into three groups with clearly different outcomes after stem cell transplantation. Poor and very poor risk cytogenetics were strong predictors of poor patient outcome. The new cytogenetic classification added value to prediction of patient outcome compared to prediction models using only traditional risk factors or the 3-group International Prognostic Scoring System cytogenetic classification. PMID:25552702
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
NASA Astrophysics Data System (ADS)
Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude
2010-02-01
Diabetic retinopathy (DR) is a chronic eye disease for which early detection is essential to avoid severe outcomes. Image processing of retinal images has emerged as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification, which is a significant technique for detecting abnormality in the eye. Various automated classification systems have been developed in recent years, but most of them lack high classification accuracy. Artificial neural networks are the widely preferred artificial intelligence technique since they yield superior results in terms of classification accuracy. In this work, a Radial Basis Function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR images from normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed against the results of a probabilistic classifier, namely the Bayesian classifier, to show the superior nature of the neural classifier. Experimental results are promising for the neural classifier in terms of the performance measures.
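A minimal RBF-network classifier in the spirit described above: k-means selects the radial basis centres, Gaussian activations form the hidden layer, and a regularized linear read-out produces the class decision; the retinal image feature arrays and all parameters are assumed, not taken from the paper.

```python
# Sketch: a simple RBF neural network for two-class retinal image classification.
# `X_train`, `y_train` (0 = normal, 1 = DR), `X_test` are assumed feature arrays.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import RidgeClassifier

class SimpleRBFNet:
    def __init__(self, n_centers=20, gamma=1.0):
        self.n_centers, self.gamma = n_centers, gamma

    def _hidden(self, X):
        d2 = ((X[:, None, :] - self.centers_[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-self.gamma * d2)              # Gaussian basis activations

    def fit(self, X, y):
        self.centers_ = KMeans(n_clusters=self.n_centers, n_init=10,
                               random_state=0).fit(X).cluster_centers_
        self.out_ = RidgeClassifier(alpha=1e-3).fit(self._hidden(X), y)
        return self

    def predict(self, X):
        return self.out_.predict(self._hidden(X))

pred = SimpleRBFNet(n_centers=20, gamma=0.5).fit(X_train, y_train).predict(X_test)
```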
Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm.
Al-Saffar, Ahmed; Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-Bared, Mohammed
2018-01-01
Sentiment analysis techniques are increasingly exploited to categorize opinion text into one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performance based on semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually assigned a score. In addition, the supervised machine learning approaches and the lexicon knowledge method are combined for Malay sentiment classification, evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In the experiments, a wide range of comparative evaluations is conducted on a Malay Reviews Corpus (MRC), demonstrating that feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors: the features, the number of features and the classification approach.
Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm
Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-bared, Mohammed
2018-01-01
Sentiment analysis techniques are increasingly exploited to categorize opinion text into one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performance based on semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually assigned a score. In addition, the supervised machine learning approaches and the lexicon knowledge method are combined for Malay sentiment classification, evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In the experiments, a wide range of comparative evaluations is conducted on a Malay Reviews Corpus (MRC), demonstrating that feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors: the features, the number of features and the classification approach. PMID:29684036
Single-accelerometer-based daily physical activity classification.
Long, Xi; Yin, Bin; Aarts, Ronald M
2009-01-01
In this study, a single tri-axial accelerometer placed on the waist was used to record acceleration data for human physical activity classification. The data collection involved 24 subjects performing daily real-life activities in a naturalistic environment without researchers' intervention. For the purpose of assessing customers' daily energy expenditure, walking, running, cycling, driving, and sports were chosen as target activities for classification. This study compared a Bayesian classification approach with a Decision Tree based approach. A Bayes classifier has the advantage of being more extensible, requiring little effort in classifier retraining and software updates upon further expansion or modification of the target activities. Principal components analysis was applied to remove the correlation among features and to reduce the feature vector dimension. Experiments using leave-one-subject-out and 10-fold cross validation protocols revealed a classification accuracy of approximately 80%, which was comparable with that obtained by a Decision Tree classifier.
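A hedged sketch of the evaluation protocol: PCA-decorrelated features feed a naive Bayes classifier and a decision tree, scored with leave-one-subject-out cross-validation; `X`, `y` and `subject` are assumed per-window feature, label and subject-id arrays rather than the study's actual dataset.

```python
# Sketch: PCA-decorrelated accelerometer features with Naive Bayes vs. a decision
# tree, evaluated with leave-one-subject-out cross-validation.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score, LeaveOneGroupOut

logo = LeaveOneGroupOut()
for name, clf in [('Naive Bayes', GaussianNB()),
                  ('Decision tree', DecisionTreeClassifier(max_depth=8))]:
    pipe = make_pipeline(StandardScaler(), PCA(n_components=0.95), clf)
    acc = cross_val_score(pipe, X, y, groups=subject, cv=logo)
    print(f'{name}: leave-one-subject-out accuracy = {acc.mean():.3f}')
```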
Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information
NASA Astrophysics Data System (ADS)
Jamshidpour, N.; Homayouni, S.; Safari, A.
2017-09-01
Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are two of the most important issues which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues by the contribution of unlabeled samples, which are available in an enormous amount. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is scarce. When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets, respectively.
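scikit-learn's graph-based label spreading gives a compact stand-in for the joint spectral-spatial graph described above, with the spatial component approximated here by simply appending scaled pixel coordinates to the spectra; the data layout and parameters are assumptions, not the authors' construction.

```python
# Sketch: graph-based semi-supervised classification of hyperspectral pixels.
# `spectra` is (n_pixels, n_bands), `coords` is (n_pixels, 2) row/column positions,
# and `y` holds class labels with -1 marking the (many) unlabeled pixels.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.semi_supervised import LabelSpreading

# Crude joint spectral-spatial representation: scaled spectra + scaled coordinates.
features = np.hstack([StandardScaler().fit_transform(spectra),
                      0.5 * StandardScaler().fit_transform(coords)])

model = LabelSpreading(kernel='knn', n_neighbors=10, alpha=0.2)
model.fit(features, y)                      # graph built over all pixels
pred = model.transduction_                  # labels propagated to unlabeled pixels
```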
Chen, Yifei; Sun, Yuxing; Han, Bing-Qing
2015-01-01
Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used to reduce the dimensionality of features and speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measures of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. To address this, we first design a similarity measure over the context information that takes word co-occurrences and phrase chunks around the features into account. Then we introduce the similarity of context information into the importance measure of the features to substitute for document and term frequency. On this basis, we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.
PROTAX-Sound: A probabilistic framework for automated animal sound identification
Somervuo, Panu; Ovaskainen, Otso
2017-01-01
Autonomous audio recording is a stimulating new field in bioacoustics, with great promise for conducting cost-effective species surveys. One major current challenge is the lack of reliable classifiers capable of multi-species identification. We present PROTAX-Sound, a statistical framework to perform probabilistic classification of animal sounds. PROTAX-Sound is based on a multinomial regression model, and it can utilize as predictors any kind of sound features or classifications produced by other existing algorithms. PROTAX-Sound combines audio and image processing techniques to scan environmental audio files. It identifies regions of interest (a segment of the audio file that contains a vocalization to be classified), extracts acoustic features from them and compares them with samples in a reference database. The output of PROTAX-Sound is the probabilistic classification of each vocalization, including the possibility that it represents a species not present in the reference database. We demonstrate the performance of PROTAX-Sound by classifying audio from a species-rich case study of tropical birds. The best performing classifier achieved 68% classification accuracy for 200 bird species. PROTAX-Sound improves the classification power of current techniques by combining information from multiple classifiers in a manner that yields calibrated classification probabilities. PMID:28863178
PROTAX-Sound: A probabilistic framework for automated animal sound identification.
de Camargo, Ulisses Moliterno; Somervuo, Panu; Ovaskainen, Otso
2017-01-01
Autonomous audio recording is a stimulating new field in bioacoustics, with great promise for conducting cost-effective species surveys. One major current challenge is the lack of reliable classifiers capable of multi-species identification. We present PROTAX-Sound, a statistical framework to perform probabilistic classification of animal sounds. PROTAX-Sound is based on a multinomial regression model, and it can utilize as predictors any kind of sound features or classifications produced by other existing algorithms. PROTAX-Sound combines audio and image processing techniques to scan environmental audio files. It identifies regions of interest (a segment of the audio file that contains a vocalization to be classified), extracts acoustic features from them and compares them with samples in a reference database. The output of PROTAX-Sound is the probabilistic classification of each vocalization, including the possibility that it represents species not present in the reference database. We demonstrate the performance of PROTAX-Sound by classifying audio from a species-rich case study of tropical birds. The best performing classifier achieved 68% classification accuracy for 200 bird species. PROTAX-Sound improves the classification power of current techniques by combining information from multiple classifiers in a manner that yields calibrated classification probabilities. PMID:28863178
Ground-based cloud classification by learning stable local binary patterns
NASA Astrophysics Data System (ADS)
Wang, Yu; Shi, Cunzhao; Wang, Chunheng; Xiao, Baihua
2018-07-01
Feature selection and extraction are the first steps in implementing pattern classification. The same is true for ground-based cloud classification. Histogram features based on local binary patterns (LBPs) are widely used to classify texture images. However, the conventional uniform LBP approach cannot capture all the dominant patterns in cloud texture images, thereby resulting in low classification performance. In this study, a robust feature extraction method by learning stable LBPs is proposed based on the averaged ranks of the occurrence frequencies of all rotation invariant patterns defined in the LBPs of cloud images. The proposed method is validated with a ground-based cloud classification database comprising five cloud types. Experimental results demonstrate that the proposed method achieves significantly higher classification accuracy than the uniform LBP, local texture patterns (LTP), dominant LBP (DLBP), completed LBP (CLTP) and salient LBP (SaLBP) methods in this cloud image database and under different noise conditions. Moreover, the performance of the proposed method is comparable with that of the popular deep convolutional neural network (DCNN) method, but with lower computational complexity. Furthermore, the proposed method also achieves superior performance on an independent test data set.
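A minimal sketch of the stable-LBP idea, assuming scikit-image and SciPy are available: rotation-invariant LBP codes are extracted per image, each pattern is ranked by its occurrence frequency, the ranks are averaged over the training set, and only the best-ranked (most stable) bins are kept as the histogram feature. The neighbourhood (P, R) and the number of retained bins are illustrative choices, not the paper's.

```python
# Hedged sketch of "stable" rotation-invariant LBP bins learned from averaged frequency ranks.
import numpy as np
from scipy.stats import rankdata
from skimage.feature import local_binary_pattern

P, R = 8, 1  # illustrative LBP neighbourhood

def ri_lbp_hist(image):
    codes = local_binary_pattern(image, P, R, method='ror')           # rotation-invariant codes
    hist, _ = np.histogram(codes, bins=np.arange(2 ** P + 1), density=True)
    return hist

def learn_stable_bins(train_images, n_stable=30):
    """Average each pattern's frequency rank over the training images and keep the best-ranked bins."""
    ranks = [rankdata(-ri_lbp_hist(img)) for img in train_images]     # rank 1 = most frequent pattern
    return np.argsort(np.mean(ranks, axis=0))[:n_stable]

def stable_lbp_feature(image, stable_bins):
    return ri_lbp_hist(image)[stable_bins]    # histogram restricted to the learned stable patterns
```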
Application of LANDSAT data to wetland study and land use classification in west Tennessee
NASA Technical Reports Server (NTRS)
Jones, N. L.; Shahrokhi, F.
1977-01-01
The Obion-Forked Deer River Basin in northwest Tennessee is confronted with several acute land use problems which result in excessive erosion, sedimentation, pollution, and hydrologic runoff. LANDSAT data was applied to determine land use of selected watershed areas within the basin, with special emphasis on determining wetland boundaries. Densitometric analysis was performed to allow numerical classification of objects observed in the imagery on the basis of measurements of optical densities. Multispectral analysis of the LANDSAT imagery provided the capability of altering the color of the image presentation in order to enhance desired relationships. Manual mapping and classification techniques were performed in order to indicate a level of accuracy of the LANDSAT data as compared with high and low altitude photography for land use classification.
Sharma, Harshita; Zerbe, Norman; Klempert, Iris; Hellwich, Olaf; Hufnagl, Peter
2017-11-01
Deep learning using convolutional neural networks is an actively emerging field in histological image analysis. This study explores deep learning methods for computer-aided classification in H&E stained histopathological whole slide images of gastric carcinoma. An introductory convolutional neural network architecture is proposed for two computerized applications, namely, cancer classification based on immunohistochemical response and necrosis detection based on the existence of tumor necrosis in the tissue. Classification performance of the developed deep learning approach is quantitatively compared with traditional image analysis methods in digital histopathology requiring prior computation of handcrafted features, such as statistical measures using gray level co-occurrence matrix, Gabor filter-bank responses, LBP histograms, gray histograms, HSV histograms and RGB histograms, followed by random forest machine learning. Additionally, the widely known AlexNet deep convolutional framework is comparatively analyzed for the corresponding classification problems. The proposed convolutional neural network architecture reports favorable results, with an overall classification accuracy of 0.6990 for cancer classification and 0.8144 for necrosis detection.
Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No
2015-11-01
One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation.
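A hedged sketch of plain sparse representation classification with one possible unsupervised dictionary update (confidently classified test samples are appended as new atoms); the paper's supervised update and incoherence-based dictionary modification are not reproduced here, and the sparsity level and confidence threshold are arbitrary.

```python
# Sketch of sparse representation classification (SRC) with a simple unsupervised dictionary update.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit
from sklearn.preprocessing import normalize

class AdaptiveSRC:
    def __init__(self, n_nonzero=10, confidence=0.5):
        self.n_nonzero, self.confidence = n_nonzero, confidence

    def fit(self, X, y):
        self.D = normalize(X).T            # atoms (columns) are l2-normalised training samples
        self.labels = np.asarray(y)
        return self

    def _residuals(self, x):
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=self.n_nonzero,
                                        fit_intercept=False).fit(self.D, x)
        a = omp.coef_
        return np.array([np.linalg.norm(x - self.D[:, self.labels == c] @ a[self.labels == c])
                         for c in np.unique(self.labels)])

    def predict_update(self, x):
        """Classify one test sample and, if confident, append it to the dictionary (unsupervised update)."""
        x = x / np.linalg.norm(x)
        res = self._residuals(x)
        classes = np.unique(self.labels)
        order = np.argsort(res)
        pred = classes[order[0]]
        margin = (res[order[1]] - res[order[0]]) / (res[order[1]] + 1e-12)
        if margin > self.confidence:       # only confidently classified samples enter the dictionary
            self.D = np.column_stack([self.D, x])
            self.labels = np.append(self.labels, pred)
        return pred
```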
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure, was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90% to 96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
Reduction from cost-sensitive ordinal ranking to weighted binary classification.
Lin, Hsuan-Tien; Li, Ling
2012-05-01
We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
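The reduction itself can be written down compactly. The sketch below builds the extended examples ((x, k), sign(y > k)) with weights derived from an absolute-error cost matrix, trains a single weighted binary classifier (logistic regression here, as a stand-in for any binary learner), and recovers a rank by counting the thresholds a test point exceeds; the cost matrix and base learner are illustrative assumptions.

```python
# Hedged sketch of the ordinal-to-binary reduction (extended examples + thresholded ranker).
import numpy as np
from sklearn.linear_model import LogisticRegression

def extend(X, y, K, cost=None):
    """Build extended examples ((x, k), [y > k]) with weights |C[y,k] - C[y,k+1]|."""
    if cost is None:                                    # default: absolute-error cost matrix
        cost = np.abs(np.subtract.outer(np.arange(1, K + 1), np.arange(1, K + 1)))
    Xe, ye, we = [], [], []
    for x, label in zip(X, y):
        for k in range(1, K):                           # thresholds k = 1 .. K-1
            xk = np.concatenate([x, np.eye(K - 1)[k - 1]])    # append one-hot threshold id
            Xe.append(xk)
            ye.append(1 if label > k else 0)
            we.append(abs(cost[label - 1, k - 1] - cost[label - 1, k]))
    return np.array(Xe), np.array(ye), np.array(we)

def fit_ranker(X, y, K):
    Xe, ye, we = extend(X, y, K)
    return LogisticRegression(max_iter=1000).fit(Xe, ye, sample_weight=we)

def predict_rank(clf, X, K):
    """Rank = 1 + number of thresholds the binary classifier says the example exceeds."""
    votes = np.zeros(len(X), dtype=int)
    for k in range(1, K):
        Xk = np.hstack([X, np.tile(np.eye(K - 1)[k - 1], (len(X), 1))])
        votes += clf.predict(Xk)
    return 1 + votes
```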
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.
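For readers who want to reproduce the general workflow (not the specific data sets above), a minimal presence/absence example with cross-validated accuracy, out-of-bag accuracy and variable importance might look like the following; the synthetic predictors stand in for real topographic or climatic covariates.

```python
# Hedged sketch: random forest presence/absence classification with variable importance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))      # synthetic stand-ins for environmental predictors
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)   # presence/absence

rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
print("10-fold CV accuracy:", cross_val_score(rf, X, y, cv=10).mean())
rf.fit(X, y)
print("Out-of-bag accuracy:", rf.oob_score_)
print("Predictors ranked by importance:", np.argsort(rf.feature_importances_)[::-1])
```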
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
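A hedged sketch of a one-versus-one scheme with rejection: a class label is emitted only when one class wins every pairwise contest it participates in, otherwise the pattern is rejected (which in myoelectric control would map to "no movement"). LDA is used here purely as a convenient pairwise classifier; the paper's per-pair boundary adjustment is not reproduced.

```python
# Sketch of selective one-vs-one classification with rejection of ambiguous patterns.
import numpy as np
from itertools import combinations
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

class OvOWithRejection:
    def __init__(self, reject_label=-1):
        self.reject_label = reject_label

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.pairs_, self.models_ = [], []
        for a, b in combinations(self.classes_, 2):
            mask = np.isin(y, [a, b])
            self.pairs_.append((a, b))
            self.models_.append(LDA().fit(X[mask], y[mask]))
        return self

    def predict(self, X):
        wins = np.zeros((len(X), len(self.classes_)), dtype=int)
        index = {c: i for i, c in enumerate(self.classes_)}
        for (a, b), m in zip(self.pairs_, self.models_):
            p = m.predict(X)
            for c in (a, b):
                wins[:, index[c]] += (p == c)
        # Emit a decision only if one class wins all of its pairwise contests; otherwise reject.
        full = len(self.classes_) - 1
        out = np.full(len(X), self.reject_label, dtype=object)
        unanimous = wins.max(axis=1) == full
        out[unanimous] = self.classes_[wins.argmax(axis=1)[unanimous]]
        return out
```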
Recursive feature selection with significant variables of support vectors.
Tsai, Chen-An; Huang, Chien-Hsun; Chang, Ching-Wei; Chen, Chun-Houh
2012-01-01
The development of DNA microarrays enables researchers to screen thousands of genes simultaneously and also helps determine high- and low-expression level genes in normal and disease tissues. Selecting relevant genes for cancer classification is an important issue. Most of the gene selection methods use univariate ranking criteria and arbitrarily choose a threshold to choose genes. However, the parameter setting may not be compatible with the selected classification algorithms. In this paper, we propose a new gene selection method (SVM-t) based on the use of t-statistics embedded in a support vector machine. We compared the performance to two similar SVM-based methods: SVM recursive feature elimination (SVMRFE) and recursive support vector machine (RSVM). The three methods were compared based on extensive simulation experiments and analyses of two published microarray datasets. In the simulation experiments, we found that the proposed method is more robust in selecting informative genes than SVMRFE and RSVM and capable of attaining good classification performance when the variations of informative and noninformative genes are different. In the analysis of two microarray datasets, the proposed method yields better performance in identifying fewer genes with good prediction accuracy, compared to SVMRFE and RSVM.
NASA Astrophysics Data System (ADS)
Majumder, S. K.; Krishna, H.; Sidramesh, M.; Chaturvedi, P.; Gupta, P. K.
2011-08-01
We report the results of a comparative evaluation of in vivo fluorescence and Raman spectroscopy for diagnosis of oral neoplasia. The study carried out at Tata Memorial Hospital, Mumbai, involved 26 healthy volunteers and 138 patients being screened for neoplasm of oral cavity. Spectral measurements were taken from multiple sites of abnormal as well as apparently uninvolved contra-lateral regions of the oral cavity in each patient. The different tissue sites investigated belonged to one of the four histopathology categories: 1) squamous cell carcinoma (SCC), 2) oral sub-mucous fibrosis (OSMF), 3) leukoplakia (LP) and 4) normal squamous tissue. A probability based multivariate statistical algorithm utilizing nonlinear Maximum Representation and Discrimination Feature for feature extraction and Sparse Multinomial Logistic Regression for classification was developed for direct multi-class classification in a leave-one-patient-out cross validation mode. The results reveal that the performance of Raman spectroscopy is considerably superior to that of fluorescence in stratifying the oral tissues into respective histopathologic categories. The best classification accuracy was observed to be 90%, 93%, 94%, and 89% for SCC, SMF, leukoplakia, and normal oral tissues, respectively, on the basis of leave-one-patient-out cross-validation, with an overall accuracy of 91%. However, when a binary classification was employed to distinguish spectra from all the SCC, SMF and leukoplakic tissue sites together from normal, fluorescence and Raman spectroscopy were seen to have almost comparable performances with Raman yielding marginally better classification accuracy of 98.5% as compared to 94% of fluorescence.
Preprocessing and meta-classification for brain-computer interfaces.
Hammon, Paul S; de Sa, Virginia R
2007-03-01
A brain-computer interface (BCI) is a system which allows direct translation of brain states into actions, bypassing the usual muscular pathways. A BCI system works by extracting user brain signals, applying machine learning algorithms to classify the user's brain state, and performing a computer-controlled action. Our goal is to improve brain state classification. Perhaps the most obvious way to improve classification performance is the selection of an advanced learning algorithm. However, it is now well known in the BCI community that careful selection of preprocessing steps is crucial to the success of any classification scheme. Furthermore, recent work indicates that combining the output of multiple classifiers (meta-classification) leads to improved classification rates relative to single classifiers (Dornhege et al., 2004). In this paper, we develop an automated approach which systematically analyzes the relative contributions of different preprocessing and meta-classification approaches. We apply this procedure to three data sets drawn from BCI Competition 2003 (Blankertz et al., 2004) and BCI Competition III (Blankertz et al., 2006), each of which exhibit very different characteristics. Our final classification results compare favorably with those from past BCI competitions. Additionally, we analyze the relative contributions of individual preprocessing and meta-classification choices and discuss which types of BCI data benefit most from specific algorithms.
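A minimal stacking sketch in the spirit of meta-classification: several base pipelines, each embodying a different preprocessing choice, are combined by a meta-classifier trained on their cross-validated outputs. The specific pipelines below (scaling + LDA, PCA + SVM, raw logistic regression) are placeholders; real BCI pipelines would typically involve spatial filtering and band-power features, and X is assumed to be a matrix of trial-wise feature vectors.

```python
# Hedged sketch: combining classifiers built on different preprocessing choices via stacking.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier

base_learners = [
    ("scaled_lda", make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())),
    ("pca_svm", make_pipeline(StandardScaler(), PCA(n_components=10), SVC(probability=True))),
    ("raw_logreg", LogisticRegression(max_iter=1000)),
]
# The meta-classifier is trained on the cross-validated outputs of the base pipelines.
meta = StackingClassifier(estimators=base_learners, final_estimator=LogisticRegression(), cv=5)
# meta.fit(X_train, y_train); meta.predict(X_test)   # X: trial-wise feature vectors (assumed)
```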
NASA Astrophysics Data System (ADS)
Broderick, Ciaran; Fealy, Rowan
2013-04-01
Circulation type classifications (CTCs) compiled as part of the COST733 Action, entitled 'Harmonisation and Application of Weather Type Classifications for European Regions', are examined for their synoptic and climatological applicability to Ireland based on their ability to characterise surface temperature and precipitation. In all, 16 different objective classification schemes, representative of four different methodological approaches to circulation typing (optimization algorithms, threshold-based methods, eigenvector techniques and leader algorithms), are considered. Several statistical metrics which variously quantify the ability of CTCs to discretize daily data into well-defined homogeneous groups are used to evaluate and compare different approaches to synoptic typing. The records from 14 meteorological stations located across the island of Ireland are used in the study. The results indicate that while it was not possible to identify a single optimum classification or approach to circulation typing - conditional on the location and surface variables considered - a number of general assertions regarding the performance of different schemes can be made. The findings for surface temperature indicate that those classifications based on predefined thresholds (e.g. Litynski, GrossWetterTypes and original Lamb Weather Type) perform well, as do the Kruizinga and Lund classification schemes. Similarly, for precipitation, predefined-type classifications return high skill scores, as do those classifications derived using some optimization procedure (e.g. SANDRA, Self Organizing Maps and K-Means clustering). For both temperature and precipitation the results generally indicate that the classifications perform best for the winter season - reflecting the closer coupling between large-scale circulation and surface conditions during this period. In contrast to the findings for temperature, spatial patterns in the performance of classifications were more evident for precipitation. In the case of this variable, those more westerly synoptic stations open to zonal airflow and less influenced by regional scale forcings generally exhibited a stronger link with large-scale circulation.
Reduction in training time of a deep learning model in detection of lesions in CT
NASA Astrophysics Data System (ADS)
Makkinejad, Nazanin; Tajbakhsh, Nima; Zarshenas, Amin; Khokhar, Ashfaq; Suzuki, Kenji
2018-02-01
Deep learning (DL) emerged as a powerful tool for object detection and classification in medical images. Building a well-performing DL model, however, requires a huge number of images for training, and it takes days to train a DL model even on a cutting edge high-performance computing platform. This study is aimed at developing a method for selecting a "small" number of representative samples from a large collection of training samples to train a DL model that could be used to detect polyps in CT colonography (CTC), without compromising the classification performance. Our proposed method for representative sample selection (RSS) consists of a K-means clustering algorithm. For the performance evaluation, we applied the proposed method to select samples for the training of a massive training artificial neural network based DL model, to be used for the classification of polyps and non-polyps in CTC. Our results show that the proposed method reduces the training time by a factor of 15, while maintaining the classification performance equivalent to the model trained using the full training set. We compare the performance using the area under the receiver-operating-characteristic curve (AUC).
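The sample-selection step can be sketched in a few lines, assuming the candidate training samples have already been turned into feature vectors: cluster them with K-means and keep the real sample nearest each cluster centre as the representative subset. The cluster count equals the desired subset size; feature extraction and the downstream DL training are omitted here.

```python
# Hedged sketch: pick representative training samples with K-means (one sample nearest each centroid).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def select_representatives(X, n_samples):
    km = KMeans(n_clusters=n_samples, n_init=10, random_state=0).fit(X)
    # Index of the real sample closest to each cluster centre
    idx, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    return np.unique(idx)

# subset = select_representatives(candidate_features, n_samples=500)
# The DL model would then be trained only on X[subset] / labels[subset].
```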
Khalilzadeh, Omid; Baerlocher, Mark O; Shyn, Paul B; Connolly, Bairbre L; Devane, A Michael; Morris, Christopher S; Cohen, Alan M; Midia, Mehran; Thornton, Raymond H; Gross, Kathleen; Caplin, Drew M; Aeron, Gunjan; Misra, Sanjay; Patel, Nilesh H; Walker, T Gregory; Martinez-Salazar, Gloria; Silberzweig, James E; Nikolic, Boris
2017-10-01
To develop a new adverse event (AE) classification for interventional radiology (IR) procedures and evaluate its clinical, research, and educational value compared with the existing Society of Interventional Radiology (SIR) classification via an SIR member survey. A new AE classification was developed by members of the Standards of Practice Committee of the SIR. Subsequently, a survey was created by a group of 18 members from the SIR Standards of Practice Committee and Service Lines. Twelve clinical AE case scenarios were generated that encompassed a broad spectrum of IR procedures and potential AEs. Survey questions were designed to evaluate the following domains: educational and research values, accountability for intraprocedural challenges, consistency of AE reporting, unambiguity, and potential for incorporation into existing quality-assurance framework. For each AE scenario, the survey participants were instructed to answer questions about the proposed and existing SIR classifications. SIR members were invited via online survey links, and 68 of the 140 surveyed members participated. Answers on the new and existing classifications were evaluated and compared statistically. Overall comparison between the two surveys was performed by generalized linear modeling. The proposed AE classification received superior evaluations in terms of consistency of reporting (P < .05) and potential for incorporation into existing quality-assurance framework (P < .05). Respondents gave a higher overall rating to the educational and research value of the new compared with the existing classification (P < .05). This study proposed an AE classification system that outperformed the existing SIR classification in the studied domains.
Marciano, Michael A; Adelman, Jonathan D
2017-03-01
The deconvolution of DNA mixtures remains one of the most critical challenges in the field of forensic DNA analysis. In addition, of all the data features required to perform such deconvolution, the number of contributors in the sample is widely considered the most important, and, if incorrectly chosen, the most likely to negatively influence the mixture interpretation of a DNA profile. Unfortunately, most current approaches to mixture deconvolution require the assumption that the number of contributors is known by the analyst, an assumption that can prove to be especially faulty when faced with increasingly complex mixtures of 3 or more contributors. In this study, we propose a probabilistic approach for estimating the number of contributors in a DNA mixture that leverages the strengths of machine learning. To assess this approach, we compare classification performances of six machine learning algorithms and evaluate the model from the top-performing algorithm against the current state of the art in the field of contributor number classification. Overall results show over 98% accuracy in identifying the number of contributors in a DNA mixture of up to 4 contributors. Comparative results showed 3-person mixtures had a classification accuracy improvement of over 6% compared to the current best-in-field methodology, and that 4-person mixtures had a classification accuracy improvement of over 20%. The Probabilistic Assessment for Contributor Estimation (PACE) also accomplishes classification of mixtures of up to 4 contributors in less than 1s using a standard laptop or desktop computer. Considering the high classification accuracy rates, as well as the significant time commitment required by the current state of the art model versus seconds required by a machine learning-derived model, the approach described herein provides a promising means of estimating the number of contributors and, subsequently, will lead to improved DNA mixture interpretation.
Epileptic seizure detection in EEG signal with GModPCA and support vector machine.
Jaiswal, Abeg Kumar; Banka, Haider
2017-01-01
Epilepsy is one of the most common neurological disorders caused by recurrent seizures. Electroencephalograms (EEGs) record neural activity and can detect epilepsy. Visual inspection of an EEG signal for epileptic seizure detection is a time-consuming process and may lead to human error; therefore, recently, a number of automated seizure detection frameworks were proposed to replace these traditional methods. Feature extraction and classification are two important steps in these procedures. Feature extraction focuses on finding the informative features that could be used for classification and correct decision-making. Therefore, proposing effective feature extraction techniques for seizure detection is of great significance. Principal Component Analysis (PCA) is a dimensionality reduction technique used in different fields of pattern recognition including EEG signal classification. Global modular PCA (GModPCA) is a variation of PCA. In this paper, an effective framework with GModPCA and Support Vector Machine (SVM) is presented for epileptic seizure detection in EEG signals. The feature extraction is performed with GModPCA, whereas an SVM trained with a radial basis function kernel performs the classification between seizure and nonseizure EEG signals. Seven different experimental cases were conducted on the benchmark epilepsy EEG dataset. The system performance was evaluated using 10-fold cross-validation. In addition, we prove analytically that GModPCA has lower time and space complexity than PCA. The experimental results show that EEG signals have strong inter-sub-pattern correlations. GModPCA and SVM have been able to achieve 100% accuracy for the classification between normal and epileptic signals. The classification results of the proposed approach were better than the results of some of the existing methods proposed in the literature. It is also found that the time and space complexities of GModPCA are lower than those of PCA. This study suggests that GModPCA and SVM could be used for automated epileptic seizure detection in EEG signals.
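A hedged sketch of the modular-PCA idea: each EEG epoch is split into equal-length segments, PCA is fitted per segment, and the concatenated projections feed an RBF-kernel SVM evaluated with 10-fold cross-validation. The segment count and per-segment component count are illustrative, and this simplified transformer should not be read as the exact GModPCA formulation.

```python
# Sketch of segment-wise (modular) PCA feature extraction followed by an RBF-kernel SVM.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

class ModularPCA(BaseEstimator, TransformerMixin):
    """Split each epoch into equal-length segments and apply PCA per segment (GModPCA-like)."""
    def __init__(self, n_segments=8, n_components=4):
        self.n_segments, self.n_components = n_segments, n_components

    def fit(self, X, y=None):
        segments = np.array_split(X, self.n_segments, axis=1)
        self.pcas_ = [PCA(self.n_components).fit(s) for s in segments]
        return self

    def transform(self, X):
        segments = np.array_split(X, self.n_segments, axis=1)
        return np.hstack([p.transform(s) for p, s in zip(self.pcas_, segments)])

pipe = Pipeline([("gmodpca", ModularPCA()), ("svm", SVC(kernel="rbf", gamma="scale"))])
# scores = cross_val_score(pipe, epochs, labels, cv=10)   # epochs: (n_epochs, n_samples_per_epoch)
```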
Land Cover Analysis by Using Pixel-Based and Object-Based Image Classification Method in Bogor
NASA Astrophysics Data System (ADS)
Amalisana, Birohmatin; Rokhmatullah; Hernina, Revi
2017-12-01
The advantage of image classification is to provide earth's surface information such as land cover and its time-series changes. Nowadays, pixel-based image classification is commonly performed with a variety of algorithms such as minimum distance, parallelepiped, maximum likelihood, and Mahalanobis distance. On the other hand, land cover classification can also be acquired by using the object-based image classification technique. In addition, object-based classification uses image segmentation based on parameters such as scale, form, colour, smoothness and compactness. This research aims to compare the results of land cover classification and its change detection between the parallelepiped pixel-based and object-based classification methods. The location of this research is Bogor, with a 20-year range of observation from 1996 until 2016. This region is well known as an urban area that changes continuously due to its rapid development, so that time-series land cover information for this region is of particular interest.
Comparison of six electromyography acquisition setups on hand movement classification tasks.
Pizzolato, Stefano; Tagliapietra, Luca; Cognolato, Matteo; Reggiani, Monica; Müller, Henning; Atzori, Manfredo
2017-01-01
Hand prostheses controlled by surface electromyography are promising due to the non-invasive approach and the control capabilities offered by machine learning. Nevertheless, dexterous prostheses are still not widely adopted due to control difficulties, low robustness and often prohibitive costs. Several sEMG acquisition setups are now available, ranging in terms of costs between a few hundred and several thousand dollars. The objective of this paper is the relative comparison of six acquisition setups on an identical hand movement classification task, in order to help researchers choose the proper acquisition setup for their requirements. The acquisition setups are based on four different sEMG electrodes (including Otto Bock, Delsys Trigno, Cometa Wave + Dormo ECG and two Thalmic Myo armbands) and they were used to record more than 50 hand movements from intact subjects with a standardized acquisition protocol. The relative performance of the six sEMG acquisition setups is compared on 41 identical hand movements with a standardized feature extraction and data analysis pipeline aimed at performing hand movement classification. Comparable classification results are obtained with three acquisition setups including the Delsys Trigno, the Cometa Wave and the affordable setup composed of two Myo armbands. The results suggest that practical sEMG tests can be performed even when costs are a constraint (e.g. in small laboratories, developing countries or use by children). All the presented datasets can be used for offline tests and their quality can easily be compared as the data sets are publicly available.
A neuro-fuzzy approach in the classification of students' academic performance.
Do, Quang Hung; Chen, Jeng-Fung
2013-01-01
Classifying student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions.
NASA Astrophysics Data System (ADS)
Paul, Subir; Nagesh Kumar, D.
2018-04-01
Hyperspectral (HS) data comprises continuous spectral responses of hundreds of narrow spectral bands with very fine spectral resolution or bandwidth, which offer feature identification and classification with high accuracy. In the present study, a Mutual Information (MI) based Segmented Stacked Autoencoder (S-SAE) approach for spectral-spatial classification of HS data is proposed to reduce the complexity and computational time compared to Stacked Autoencoder (SAE) based feature extraction. A non-parametric dependency measure (MI) based spectral segmentation is proposed instead of linear and parametric dependency measures to take care of both linear and nonlinear inter-band dependency for spectral segmentation of the HS bands. Then morphological profiles are created corresponding to segmented spectral features to assimilate the spatial information in the spectral-spatial classification approach. Two non-parametric classifiers, Support Vector Machine (SVM) with Gaussian kernel and Random Forest (RF), are used for classification of the three most popularly used HS datasets. Results of the numerical experiments carried out in this study have shown that SVM with a Gaussian kernel provides better results for the Pavia University and Botswana datasets whereas RF performs better for the Indian Pines dataset. The experiments performed with the proposed methodology provide encouraging results compared to numerous existing approaches.
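A rough sketch of the mutual-information-based spectral segmentation, with PCA standing in for the stacked autoencoder and the morphological-profile step omitted: adjacent bands are compared via MI on discretized intensities, the band axis is cut where the dependency is weakest, and features are then extracted per segment. The number of segments, bins and components are assumptions for illustration.

```python
# Sketch: segment adjacent hyperspectral bands where inter-band mutual information drops,
# then extract features per segment (PCA stands in for the stacked autoencoder of the paper).
import numpy as np
from sklearn.metrics import mutual_info_score
from sklearn.decomposition import PCA

def discretize(band, bins=32):
    return np.digitize(band, np.histogram_bin_edges(band, bins=bins)[1:-1])

def mi_segments(X, n_segments=4, bins=32):
    """X: (n_pixels, n_bands). Cut between bands where adjacent-band MI is lowest."""
    mi = np.array([mutual_info_score(discretize(X[:, b], bins), discretize(X[:, b + 1], bins))
                   for b in range(X.shape[1] - 1)])
    cuts = np.sort(np.argsort(mi)[: n_segments - 1] + 1)      # weakest-dependency boundaries
    return np.split(np.arange(X.shape[1]), cuts)

def segment_features(X, segments, n_components=5):
    return np.hstack([PCA(n_components).fit_transform(X[:, idx]) for idx in segments])
```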
A Novel Feature Selection Technique for Text Classification Using Naïve Bayes.
Dey Sarkar, Subhajit; Goswami, Saptarsi; Agarwal, Aman; Aktar, Javed
2014-01-01
With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available. Naïve Bayes remains one of the oldest and most popular classifiers. On one hand, implementation of naïve Bayes is simple and, on the other hand, it also requires a smaller amount of training data. From the literature review, it is found that naïve Bayes performs poorly compared to other classifiers in text classification. As a result, this makes the naïve Bayes classifier unusable in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based first on univariate feature selection and then on feature clustering, where we use the univariate feature selection method to reduce the search space and then apply clustering to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is shown to outperform other traditional methods such as greedy search based wrappers or CFS.
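A hedged sketch of the two-step pattern (not the authors' exact scoring): a univariate chi-squared filter first shrinks the vocabulary, the surviving features are then clustered by absolute correlation, and one representative feature per cluster is kept before training multinomial naive Bayes. X is assumed to be a non-negative term-count matrix; the cluster count and filter size are arbitrary.

```python
# Sketch of two-step feature selection (univariate filter, then feature clustering) for naive Bayes.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.cluster import AgglomerativeClustering
from sklearn.naive_bayes import MultinomialNB

def two_step_select(X, y, k_univariate=2000, n_clusters=300):
    selector = SelectKBest(chi2, k=k_univariate).fit(X, y)     # step 1: shrink the search space
    cols = selector.get_support(indices=True)
    Xs = np.asarray(X[:, cols].todense()) if hasattr(X, "todense") else X[:, cols]
    # step 2: cluster features by |correlation| so that one feature per cluster survives
    # (metric= is called affinity= on scikit-learn < 1.2)
    corr = np.corrcoef(Xs.T)
    clusters = AgglomerativeClustering(n_clusters=n_clusters, metric="precomputed",
                                       linkage="average").fit_predict(1 - np.abs(corr))
    keep = [cols[np.where(clusters == c)[0][0]] for c in range(n_clusters)]
    return np.array(sorted(keep))

# keep = two_step_select(X_counts, y); clf = MultinomialNB().fit(X_counts[:, keep], y)
```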
CNN-BLPred: a Convolutional neural network based predictor for β-Lactamases (BL) and their classes.
White, Clarence; Ismail, Hamid D; Saigo, Hiroto; Kc, Dukka B
2017-12-28
The β-Lactamase (BL) enzyme family is an important class of enzymes that plays a key role in bacterial resistance to antibiotics. As the number of newly identified BL enzymes increases daily, it is imperative to develop a computational tool to classify the newly identified BL enzymes into one of its classes. There are two types of classification of BL enzymes: Molecular Classification and Functional Classification. Existing computational methods only address Molecular Classification and the performance of these existing methods is unsatisfactory. We addressed the unsatisfactory performance of the existing methods by implementing a Deep Learning approach called Convolutional Neural Network (CNN). We developed CNN-BLPred, an approach for the classification of BL proteins. The CNN-BLPred uses Gradient Boosted Feature Selection (GBFS) in order to select the ideal feature set for each BL classification. Based on the rigorous benchmarking of CNN-BLPred using both leave-one-out cross-validation and independent test sets, CNN-BLPred performed better than the other existing algorithms. Compared with other architectures of CNN, Recurrent Neural Network, and Random Forest, the simple CNN architecture with only one convolutional layer performs the best. After feature extraction, we were able to remove ~95% of the 10,912 features using Gradient Boosted Trees. During 10-fold cross validation, we increased the accuracy of the classic BL predictions by 7%. We also increased the accuracy of Class A, Class B, Class C, and Class D performance by an average of 25.64%. The independent test results followed a similar trend. We implemented a deep learning algorithm known as Convolutional Neural Network (CNN) to develop a classifier for BL classification. Combined with feature selection on an exhaustive feature set and using balancing methods such as Random Oversampling (ROS), Random Undersampling (RUS) and Synthetic Minority Oversampling Technique (SMOTE), CNN-BLPred performs significantly better than existing algorithms for BL classification.
NASA Technical Reports Server (NTRS)
Salu, Yehuda; Tilton, James
1993-01-01
The classification of multispectral image data obtained from satellites has become an important tool for generating ground cover maps. This study deals with the application of nonparametric pixel-by-pixel classification methods in the classification of pixels, based on their multispectral data. A new neural network, the Binary Diamond, is introduced, and its performance is compared with a nearest neighbor algorithm and a back-propagation network. The Binary Diamond is a multilayer, feed-forward neural network, which learns from examples in unsupervised, 'one-shot' mode. It recruits its neurons according to the actual training set, as it learns. The comparisons of the algorithms were done using a realistic database, consisting of approximately 90,000 Landsat 4 Thematic Mapper pixels. The Binary Diamond and the nearest neighbor performances were close, with some advantages to the Binary Diamond. The performance of the back-propagation network lagged behind. An efficient nearest neighbor algorithm, the binned nearest neighbor, is described. Ways of improving performance, such as merging categories and analyzing nonboundary pixels, are addressed and evaluated.
Joint Feature Selection and Classification for Multilabel Learning.
Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong
2018-03-01
Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.
Li, Zhaohua; Wang, Yuduo; Quan, Wenxiang; Wu, Tongning; Lv, Bin
2015-02-15
Based on near-infrared spectroscopy (NIRS), recent converging evidence has shown that patients with schizophrenia exhibit abnormal functional activity in the prefrontal cortex during a verbal fluency task (VFT). Therefore, some studies have attempted to employ NIRS measurements to differentiate schizophrenia patients from healthy controls with different classification methods. However, no systematic evaluation was conducted to compare their respective classification performances on the same study population. In this study, we evaluated the classification performance of four classification methods (including linear discriminant analysis, k-nearest neighbors, Gaussian process classifier, and support vector machines) on an NIRS-aided schizophrenia diagnosis. We recruited a large sample of 120 schizophrenia patients and 120 healthy controls and measured the hemoglobin response in the prefrontal cortex during the VFT using a multichannel NIRS system. Features for classification were extracted from three types of NIRS data in each channel. We subsequently performed a principal component analysis (PCA) for feature selection prior to comparison of the different classification methods. We achieved a maximum accuracy of 85.83% and an overall mean accuracy of 83.37% using PCA-based feature selection on oxygenated hemoglobin signals and a support vector machine classifier. This is the first comprehensive evaluation of different classification methods for the diagnosis of schizophrenia based on different types of NIRS signals. Our results suggested that, using the appropriate classification method, NIRS has the potential capacity to be an effective objective biomarker for the diagnosis of schizophrenia.
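A minimal sketch of the comparison pipeline, assuming X holds per-subject NIRS feature vectors and y the diagnostic labels: PCA-reduced features are fed to each of the four classifiers and scored by cross-validation. The 95%-variance cut-off and 10-fold scheme are illustrative and not necessarily the authors' exact protocol.

```python
# Hedged sketch: PCA-based feature reduction followed by a comparison of four classifiers.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "GP": GaussianProcessClassifier(),
    "SVM": SVC(kernel="rbf"),
}
# X: per-subject feature vectors from the NIRS channels (assumed); y: patient vs control labels
# for name, clf in classifiers.items():
#     pipe = make_pipeline(StandardScaler(), PCA(n_components=0.95), clf)
#     print(name, cross_val_score(pipe, X, y, cv=10).mean())
```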
WND-CHARM: Multi-purpose image classification using compound image transforms
Orlov, Nikita; Shamir, Lior; Macura, Tomasz; Johnston, Josiah; Eckley, D. Mark; Goldberg, Ilya G.
2008-01-01
We describe a multi-purpose image classifier that can be applied to a wide variety of image classification tasks without modifications or fine-tuning, and yet provide classification accuracy comparable to state-of-the-art task-specific image classifiers. The proposed image classifier first extracts a large set of 1025 image features including polynomial decompositions, high contrast features, pixel statistics, and textures. These features are computed on the raw image, transforms of the image, and transforms of transforms of the image. The feature values are then used to classify test images into a set of pre-defined image classes. This classifier was tested on several different problems including biological image classification and face recognition. Although we cannot make a claim of universality, our experimental results show that this classifier performs as well or better than classifiers developed specifically for these image classification tasks. Our classifier’s high performance on a variety of classification problems is attributed to (i) a large set of features extracted from images; and (ii) an effective feature selection and weighting algorithm sensitive to specific image classification problems. The algorithms are available for free download from openmicroscopy.org. PMID:18958301
NASA Astrophysics Data System (ADS)
Wan, Xiaoqing; Zhao, Chunhui; Gao, Bing
2017-11-01
The integration of an edge-preserving filtering technique in the classification of a hyperspectral image (HSI) has been proven effective in enhancing classification performance. This paper proposes an ensemble strategy for HSI classification using an edge-preserving filter along with a deep learning model and edge detection. First, an adaptive guided filter is applied to the original HSI to reduce the noise in degraded images and to extract powerful spectral-spatial features. Second, the extracted features are fed as input to a stacked sparse autoencoder to adaptively exploit more invariant and deep feature representations; then, a random forest classifier is applied to fine-tune the entire pretrained network and determine the classification output. Third, a Prewitt compass operator is further performed on the HSI to extract the edges of the first principal component after dimension reduction. Moreover, the region growing rule is applied to the resulting edge logical image to determine the local region for each unlabeled pixel. Finally, the categories of the corresponding neighborhood samples are determined in the original classification map; then, the majority voting mechanism is implemented to generate the final output. Extensive experiments demonstrated that the proposed method achieves competitive performance compared with several traditional approaches.
Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study
Guerra, Luis; McGarry, Laura M; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael
2011-01-01
In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem to discern among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the test to automatically distinguish between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori. As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies. PMID:21154911
Beheshti, Iman; Demirel, Hasan; Farokhian, Farnaz; Yang, Chunlan; Matsuda, Hiroshi
2016-12-01
This paper presents an automatic computer-aided diagnosis (CAD) system based on feature ranking for detection of Alzheimer's disease (AD) using structural magnetic resonance imaging (sMRI) data. The proposed CAD system is composed of four systematic stages. First, global and local differences in the gray matter (GM) of AD patients compared to the GM of healthy controls (HCs) are analyzed using a voxel-based morphometry technique. The aim is to identify significant local differences in the volume of GM as volumes of interest (VOIs). Second, the voxel intensity values of the VOIs are extracted as raw features. Third, the raw features are ranked using seven feature-ranking methods, namely, statistical dependency (SD), mutual information (MI), information gain (IG), Pearson's correlation coefficient (PCC), t-test score (TS), Fisher's criterion (FC), and the Gini index (GI). The features with higher scores are more discriminative. To determine the number of top features, the estimated classification error based on a training set made up of the AD and HC groups is calculated, with the vector size that minimizes this error selected as the number of top discriminative features. Fourth, the classification is performed using a support vector machine (SVM). In addition, a data fusion approach among the feature-ranking methods is introduced to improve the classification performance. The proposed method is evaluated using a data set from ADNI (130 AD and 130 HC) with 10-fold cross-validation. The classification accuracy of the proposed automatic system for the diagnosis of AD is up to 92.48% using the sMRI data. An automatic CAD system for the classification of AD based on feature ranking and classification error estimation is proposed. In this regard, seven feature-ranking methods (i.e., SD, MI, IG, PCC, TS, FC, and GI) are evaluated. The optimal size of the set of top discriminative features is determined by the classification error estimation in the training phase. The experimental results indicate that the performance of the proposed system is comparable to that of state-of-the-art classification models.
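The ranking-plus-error-minimisation step can be sketched as follows, with mutual information shown as just one of the seven criteria and cross-validation on the training set standing in for the paper's error estimate; the candidate feature-set sizes are arbitrary.

```python
# Sketch: rank features, then pick the number of top features that minimises the estimated error.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def rank_and_select(X, y, sizes=(50, 100, 200, 500, 1000)):
    order = np.argsort(mutual_info_classif(X, y))[::-1]      # most informative voxels first
    errors = [1 - cross_val_score(SVC(kernel="linear"), X[:, order[:k]], y, cv=10).mean()
              for k in sizes]
    best = sizes[int(np.argmin(errors))]                     # vector size minimising the error
    return order[:best]

# top_voxels = rank_and_select(X_train, y_train)
# clf = SVC().fit(X_train[:, top_voxels], y_train)
```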
Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso
2015-07-01
In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches an accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments.
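As a rough illustration of the k-mer (spectral) representation, the sketch below counts k-mers with a character n-gram vectorizer and trains a random forest, one of the comparison classifiers above; the paper's neural gas clustering and signature selection are not reproduced, and k, the estimator and the placeholder sequences are assumptions.

```python
# Hedged sketch: k-mer spectrum of DNA barcode fragments fed to a standard classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

k = 5   # illustrative k-mer length
kmer_counts = CountVectorizer(analyzer="char", ngram_range=(k, k), lowercase=False)
pipeline = make_pipeline(kmer_counts, RandomForestClassifier(n_estimators=300, random_state=0))

sequences = ["ACGTGTCATGACCTTGA", "TTGACGGTAACCGTAGA"]   # placeholder barcode fragments
labels = ["species_A", "species_B"]
# pipeline.fit(sequences, labels); pipeline.predict(["GTCATGACGTTGACCTA"])
```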
Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use
Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil
2013-01-01
The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classification measures exploiting characteristic signatures of such histograms. Two histogram-matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram-matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
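A hedged sketch of histogram curve matching for image objects: each object is described by a normalized digital-number histogram and assigned to the class whose reference histogram it matches best, with histogram intersection shown as one possible matching measure (the study's exact matching functions may differ). The bin count and value range are assumptions.

```python
# Sketch of object-based classification by matching per-object histograms to class references.
import numpy as np

def object_histogram(pixel_values, bins=64, value_range=(0, 255)):
    hist, _ = np.histogram(pixel_values, bins=bins, range=value_range, density=True)
    return hist

def histogram_intersection(h1, h2):
    return np.minimum(h1, h2).sum()

def classify_object(obj_pixels, class_references):
    """class_references: {label: reference histogram built from training objects}."""
    h = object_histogram(obj_pixels)
    scores = {label: histogram_intersection(h, ref) for label, ref in class_references.items()}
    return max(scores, key=scores.get)

# references = {label: np.mean([object_histogram(p) for p in training_objects[label]], axis=0)
#               for label in training_objects}
```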
Training echo state networks for rotation-invariant bone marrow cell classification.
Kainz, Philipp; Burgsteiner, Harald; Asslaber, Martin; Ahammer, Helmut
2017-01-01
The main principle of diagnostic pathology is the reliable interpretation of individual cells in the context of the tissue architecture. In particular, a confident examination of bone marrow specimens depends on a valid classification of myeloid cells. In this work, we propose a novel rotation-invariant learning scheme for multi-class echo state networks (ESNs), which achieves very high performance in automated bone marrow cell classification. Based on representing static images as a temporal sequence of rotations, we show how ESNs robustly recognize cells of arbitrary rotations by taking advantage of their short-term memory capacity. The performance of our approach is compared to a random forest classifier that learns rotation invariance in a conventional way by exhaustively training on multiple rotations of individual samples. The methods were evaluated on a human bone marrow image database consisting of granulopoietic and erythropoietic cells in different maturation stages. Our ESN approach to cell classification does not rely on segmentation of cells or manual feature extraction and can therefore directly be applied to image data.
Pläschke, Rachel N; Cieslik, Edna C; Müller, Veronika I; Hoffstaedter, Felix; Plachti, Anna; Varikuti, Deepthi P; Goosses, Mareike; Latz, Anne; Caspers, Svenja; Jockwitz, Christiane; Moebus, Susanne; Gruber, Oliver; Eickhoff, Claudia R; Reetz, Kathrin; Heller, Julia; Südmeyer, Martin; Mathys, Christian; Caspers, Julian; Grefkes, Christian; Kalenscher, Tobias; Langner, Robert; Eickhoff, Simon B
2017-12-01
Previous whole-brain functional connectivity studies achieved successful classifications of patients and healthy controls but only offered limited specificity as to affected brain systems. Here, we examined whether the connectivity patterns of functional systems affected in schizophrenia (SCZ), Parkinson's disease (PD), or normal aging equally translate into high classification accuracies for these conditions. We compared classification performance between pre-defined networks for each group and, for any given network, between groups. Separate support vector machine classifications of 86 SCZ patients, 80 PD patients, and 95 older adults relative to their matched healthy/young controls, respectively, were performed on functional connectivity in 12 task-based, meta-analytically defined networks using 25 replications of a nested 10-fold cross-validation scheme. Classification performance of the various networks clearly differed between conditions, as those networks that best classified one disease were usually non-informative for the other. For SCZ, but not PD, emotion-processing, empathy, and cognitive action control networks distinguished patients most accurately from controls. For PD, but not SCZ, networks subserving autobiographical or semantic memory, motor execution, and theory-of-mind cognition yielded the best classifications. In contrast, young-old classification was excellent based on all networks and outperformed both clinical classifications. Our pattern-classification approach captured associations between clinical and developmental conditions and functional network integrity with a higher level of specificity than did previous whole-brain analyses. Taken together, our results support resting-state connectivity as a marker of functional dysregulation in specific networks known to be affected by SCZ and PD, while suggesting that aging affects network integrity in a more global way. Hum Brain Mapp 38:5845-5858, 2017. © 2017 Wiley Periodicals, Inc.
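The nested cross-validation scheme described above can be sketched with scikit-learn as follows; the synthetic data and the hyperparameter grid are placeholders, not the study's connectivity features or settings.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data standing in for per-network functional connectivity features.
X, y = make_classification(n_samples=160, n_features=66, random_state=0)

inner = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

# The inner loop tunes C; the outer loop estimates generalization performance.
model = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="linear")),
    param_grid={"svc__C": [0.01, 0.1, 1, 10]},
    cv=inner,
)
scores = cross_val_score(model, X, y, cv=outer)
print(f"nested CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")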
Yu, Guan; Liu, Yufeng; Thung, Kim-Han; Shen, Dinggang
2014-01-01
Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is that many subjects have a large amount of missing data. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all the different classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve a similar estimated mean difference between the two classes (under classification) for the shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, which constrains different classification tasks to choose a common feature subset for the shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we performed experiments on the ADNI baseline dataset with the incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compared our method with the iMSF method (using incomplete MRI and PET images) and also with single-task classification (using only MRI, or only subjects with both MRI and PET images). Experimental results show very promising performance of our proposed MLPD method.
Yu, Guan; Liu, Yufeng; Thung, Kim-Han; Shen, Dinggang
2014-01-01
Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is that many subjects have a large amount of missing data. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all the different classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve a similar estimated mean difference between the two classes (under classification) for the shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, which constrains different classification tasks to choose a common feature subset for the shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we performed experiments on the ADNI baseline dataset with the incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compared our method with the iMSF method (using incomplete MRI and PET images) and also with single-task classification (using only MRI, or only subjects with both MRI and PET images). Experimental results show very promising performance of our proposed MLPD method. PMID:24820966
Removal of BCG artifacts using a non-Kirchhoffian overcomplete representation.
Dyrholm, Mads; Goldman, Robin; Sajda, Paul; Brown, Truman R
2009-02-01
We present a nonlinear unmixing approach for extracting the ballistocardiogram (BCG) from EEG recorded in an MR scanner during simultaneous acquisition of functional MRI (fMRI). First, an overcomplete basis is identified in the EEG based on a custom multipath EEG electrode cap. Next, the overcomplete basis is used to infer non-Kirchhoffian latent variables that are not consistent with a conservative electric field. Neural activity is strictly Kirchhoffian while the BCG artifact is not, and the representation can hence be used to remove the artifacts from the data in a way that does not attenuate the neural signals needed for optimal single-trial classification performance. We compare our method to more standard methods for BCG removal, namely independent component analysis and optimal basis sets, by looking at single-trial classification performance for an auditory oddball experiment. We show that our overcomplete representation method for removing BCG artifacts results in better single-trial classification performance compared to the conventional approaches, indicating that the derived neural activity in this representation retains the complex information in the trial-to-trial variability.
Effects of eye artifact removal methods on single trial P300 detection, a comparative study.
Ghaderi, Foad; Kim, Su Kyoung; Kirchner, Elsa Andrea
2014-01-15
Electroencephalographic signals are commonly contaminated by eye artifacts, even if recorded under controlled conditions. The objective of this work was to quantitatively compare standard artifact removal methods (regression, filtered regression, Infomax, and second order blind identification (SOBI)) and two artifact identification approaches for independent component analysis (ICA) methods, i.e. ADJUST and correlation. To this end, eye artifacts were removed and the cleaned datasets were used for single-trial classification of P300 (a type of event-related potential elicited using the oddball paradigm). Statistical analysis of the results confirms that the combination of Infomax and ADJUST provides relatively better performance (a 0.6% improvement on average across all subjects), while the combination of SOBI and correlation performs the worst. Low-pass filtering the data at lower cutoffs (here 4 Hz) can also improve the classification accuracy. Without requiring any artifact reference channel, the combination of Infomax and ADJUST improves the classification performance more than the other methods for both examined filtering cutoffs, i.e., 4 Hz and 25 Hz. Copyright © 2013 Elsevier B.V. All rights reserved.
van Gemert, Jan C; Veenman, Cor J; Smeulders, Arnold W M; Geusebroek, Jan-Mark
2010-07-01
This paper studies automatic image classification by modeling soft assignment in the popular codebook model. The codebook model describes an image as a bag of discrete visual words selected from a vocabulary, where the frequency distributions of visual words in an image allow classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite the clear mismatch of this hard assignment with the nature of continuous features, the approach has been successfully applied for some years. In this paper, we investigate four types of soft assignment of visual words to image features. We demonstrate that explicitly modeling visual word assignment ambiguity improves classification performance compared to the hard assignment of the traditional codebook model. The traditional codebook model is compared against our method on five well-known data sets: 15 natural scenes, Caltech-101, Caltech-256, and Pascal VOC 2007 and 2008. We demonstrate that large codebook vocabulary sizes completely deteriorate the performance of the traditional model, whereas the proposed model performs consistently. Moreover, we show that our method profits in high-dimensional feature spaces and reaps higher benefits when increasing the number of image categories.
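One common way to realize the soft assignment discussed above is Gaussian-kernel weighting of each descriptor against every visual word; the sketch below illustrates that general idea under assumed parameter values and is not the authors' exact formulation.

import numpy as np

def soft_assign_histogram(descriptors, codebook, sigma=1.0):
    """Build a soft-assignment bag-of-words histogram.

    Each descriptor contributes to all visual words, weighted by a Gaussian
    kernel on its distance to each word, instead of voting for one word only.
    """
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)        # per-descriptor weights sum to 1
    hist = w.sum(axis=0)
    return hist / hist.sum()                 # image-level histogram

rng = np.random.default_rng(0)
codebook = rng.normal(size=(50, 128))        # 50 visual words, 128-D descriptors
descriptors = rng.normal(size=(300, 128))    # descriptors from one image
print(soft_assign_histogram(descriptors, codebook).shape)  # (50,)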
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J
2015-12-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. (c) 2015 APA, all rights reserved.
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J.
2016-01-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. PMID:26389526
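A minimal sketch of how a pruned classification tree can be turned into inverse sampling weights for attrition; the covariates, response model, and pruning setting are illustrative assumptions, not the article's procedure.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))                      # baseline covariates
# Whether a case responds (avoids attrition) depends nonlinearly on the covariates.
p_respond = 1 / (1 + np.exp(-(0.5 * X[:, 0] + X[:, 1] * X[:, 2])))
responded = rng.random(n) < p_respond

# A "pruned" tree via a cost-complexity penalty, predicting the probability of responding.
tree = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X, responded)
p_hat = tree.predict_proba(X)[:, 1].clip(0.05, 1.0)  # clip to avoid extreme weights

# Inverse-probability weights for the respondents only.
weights = 1.0 / p_hat[responded]
print(round(weights.mean(), 2), round(weights.max(), 2))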
a Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
He, H.; Khoshelham, K.; Fraser, C.
2017-09-01
Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning capability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of the object, local features recording fractional details are more discriminative and are applicable to object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp posts, street lights and traffic signs are grouped into one category in the first-step classification because of their mutual similarity compared with trees and vehicles. A finer classification of lamp posts, street lights and traffic signs, based on the result of the first step, is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.
Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi
2017-01-01
In the long term, the performance of pattern recognition-based myoelectric control methods degrades due to a variety of interfering factors. This paper proposes an adaptive learning method with low computational cost to mitigate this effect in unsupervised adaptive learning scenarios. We present a particle adaptive classifier (PAC), constructed from a particle adaptive learning strategy and a universal incremental least squares support vector classifier (LS-SVC). We compared PAC performance with an incremental support vector classifier (ISVC) and a non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results on realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle). PMID:28608824
Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi
2017-06-13
In the long term, the performance of pattern recognition-based myoelectric control methods degrades due to a variety of interfering factors. This paper proposes an adaptive learning method with low computational cost to mitigate this effect in unsupervised adaptive learning scenarios. We present a particle adaptive classifier (PAC), constructed from a particle adaptive learning strategy and a universal incremental least squares support vector classifier (LS-SVC). We compared PAC performance with an incremental support vector classifier (ISVC) and a non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results on realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle).
Assessment of various supervised learning algorithms using different performance metrics
NASA Astrophysics Data System (ADS)
Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.
2017-11-01
Our work presents a comparison of the performance of supervised machine learning algorithms on a binary classification task. The supervised machine learning algorithms considered in this work are Support Vector Machine (SVM), Decision Tree (DT), K Nearest Neighbour (KNN), Naïve Bayes (NB) and Random Forest (RF). This paper focuses on comparing the performance of the above-mentioned algorithms on one binary classification task by analysing metrics such as accuracy, F-measure, G-measure, precision, misclassification rate, false positive rate, true positive rate, specificity and prevalence.
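The metrics listed above all follow from the binary confusion matrix; the sketch below derives them from raw counts, with G-measure taken here as the geometric mean of precision and recall (one common definition, assumed rather than taken from the paper).

def binary_metrics(tp, fp, tn, fn):
    """Derive standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                 # true positive rate / sensitivity
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "precision": precision,
        "true_positive_rate": recall,
        "false_positive_rate": fp / (fp + tn),
        "specificity": tn / (tn + fp),
        "f_measure": 2 * precision * recall / (precision + recall),
        "g_measure": (precision * recall) ** 0.5,
        "prevalence": (tp + fn) / total,
    }

print(binary_metrics(tp=40, fp=10, tn=45, fn=5))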
Automatic evidence quality prediction to support evidence-based decision making.
Sarker, Abeed; Mollá, Diego; Paris, Cécile
2015-06-01
Evidence-based medicine practice requires practitioners to obtain the best available medical evidence, and appraise the quality of the evidence when making clinical decisions. Primarily due to the plethora of electronically available data from the medical literature, the manual appraisal of the quality of evidence is a time-consuming process. We present a fully automatic approach for predicting the quality of medical evidence in order to aid practitioners at point-of-care. Our approach extracts relevant information from medical article abstracts and utilises data from a specialised corpus to apply supervised machine learning for the prediction of the quality grades. Following an in-depth analysis of the usefulness of features (e.g., publication types of articles), they are extracted from the text via rule-based approaches and from the meta-data associated with the articles, and then applied in the supervised classification model. We propose the use of a highly scalable and portable approach using a sequence of high precision classifiers, and introduce a simple evaluation metric called average error distance (AED) that simplifies the comparison of systems. We also perform elaborate human evaluations to compare the performance of our system against human judgments. We test and evaluate our approaches on a publicly available, specialised, annotated corpus containing 1132 evidence-based recommendations. Our rule-based approach performs exceptionally well at the automatic extraction of publication types of articles, with F-scores of up to 0.99 for high-quality publication types. For evidence quality classification, our approach obtains an accuracy of 63.84% and an AED of 0.271. The human evaluations show that the performance of our system, in terms of AED and accuracy, is comparable to the performance of humans on the same data. The experiments suggest that our structured text classification framework achieves evaluation results comparable to those of human performance. Our overall classification approach and evaluation technique are also highly portable and can be used for various evidence grading scales. Copyright © 2015 Elsevier B.V. All rights reserved.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data-driven feature selection and classification methods for whole-brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter-based feature selection, several embedded feature selection methods, and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training-sample classification accuracy and of the set of selected features due to independent training and test sets has not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not differ between methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones, with the difference in test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification, which, together with the good generalization performance, suggests the utility of embedded feature selection for this problem. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan
2015-10-21
The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization and thus should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data that are collected from multi-sensor signals. In this method, the optimal feature subset is constructed by a feature subset score criterion. Then the high-dimensional data are visualized in 2-dimensional space. According to the UCI dataset test, FSS-t-SNE can effectively improve the classification accuracy. An experiment was performed with a large power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine.
Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan
2015-01-01
The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization and thus should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data that are collected from multi-sensor signals. In this method, the optimal feature subset is constructed by a feature subset score criterion. Then the high-dimensional data are visualized in 2-dimensional space. According to the UCI dataset test, FSS-t-SNE can effectively improve the classification accuracy. An experiment was performed with a large power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine. PMID:26506347
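A rough sketch of the overall pipeline, feature scoring followed by a 2-dimensional t-SNE embedding of the selected subset, using an ANOVA F-score as a stand-in for the paper's feature subset score criterion; the dataset, subset size and parameter choices are assumptions.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.feature_selection import f_classif
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)

# Score each feature (ANOVA F-score used here as a simple stand-in criterion)
# and keep only the top-scoring subset before embedding.
scores, _ = f_classif(X, y)
scores = np.nan_to_num(scores)
keep = np.argsort(scores)[-32:]

embedding = TSNE(n_components=2, random_state=0, init="pca").fit_transform(X[:, keep])
print(embedding.shape)  # (n_samples, 2): ready for a 2-D scatter plot colored by class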
NASA Astrophysics Data System (ADS)
Golick, Douglas A.; Heng-Moss, Tiffany M.; Steckelberg, Allen L.; Brooks, David. W.; Higley, Leon G.; Fowler, David
2013-08-01
The purpose of the study was to determine whether undergraduate students receiving web-based instruction based on traditional, key character, or classification instruction differed in their performance of insect identification tasks. All groups showed a significant improvement in insect identification on pre- and post-two-dimensional picture specimen quizzes. The study also determined that student performance was poorer on family-level identification tasks than on broader insect order and arthropod classification tasks. Finally, students erred significantly more by misidentifying than by misspelling specimen names on prepared specimen quizzes. Results of this study support that short web-based insect identification exercises can improve insect identification performance. Also included is a discussion of how these results can be used in teaching and in future research on biological identification.
A novel application of deep learning for single-lead ECG classification.
Mathews, Sherin M; Kambhamettu, Chandra; Barner, Kenneth E
2018-06-04
Detecting and classifying cardiac arrhythmias is critical to the diagnosis of patients with cardiac abnormalities. In this paper, a novel approach based on deep learning methodology is proposed for the classification of single-lead electrocardiogram (ECG) signals. We demonstrate the application of the Restricted Boltzmann Machine (RBM) and deep belief networks (DBN) for ECG classification following detection of ventricular and supraventricular heartbeats using single-lead ECG. The effectiveness of the proposed algorithm is illustrated using real ECG signals from the widely used MIT-BIH database. Simulation results demonstrate that with a suitable choice of parameters, RBM and DBN can achieve high average recognition accuracies for ventricular ectopic beats (93.63%) and supraventricular ectopic beats (95.57%) at a low sampling rate of 114 Hz. Experimental results indicate that classifiers built on this deep learning framework achieved state-of-the-art performance at lower sampling rates and with simpler features when compared to traditional methods. Further, features extracted at a sampling rate of 114 Hz, when combined with deep learning, provided enough discriminatory power for the classification task. This performance is comparable to that of traditional methods while using a much lower sampling rate and simpler features. Thus, our proposed deep neural network algorithm demonstrates that deep learning-based methods offer accurate ECG classification and could potentially be extended to other physiological signal classifications, such as those in arterial blood pressure (ABP), nerve conduction (EMG), and heart rate variability (HRV) studies. Copyright © 2018. Published by Elsevier Ltd.
Emotion recognition based on physiological changes in music listening.
Kim, Jonghwa; André, Elisabeth
2008-12-01
Little attention has been paid so far to physiological signals for emotion recognition compared to audiovisual emotion channels such as facial expression or speech. This paper investigates the potential of physiological signals as reliable channels for emotion recognition. All essential stages of an automatic recognition system are discussed, from the recording of a physiological dataset to a feature-based multiclass classification. In order to collect a physiological dataset from multiple subjects over many weeks, we used a musical induction method which spontaneously leads subjects to real emotional states, without any deliberate lab setting. Four-channel biosensors were used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to find the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is proven by classification results. Classification of four musical emotions (positive/high arousal, negative/high arousal, negative/low arousal, positive/low arousal) is performed by using an extended linear discriminant analysis (pLDA). Furthermore, by exploiting a dichotomic property of the 2D emotion model, we develop a novel scheme of emotion-specific multilevel dichotomous classification (EMDC) and compare its performance with direct multiclass classification using the pLDA. Improved recognition accuracy of 95% and 70% for subject-dependent and subject-independent classification, respectively, is achieved by using the EMDC scheme.
Online clustering algorithms for radar emitter classification.
Liu, Jun; Lee, Jim P Y; Senior; Li, Lingjie; Luo, Zhi-Quan; Wong, K Max
2005-08-01
Radar emitter classification is a special application of data clustering for classifying unknown radar emitters from received radar pulse samples. The main challenges of this task are the high dimensionality of radar pulse samples, small sample group size, and closely located radar pulse clusters. In this paper, two new online clustering algorithms are developed for radar emitter classification: One is model-based using the Minimum Description Length (MDL) criterion and the other is based on competitive learning. Computational complexity is analyzed for each algorithm and then compared. Simulation results show the superior performance of the model-based algorithm over competitive learning in terms of better classification accuracy, flexibility, and stability.
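A minimal sketch of the competitive-learning style of online clustering mentioned above: each incoming pulse sample moves its nearest prototype slightly toward itself; the learning rate, prototype count and data are assumptions for illustration.

import numpy as np

class CompetitiveLearningClusterer:
    """Online clustering: the winning prototype is nudged toward each new sample."""

    def __init__(self, n_clusters, n_features, learning_rate=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.prototypes = rng.normal(size=(n_clusters, n_features))
        self.lr = learning_rate

    def partial_fit(self, x):
        distances = np.linalg.norm(self.prototypes - x, axis=1)
        winner = int(np.argmin(distances))
        self.prototypes[winner] += self.lr * (x - self.prototypes[winner])
        return winner  # cluster label assigned to this pulse sample

rng = np.random.default_rng(1)
clusterer = CompetitiveLearningClusterer(n_clusters=3, n_features=5)
stream = rng.normal(size=(200, 5)) + rng.integers(0, 3, 200)[:, None] * 4.0
labels = [clusterer.partial_fit(x) for x in stream]
print(labels[:10])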
Johnson, Nathan T; Dhroso, Andi; Hughes, Katelyn J; Korkin, Dmitry
2018-06-25
The extent to which genes are expressed in the cell can be simplistically defined as a function of one or more factors of the environment, lifestyle, and genetics. RNA sequencing (RNA-Seq) is becoming a prevalent approach to quantify gene expression and is expected to yield better insights into a number of biological and biomedical questions than DNA microarrays. Most importantly, RNA-Seq allows expression to be quantified at the gene and alternative-splicing isoform levels. However, leveraging RNA-Seq data requires the development of new data mining and analytics methods. Supervised machine learning methods are commonly used approaches for biological data analysis and have recently gained attention for their applications to RNA-Seq data. In this work, we assess the utility of supervised learning methods trained on RNA-Seq data for a diverse range of biological classification tasks. We hypothesize that the isoform-level expression data is more informative for biological classification tasks than the gene-level expression data. Our large-scale assessment is done by utilizing multiple datasets, organisms, lab groups, and RNA-Seq analysis pipelines. Overall, we performed and assessed 61 biological classification problems that leverage three independent RNA-Seq datasets and include over 2,000 samples from multiple organisms, lab groups, and RNA-Seq analyses. These 61 problems include predictions of the tissue type, sex, or age of the sample, healthy or cancerous phenotypes, and the pathological tumor stage for samples from cancerous tissue. For each classification problem, the performance of three normalization techniques and six machine learning classifiers was explored. We find that for every single classification problem, the isoform-based classifiers outperform or are comparable with the gene expression based methods. The top-performing supervised learning techniques reached near-perfect classification accuracy, demonstrating the utility of supervised learning for RNA-Seq based data analysis. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Joint Concept Correlation and Feature-Concept Relevance Learning for Multilabel Classification.
Zhao, Xiaowei; Ma, Zhigang; Li, Zhi; Li, Zhihui
2018-02-01
In recent years, multilabel classification has attracted significant attention in multimedia annotation. However, most multilabel classification methods focus only on the inherent correlations existing among multiple labels and concepts and ignore the relevance between features and the target concepts. To obtain more robust multilabel classification results, we propose a new multilabel classification method that aims to capture the correlations among multiple concepts by leveraging a hypergraph, which has proved beneficial for relational learning. Moreover, we consider mining feature-concept relevance, which is often overlooked by many multilabel learning algorithms. To better expose the feature-concept relevance, we impose a sparsity constraint on the proposed method. We compare the proposed method with several other multilabel classification methods and evaluate the classification performance by mean average precision on several data sets. The experimental results show that the proposed method outperforms the state-of-the-art methods.
Perinatal mortality classification: an analysis of 112 cases of stillbirth.
Reis, Ana Paula; Rocha, Ana; Lebre, Andrea; Ramos, Umbelina; Cunha, Ana
2017-10-01
This was a retrospective cohort analysis of stillbirths that occurred from January 2004 to December 2013 in our institution. We compared the Tulip and Wigglesworth classification systems on a cohort of stillbirths and analysed the main differences between these two classifications. In this period, there were 112 stillbirths among a total of 31,758 births (a stillbirth rate of 3.5 per 1000 births). There were 99 antepartum deaths and 13 intrapartum deaths. Foetal autopsy was performed in 99 cases and placental histopathological examination in all cases. The Wigglesworth classification found 'unknown' causes in 47 cases, and the Tulip classification allocated 33 of these. Fourteen cases remained in the group of 'unknown' causes. Therefore, the Wigglesworth classification of stillbirths results in a higher proportion of unexplained stillbirths. We suggest that the traditional Wigglesworth classification should be substituted by a classification that manages the available information.
Classification of EEG signals using a genetic-based machine learning classifier.
Skinner, B T; Nguyen, H T; Liu, D K
2007-01-01
This paper investigates the efficacy of the genetic-based learning classifier system XCS, for the classification of noisy, artefact-inclusive human electroencephalogram (EEG) signals represented using large condition strings (108 bits). EEG signals from three participants were recorded while they performed four mental tasks designed to elicit hemispheric responses. Autoregressive (AR) models and Fast Fourier Transform (FFT) methods were used to form feature vectors with which mental tasks can be discriminated. XCS achieved a maximum classification accuracy of 99.3% and a best average of 88.9%. The relative classification performance of XCS was then compared against four non-evolutionary classifier systems originating from different learning techniques. The experimental results will be used as part of our larger research effort investigating the feasibility of using EEG signals as an interface to allow paralysed persons to control a powered wheelchair or other devices.
Feature Selection for Ridge Regression with Provable Guarantees.
Paul, Saurabh; Drineas, Petros
2016-04-01
We introduce single-set spectral sparsification as a deterministic sampling-based feature selection technique for regularized least-squares classification, which is the classification analog to ridge regression. The method is unsupervised and gives worst-case guarantees of the generalization power of the classification function after feature selection with respect to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised randomized feature selection method for ridge regression. We provide risk bounds for both single-set spectral sparsification and leverage-score sampling on ridge regression in the fixed design setting and show that the risk in the sampled space is comparable to the risk in the full-feature space. We perform experiments on synthetic and real-world data sets; a subset of TechTC-300 data sets, to support our theory. Experimental results indicate that the proposed methods perform better than the existing feature selection methods.
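A small sketch of leverage-score feature sampling for regularized least-squares classification or ridge regression, with leverage scores computed from the right singular vectors of the data matrix; the plain (non-ridge) leverage scores and the sampling size are simplifying assumptions.

import numpy as np

def feature_leverage_scores(A):
    """Leverage scores of the columns (features) of A from its right singular vectors."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    scores = (Vt ** 2).sum(axis=0)
    return scores / scores.sum()

def sample_features(A, n_keep, seed=0):
    """Randomly keep n_keep features with probability proportional to leverage score."""
    p = feature_leverage_scores(A)
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(A.shape[1], size=n_keep, replace=False, p=p))

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 500))          # 200 samples, 500 features
kept = sample_features(A, n_keep=50)
A_reduced = A[:, kept]                   # feed this to ridge regression / RLS classification
print(A_reduced.shape)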
A data set for evaluating the performance of multi-class multi-object video tracking
NASA Astrophysics Data System (ADS)
Chakraborty, Avishek; Stamatescu, Victor; Wong, Sebastien C.; Wigley, Grant; Kearney, David
2017-05-01
One of the challenges in evaluating multi-object video detection, tracking and classification systems is having publicly available data sets with which to compare different systems. However, the measures of performance for tracking and classification are different. Data sets that are suitable for evaluating tracking systems may not be appropriate for classification. Tracking video data sets typically only have ground truth track IDs, while classification video data sets only have ground truth class-label IDs. The former identifies the same object over multiple frames, while the latter identifies the type of object in individual frames. This paper describes an advancement of the ground truth meta-data for the DARPA Neovision2 Tower data set to allow both the evaluation of tracking and classification. The ground truth data sets presented in this paper contain unique object IDs across 5 different classes of object (Car, Bus, Truck, Person, Cyclist) for 24 videos of 871 image frames each. In addition to the object IDs and class labels, the ground truth data also contains the original bounding box coordinates together with new bounding boxes in instances where un-annotated objects were present. The unique IDs are maintained during occlusions between multiple objects or when objects re-enter the field of view. This will provide: a solid foundation for evaluating the performance of multi-object tracking of different types of objects, a straightforward comparison of tracking system performance using the standard Multi Object Tracking (MOT) framework, and classification performance using the Neovision2 metrics. These data have been hosted publicly.
Cong, Rui; Li, Jing; Guo, Song
2017-02-01
To examine the efficacy of qualitative shear wave elastography (SWE) in the classification and evaluation of solid breast masses, and to compare this method with conventional ultrasonography (US), quantitative SWE parameters, and a previously proposed qualitative SWE classification. From April 2015 to March 2016, 314 consecutive females with 325 breast masses who decided to undergo core needle biopsy and/or surgical biopsy were enrolled. Conventional US and SWE were performed in all enrolled subjects prior to biopsy. Each mass was classified by two different qualitative classifications. One was established in our study, herein named Qual1. Qual1 classifies the SWE images into five color patterns by visual evaluation: Color pattern 1 (homogeneous pattern); Color pattern 2 (comparative homogeneous pattern); Color pattern 3 (irregularly heterogeneous pattern); Color pattern 4 (intralesional echo pattern); and Color pattern 5 (stiff rim sign pattern). The second qualitative classification, named Qual2 here, is a four-color overlay pattern classification (Tozaki and Fukuma, Acta Radiologica, 2011). The Breast Imaging Reporting and Data System (BI-RADS) assessment and quantitative SWE parameters were recorded. Diagnostic performances of conventional US, SWE parameters, and combinations of US and SWE parameters were compared. With pathological results as the gold standard, of the 325 examined breast masses, 139 (42.77%) were malignant and 186 (57.23%) were benign. Qual1 showed a higher Az value than Qual2 and the quantitative SWE parameters (all P<0.05). When applying Qual1=Color pattern 1 for downgrading and Qual1=Color pattern 5 for upgrading the BI-RADS categories, we obtained the highest Az value (0.951) and achieved a significantly higher specificity (86.56%, P=0.002) than that of US (81.18%) with the same sensitivity (94.96%). The qualitative classification proposed in this study may be representative of SWE parameters and has the potential to provide relevant assistance in breast mass diagnosis. Copyright © 2016. Published by Elsevier B.V.
Multiple-rule bias in the comparison of classification rules
Yousefi, Mohammadmahdi R.; Hua, Jianping; Dougherty, Edward R.
2011-01-01
Motivation: There is growing discussion in the bioinformatics community concerning overoptimism of reported results. Two approaches contributing to overoptimism in classification are (i) the reporting of results on datasets for which a proposed classification rule performs well and (ii) the comparison of multiple classification rules on a single dataset that purports to show the advantage of a certain rule. Results: This article provides a careful probabilistic analysis of the second issue and the ‘multiple-rule bias’, resulting from choosing a classification rule having minimum estimated error on the dataset. It quantifies this bias corresponding to estimating the expected true error of the classification rule possessing minimum estimated error and it characterizes the bias from estimating the true comparative advantage of the chosen classification rule relative to the others by the estimated comparative advantage on the dataset. The analysis is applied to both synthetic and real data using a number of classification rules and error estimators. Availability: We have implemented in C code the synthetic data distribution model, classification rules, feature selection routines and error estimation methods. The code for multiple-rule analysis is implemented in MATLAB. The source code is available at http://gsp.tamu.edu/Publications/supplementary/yousefi11a/. Supplementary simulation results are also included. Contact: edward@ece.tamu.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21546390
NASA Astrophysics Data System (ADS)
Klomp, Sander; van der Sommen, Fons; Swager, Anne-Fré; Zinger, Svitlana; Schoon, Erik J.; Curvers, Wouter L.; Bergman, Jacques J.; de With, Peter H. N.
2017-03-01
Volumetric Laser Endomicroscopy (VLE) is a promising technique for the detection of early neoplasia in Barrett's Esophagus (BE). VLE generates hundreds of high-resolution, grayscale, cross-sectional images of the esophagus. However, at present, classifying these images is a time-consuming and cumbersome effort performed by an expert using a clinical prediction model. This paper explores the feasibility of using computer vision techniques to accurately predict the presence of dysplastic tissue in VLE BE images. Our contribution is threefold. First, a benchmark of widely applied machine learning techniques and feature extraction methods is performed. Second, three new features based on the clinical detection model are proposed, offering superior classification accuracy and speed compared to earlier work. Third, we evaluate automated parameter tuning by applying simple grid search and feature selection methods. The results are evaluated on a clinically validated dataset of 30 dysplastic and 30 non-dysplastic VLE images. Optimal classification accuracy is obtained by applying a support vector machine with our modified Haralick features and optimal image cropping, obtaining an area under the receiver operating characteristic curve of 0.95, compared to 0.81 for the clinical prediction model. Optimal execution time is achieved using the proposed mean and median features, which are extracted at least a factor of 2.5 faster than alternative features with comparable performance.
ERIC Educational Resources Information Center
Strecht, Pedro; Cruz, Luís; Soares, Carlos; Mendes-Moreira, João; Abreu, Rui
2015-01-01
Predicting the success or failure of a student in a course or program is a problem that has recently been addressed using data mining techniques. In this paper we evaluate some of the most popular classification and regression algorithms on this problem. We address two problems: prediction of approval/failure and prediction of grade. The former is…
ERIC Educational Resources Information Center
Wei, Tam Thi Dang
This study examines the differences in classificatory performance of children from middle class (MC) and from culturally deprived (CD) backgrounds at kindergarten and second grade levels. It was hypothesized that: (a) the ability to classify increases with age, and (b) CD children would score lower on tasks of classification than children in MC groups…
2013-01-01
Background Gene expression data can be of great help in the development of efficient cancer diagnosis and classification platforms. Recently, many researchers have analyzed gene expression data using diverse computational intelligence methods to select a small subset of informative genes for cancer classification. Many computational methods face difficulties in selecting small subsets due to the small number of samples compared to the huge number of genes (high dimensionality), irrelevant genes, and noisy genes. Methods We propose an enhanced binary particle swarm optimization to perform the selection of small subsets of informative genes, which is significant for cancer classification. Particle speed, a rule, and a modified sigmoid function are introduced in the proposed method to increase the probability that the bits in a particle's position are zero. The method was empirically applied to a suite of ten well-known benchmark gene expression data sets. Results The performance of the proposed method proved to be superior to previous related work, including the conventional version of binary particle swarm optimization (BPSO), in terms of classification accuracy and the number of selected genes. The proposed method also requires lower computational time compared to BPSO. PMID:23617960
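A compact sketch of the standard binary PSO bit update that the proposed enhancement builds on: velocities are squashed through a sigmoid and bits are sampled accordingly. The fitness function and constants are placeholders, and the paper's modified sigmoid and particle-speed rule are not reproduced here.

import numpy as np

rng = np.random.default_rng(0)
n_particles, n_genes = 20, 100
w, c1, c2, v_max = 0.7, 1.5, 1.5, 4.0

def fitness(mask):
    # Placeholder: reward selecting few genes (a real run would use classification accuracy).
    return -mask.sum()

position = rng.integers(0, 2, (n_particles, n_genes))
velocity = rng.uniform(-1, 1, (n_particles, n_genes))
pbest, pbest_fit = position.copy(), np.array([fitness(p) for p in position])
gbest = pbest[np.argmax(pbest_fit)].copy()

for _ in range(50):
    r1, r2 = rng.random(position.shape), rng.random(position.shape)
    velocity = np.clip(w * velocity + c1 * r1 * (pbest - position)
                       + c2 * r2 * (gbest - position), -v_max, v_max)
    prob_one = 1.0 / (1.0 + np.exp(-velocity))            # standard sigmoid transfer
    position = (rng.random(position.shape) < prob_one).astype(int)
    fit = np.array([fitness(p) for p in position])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = position[improved], fit[improved]
    gbest = pbest[np.argmax(pbest_fit)].copy()

print("genes selected by best particle:", int(gbest.sum()))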
Fine-Granularity Functional Interaction Signatures for Characterization of Brain Conditions
Hu, Xintao; Zhu, Dajiang; Lv, Peili; Li, Kaiming; Han, Junwei; Wang, Lihong; Shen, Dinggang; Guo, Lei; Liu, Tianming
2014-01-01
In the human brain, functional activity occurs at multiple spatial scales. Current studies on functional brain networks and their alterations in brain diseases via resting-state functional magnetic resonance imaging (rs-fMRI) are generally either at local scale (regionally confined analysis and inter-regional functional connectivity analysis) or at global scale (graph theoretic analysis). In contrast, inferring functional interaction at fine-granularity sub-network scale has not been adequately explored yet. Here our hypothesis is that functional interaction measured at fine-granularity sub-network scale can provide new insight into the neural mechanisms of neurological and psychological conditions, thus offering complementary information for healthy and diseased population classification. In this paper, we derived fine-granularity functional interaction (FGFI) signatures in subjects with Mild Cognitive Impairment (MCI) and Schizophrenia by diffusion tensor imaging (DTI) and rs-fMRI, and used patient-control classification experiments to evaluate the distinctiveness of the derived FGFI features. Our experimental results have shown that the FGFI features alone can achieve classification performance comparable to that of the commonly used inter-regional connectivity features. However, the classification performance can be substantially improved when FGFI features and inter-regional connectivity features are integrated, suggesting the complementary information achieved from the FGFI signatures. PMID:23319242
Expected energy-based restricted Boltzmann machine for classification.
Elfwing, S; Uchibe, E; Doya, K
2015-04-01
In classification tasks, restricted Boltzmann machines (RBMs) have predominantly been used in the first stage, either as feature extractors or to provide initialization of neural networks. In this study, we propose a discriminative learning approach to provide a self-contained RBM method for classification, inspired by free-energy based function approximation (FE-RBM), originally proposed for reinforcement learning. For classification, the FE-RBM method computes the output for an input vector and a class vector by the negative free energy of an RBM. Learning is achieved by stochastic gradient descent using a mean-squared error training objective. In an earlier study, we demonstrated that the performance and the robustness of FE-RBM function approximation can be improved by scaling the free energy by a constant that is related to the size of the network. In this study, we propose that the learning performance of RBM function approximation can be further improved by computing the output by the negative expected energy (EE-RBM), instead of the negative free energy. To create a deep learning architecture, we stack several RBMs on top of each other. We also connect the class nodes to all hidden layers to try to improve the performance even further. We validate the classification performance of EE-RBM using the MNIST data set and the NORB data set, achieving competitive performance compared with other classifiers such as standard neural networks, deep belief networks, classification RBMs, and support vector machines. The purpose of using the NORB data set is to demonstrate that EE-RBM with binary input nodes can achieve high performance in the continuous input domain. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
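For orientation, the difference between scoring by negative free energy (FE-RBM) and by negative expected energy (EE-RBM) can be written down directly for a binary RBM with class nodes appended to the input; the network sizes and random weights below are placeholders, not a trained model.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_free_energy(v, W, b_vis, c_hid):
    """FE-RBM style output: -F(v) = b.v + sum_j softplus(c_j + W_j.v)."""
    z = c_hid + W @ v
    return b_vis @ v + np.logaddexp(0.0, z).sum()

def negative_expected_energy(v, W, b_vis, c_hid):
    """EE-RBM style output: -E[E(v,h)] = b.v + sum_j sigmoid(z_j) * z_j."""
    z = c_hid + W @ v
    return b_vis @ v + (sigmoid(z) * z).sum()

# Score each class by concatenating its one-hot code with the input vector.
rng = np.random.default_rng(0)
n_vis, n_classes, n_hid = 784, 10, 64
W = rng.normal(scale=0.01, size=(n_hid, n_vis + n_classes))
b_vis, c_hid = np.zeros(n_vis + n_classes), np.zeros(n_hid)

x = rng.integers(0, 2, n_vis).astype(float)
scores = []
for k in range(n_classes):
    y = np.zeros(n_classes)
    y[k] = 1.0
    scores.append(negative_expected_energy(np.concatenate([x, y]), W, b_vis, c_hid))
print("predicted class:", int(np.argmax(scores)))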
A Discriminative Approach to EEG Seizure Detection
Johnson, Ashley N.; Sow, Daby; Biem, Alain
2011-01-01
Seizures are abnormal sudden discharges in the brain with signatures represented in electroencephalograms (EEG). The efficacy of applying speech processing techniques to discriminate between seizure and non-seizure states in EEGs is reported. The approach accounts for the challenges of unbalanced datasets (seizure and non-seizure), while also showing a system capable of real-time seizure detection. The Minimum Classification Error (MCE) algorithm, a discriminative learning algorithm in wide use in speech processing, is applied and compared with conventional classification techniques that have already been applied to the discrimination between seizure and non-seizure states in the literature. The system is evaluated on multi-channel EEG recordings from 22 pediatric patients. Experimental results show that the application of speech processing techniques and MCE compares favorably with conventional classification techniques in terms of classification performance, while requiring less computational overhead. The results strongly suggest the possibility of deploying the designed system at the bedside. PMID:22195192
A Discriminant Distance Based Composite Vector Selection Method for Odor Classification
Choi, Sang-Il; Jeong, Gu-Min
2014-01-01
We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from an electronic nose data sample is evaluated by computing the discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished, and the final composite features for odor classification are extracted using the selected composite vectors. Using only the informative composite vectors, rather than all generated composite vectors, also helps to extract better composite features. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods. PMID:24747735
Early classification of pathological heartbeats on wireless body sensor nodes.
Braojos, Rubén; Beretta, Ivan; Ansaloni, Giovanni; Atienza, David
2014-11-27
Smart Wireless Body Sensor Nodes (WBSNs) are a novel class of unobtrusive, battery-powered devices allowing the continuous monitoring and real-time interpretation of a subject's bio-signals, such as the electrocardiogram (ECG). These low-power platforms, while able to perform advanced signal processing to extract information on heart conditions, are usually constrained in terms of computational power and transmission bandwidth. It is therefore essential to identify in the early stages which parts of an ECG are critical for the diagnosis and, only in these cases, activate on demand more detailed and computationally intensive analysis algorithms. In this work, we present a comprehensive framework for real-time automatic classification of normal and abnormal heartbeats, targeting embedded and resource-constrained WBSNs. In particular, we provide a comparative analysis of different strategies to reduce the heartbeat representation dimensionality, and therefore the required computational effort. We then combine these techniques with a neuro-fuzzy classification strategy, which effectively discerns normal and pathological heartbeats with a minimal run time and memory overhead. We prove that, by performing a detailed analysis only on the heartbeats that our classifier identifies as abnormal, a WBSN system can drastically reduce its overall energy consumption. Finally, we assess the choice of neuro-fuzzy classification by comparing its performance and workload with respect to other state-of-the-art strategies. Experimental results using the MIT-BIH Arrhythmia database show energy savings of as much as 60% in the signal processing stage, and 63% in the subsequent wireless transmission, when a neuro-fuzzy classification structure is employed, coupled with a dimensionality reduction technique based on random projections.
Early Classification of Pathological Heartbeats on Wireless Body Sensor Nodes
Braojos, Rubén; Beretta, Ivan; Ansaloni, Giovanni; Atienza, David
2014-01-01
Smart Wireless Body Sensor Nodes (WBSNs) are a novel class of unobtrusive, battery-powered devices allowing the continuous monitoring and real-time interpretation of a subject's bio-signals, such as the electrocardiogram (ECG). These low-power platforms, while able to perform advanced signal processing to extract information on heart conditions, are usually constrained in terms of computational power and transmission bandwidth. It is therefore essential to identify in the early stages which parts of an ECG are critical for the diagnosis and, only in these cases, activate on demand more detailed and computationally intensive analysis algorithms. In this work, we present a comprehensive framework for real-time automatic classification of normal and abnormal heartbeats, targeting embedded and resource-constrained WBSNs. In particular, we provide a comparative analysis of different strategies to reduce the heartbeat representation dimensionality, and therefore the required computational effort. We then combine these techniques with a neuro-fuzzy classification strategy, which effectively discerns normal and pathological heartbeats with a minimal run time and memory overhead. We prove that, by performing a detailed analysis only on the heartbeats that our classifier identifies as abnormal, a WBSN system can drastically reduce its overall energy consumption. Finally, we assess the choice of neuro-fuzzy classification by comparing its performance and workload with respect to other state-of-the-art strategies. Experimental results using the MIT-BIH Arrhythmia database show energy savings of as much as 60% in the signal processing stage, and 63% in the subsequent wireless transmission, when a neuro-fuzzy classification structure is employed, coupled with a dimensionality reduction technique based on random projections. PMID:25436654
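The random-projection dimensionality reduction mentioned above amounts to a single multiplication of each beat window by a random matrix, as in the sketch below; the beat length, target dimension and Gaussian scaling are assumptions, not the paper's exact configuration.

import numpy as np

def random_projection_matrix(d_in, d_out, seed=0):
    """Gaussian random projection; the scaling roughly preserves pairwise distances."""
    rng = np.random.default_rng(seed)
    return rng.normal(scale=1.0 / np.sqrt(d_out), size=(d_out, d_in))

beat_len, reduced_dim = 128, 16                  # e.g. 128 samples per heartbeat window
R = random_projection_matrix(beat_len, reduced_dim)

rng = np.random.default_rng(1)
heartbeats = rng.normal(size=(1000, beat_len))   # placeholder beat windows
features = heartbeats @ R.T                      # low-dimensional input for the classifier
print(features.shape)                            # (1000, 16)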
Single-trial EEG RSVP classification using convolutional neural networks
NASA Astrophysics Data System (ADS)
Shamwell, Jared; Lee, Hyungtae; Kwon, Heesung; Marathe, Amar R.; Lawhern, Vernon; Nothwang, William
2016-05-01
Traditionally, Brain-Computer Interfaces (BCI) have been explored as a means to return function to paralyzed or otherwise debilitated individuals. An emerging use for BCIs is in human-autonomy sensor fusion, where physiological data from healthy subjects are combined with machine-generated information to enhance the capabilities of artificial systems. While human-autonomy fusion of physiological data and computer vision has been shown to improve classification during visual search tasks, to date these approaches have relied on separately trained classification models for each modality. We aim to improve human-autonomy classification performance by developing a single framework that builds codependent models of human electroencephalograph (EEG) and image data to generate fused target estimates. As a first step, we developed a novel convolutional neural network (CNN) architecture and applied it to EEG recordings of subjects classifying target and non-target image presentations during a rapid serial visual presentation (RSVP) image triage task. The low signal-to-noise ratio (SNR) of EEG inherently limits the accuracy of single-trial classification, and when combined with the high dimensionality of EEG recordings, extremely large training sets are needed to prevent overfitting and to achieve accurate classification from raw EEG data. This paper explores a new deep CNN architecture for generalized multi-class, single-trial EEG classification across subjects. We compare classification performance of the generalized CNN architecture trained across all subjects with the individualized XDAWN, HDCA, and CSP neural classifiers, which are trained and tested on single subjects. Preliminary results show that our CNN meets and slightly exceeds the performance of the other classifiers despite being trained across subjects.
Training strategy for convolutional neural networks in pedestrian gender classification
NASA Astrophysics Data System (ADS)
Ng, Choon-Boon; Tay, Yong-Haur; Goi, Bok-Min
2017-06-01
In this work, we studied a strategy for training a convolutional neural network in pedestrian gender classification with a limited amount of labeled training data. Unsupervised learning by k-means clustering on pedestrian images was used to learn the filters that initialize the first layer of the network. As a form of pre-training, supervised learning for the related task of pedestrian classification was performed. Finally, the network was fine-tuned for gender classification. We found that this strategy improved the network's generalization ability in gender classification, achieving better test results than random weight initialization and a slight improvement over merely initializing the first-layer filters by unsupervised learning. This shows that unsupervised learning followed by pre-training with pedestrian images is an effective strategy to learn useful features for pedestrian gender classification.
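The filter-initialization idea can be sketched as follows: cluster small unlabeled image patches with k-means and reuse the centroids as first-layer convolution filters. Patch size, filter count, and the normalization are illustrative assumptions, not the exact settings of the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.image import extract_patches_2d

# Illustrative: learn 32 first-layer 5x5 filters from unlabeled grayscale pedestrian crops.
rng = np.random.default_rng(0)
images = rng.random((100, 64, 32))            # stand-in for unlabeled pedestrian images

patches = np.vstack([extract_patches_2d(img, (5, 5), max_patches=50, random_state=0)
                     .reshape(-1, 25) for img in images])
patches -= patches.mean(axis=1, keepdims=True)           # simple per-patch normalization

filters = KMeans(n_clusters=32, n_init=10, random_state=0).fit(patches).cluster_centers_
filters = filters.reshape(32, 5, 5)           # use as initial weights of the first conv layer
```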
Median Robust Extended Local Binary Pattern for Texture Classification.
Liu, Li; Lao, Songyang; Fieguth, Paul W; Guo, Yulan; Wang, Xiaogang; Pietikäinen, Matti
2016-03-01
Local binary patterns (LBP) are considered among the most computationally efficient high-performance texture features. However, the LBP method is very sensitive to image noise and is unable to capture macrostructure information. To address these disadvantages, in this paper we introduce a novel descriptor for texture classification, the median robust extended LBP (MRELBP). Different from the traditional LBP and many LBP variants, MRELBP compares regional image medians rather than raw image intensities. A multiscale LBP-type descriptor is computed by efficiently comparing image medians over a novel sampling scheme, which can capture both microstructure and macrostructure texture information. A comprehensive evaluation on benchmark data sets reveals MRELBP's high performance: it is robust to gray-scale variations, rotation changes, and noise, yet comes at a low computational cost. MRELBP produces the best classification scores of 99.82%, 99.38%, and 99.77% on three popular Outex test suites. More importantly, MRELBP is shown to be highly robust to image noise, including Gaussian noise, Gaussian blur, salt-and-pepper noise, and random pixel corruption.
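MRELBP itself compares regional medians over multiple scales; the sketch below shows only the classic LBP histogram it extends, computed with scikit-image, as a baseline texture feature. The neighborhood parameters are illustrative.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(image, P=8, R=1):
    """Classic rotation-invariant uniform LBP histogram (the baseline that MRELBP extends)."""
    codes = local_binary_pattern(image, P, R, method="uniform")
    n_bins = P + 2                        # the 'uniform' mapping yields P + 2 distinct codes
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

texture = (np.random.default_rng(0).random((128, 128)) * 255).astype(np.uint8)
print(lbp_histogram(texture))             # feature vector for a texture classifier
```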
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
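Among the evaluated features, CSP is compact enough to sketch: the spatial filters come from a generalized eigendecomposition of the two class covariance matrices, and log-variances of the spatially filtered signals form the feature vector. This is a minimal NumPy/SciPy version under those assumptions, not the preprocessing pipeline used in the study.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_pairs=3):
    """Minimal CSP: trials_* have shape (n_trials, n_channels, n_samples)."""
    cov = lambda T: np.mean([x @ x.T / np.trace(x @ x.T) for x in T], axis=0)
    Ca, Cb = cov(trials_a), cov(trials_b)
    eigvals, eigvecs = eigh(Ca, Ca + Cb)             # generalized eigenvalue problem
    order = np.argsort(eigvals)                      # extreme eigenvalues maximize the variance ratio
    picks = np.r_[order[:n_pairs], order[-n_pairs:]]
    return eigvecs[:, picks].T                       # (2*n_pairs, n_channels)

def csp_features(trial, W):
    z = W @ trial                                    # spatially filtered signals
    var = z.var(axis=1)
    return np.log(var / var.sum())                   # log-variance features for a classifier

rng = np.random.default_rng(0)
left, right = rng.standard_normal((2, 30, 22, 200))  # toy data: 30 trials, 22 channels, 200 samples
W = csp_filters(left, right)
X = np.array([csp_features(t, W) for t in np.concatenate([left, right])])
```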
Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A
2015-06-01
Naturally inspired evolutionary algorithms have proven effective for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to microarray gene expression profiles in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique, mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and with Particle Swarm Optimization (mRMR-PSO). In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This shows that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Reduced Power Laser Designation Systems
2009-01-10
Abstract (fragment): buffering of the input stage; comparing the noise performance of the candidate amplifier designs; selection of the two-transistor bootstrap design as the circuit of choice; and comparing the performance of this circuit against that of a basic transconductance amplifier. Subject terms: laser-guided weapons; laser designation; laser rangefinders; infrared photodiodes; transconductance amplifiers.
On the use of interaction error potentials for adaptive brain computer interfaces.
Llera, A; van Gerven, M A J; Gómez, V; Jensen, O; Kappen, H J
2011-12-01
We propose an adaptive classification method for Brain Computer Interfaces (BCI) which uses Interaction Error Potentials (IErrPs) as a reinforcement signal and adapts the classifier parameters when an error is detected. We analyze the quality of the proposed approach in relation to the misclassification of the IErrPs. In addition, we compare static versus adaptive classification performance using artificial and MEG data. We show that the proposed adaptive framework significantly improves on the static classification methods. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Keyport, Ren N.; Oommen, Thomas; Martha, Tapas R.; Sajinkumar, K. S.; Gierke, John S.
2018-02-01
A comparative analysis of landslides detected by pixel-based and object-oriented analysis (OOA) methods was performed using very high-resolution (VHR) remotely sensed aerial images of San Juan La Laguna, Guatemala, which witnessed widespread devastation during Hurricane Stan in 2005. A 3-band orthophoto of 0.5 m spatial resolution together with a field-based inventory of 115 landslides were used for the analysis. A binary reference was assigned with a zero value for landslide and unity for non-landslide pixels. The pixel-based analysis was performed using unsupervised classification, which resulted in 11 different trial classes. Detection of landslides using OOA included 2-step K-means clustering to eliminate regions based on brightness, and elimination of false positives using object properties such as rectangular fit, compactness, length/width ratio, mean difference of objects, and slope angle. Both overall accuracy and F-score for the OOA method outperformed the pixel-based unsupervised classification method in both landslide and non-landslide classes. The overall accuracy for OOA and pixel-based unsupervised classification was 96.5% and 94.3%, respectively, whereas the best F-scores for landslide identification for the OOA and pixel-based unsupervised methods were 84.3% and 77.9%, respectively. Results indicate that OOA is able to identify the majority of landslides with few false positives when compared to pixel-based unsupervised classification.
NASA Astrophysics Data System (ADS)
Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen
2017-03-01
In this paper, we investigate the effect of the error criterion used during the training phase of an artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected by Interstitial Lung Diseases (ILD). The mean square error (MSE) and cross-entropy (CE) criteria are chosen, being the most popular choices in state-of-the-art implementations. The classification experiment was performed on six ILD patterns from the MedGIFT database: consolidation, emphysema, ground glass opacity, micronodules, fibrosis, and healthy tissue. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filters. Two neural networks are trained with the scaled conjugate gradient backpropagation algorithm, using the MSE and CE error criteria, respectively, for weight updates. Performance is evaluated in terms of the average accuracy of these classifiers using 4-fold cross-validation. Each network is trained five times for each fold with randomly initialized weight vectors, and the accuracies are computed. A significant improvement in classification accuracy is observed when the ANN is trained using CE (67.27%) as the error function compared to MSE (63.60%). Moreover, the standard deviation of the classification accuracy for the network trained with the CE criterion (6.69) is lower than that for the network trained with the MSE criterion (10.32).
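The difference between the two training criteria can be made concrete with a small PyTorch snippet. The network size, feature count, and data below are placeholders rather than the Gabor-feature setup of the paper.

```python
import torch
import torch.nn as nn

n_features, n_classes = 48, 6            # e.g. texture features and six ILD patterns (illustrative sizes)
net = nn.Sequential(nn.Linear(n_features, 32), nn.Tanh(), nn.Linear(32, n_classes))

x = torch.randn(16, n_features)
y = torch.randint(0, n_classes, (16,))

# Cross-entropy operates on integer class targets (via an internal log-softmax) ...
ce_loss = nn.CrossEntropyLoss()(net(x), y)

# ... while MSE needs one-hot targets and treats the outputs as plain regression values.
y_onehot = nn.functional.one_hot(y, n_classes).float()
mse_loss = nn.MSELoss()(torch.softmax(net(x), dim=1), y_onehot)
```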
Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso
2017-03-15
Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.
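A minimal sketch of the voxel-based neighborhood idea: points are binned into a regular grid and eigenvalue-based shape features (linearity, planarity, scattering) are computed per voxel. The voxel size and the exact feature definitions are illustrative assumptions, not the paper's five feature-vector variants.

```python
import numpy as np

def voxel_pca_features(points, voxel_size=0.5):
    """Group 3D points into a regular voxel grid and compute PCA-based shape features per voxel."""
    keys = np.floor(points / voxel_size).astype(int)
    features = {}
    for key in map(tuple, np.unique(keys, axis=0)):
        pts = points[np.all(keys == key, axis=1)]
        if len(pts) < 3:
            continue
        evals = np.sort(np.linalg.eigvalsh(np.cov(pts.T)))[::-1]   # l1 >= l2 >= l3
        l1, l2, l3 = evals / evals.sum()
        features[key] = np.array([l1 - l2, l2 - l3, l3])           # linearity, planarity, scattering
    return features

cloud = np.random.default_rng(0).random((5000, 3)) * 5.0           # toy point cloud
feats = voxel_pca_features(cloud)                                  # one feature vector per voxel
```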
Hussain, Shaista; Basu, Arindam
2016-01-01
The development of power-efficient neuromorphic devices presents the challenge of designing spike pattern classification algorithms which can be implemented on low-precision hardware and can also achieve state-of-the-art performance. In our pursuit of meeting this challenge, we present a pattern classification model which uses a sparse connection matrix and exploits the mechanism of nonlinear dendritic processing to achieve high classification accuracy. A rate-based structural learning rule for multiclass classification is proposed which modifies a connectivity matrix of binary synaptic connections by choosing the best “k” out of “d” inputs to make connections on every dendritic branch (k < < d). Because learning only modifies connectivity, the model is well suited for implementation in neuromorphic systems using address-event representation (AER). We develop an ensemble method which combines several dendritic classifiers to achieve enhanced generalization over individual classifiers. We have two major findings: (1) Our results demonstrate that an ensemble created with classifiers comprising moderate number of dendrites performs better than both ensembles of perceptrons and of complex dendritic trees. (2) In order to determine the moderate number of dendrites required for a specific classification problem, a two-step solution is proposed. First, an adaptive approach is proposed which scales the relative size of the dendritic trees of neurons for each class. It works by progressively adding dendrites with fixed number of synapses to the network, thereby allocating synaptic resources as per the complexity of the given problem. As a second step, theoretical capacity calculations are used to convert each neuronal dendritic tree to its optimal topology where dendrites of each class are assigned different number of synapses. The performance of the model is evaluated on classification of handwritten digits from the benchmark MNIST dataset and compared with other spike classifiers. We show that our system can achieve classification accuracy within 1 − 2% of other reported spike-based classifiers while using much less synaptic resources (only 7%) compared to that used by other methods. Further, an ensemble classifier created with adaptively learned sizes can attain accuracy of 96.4% which is at par with the best reported performance of spike-based classifiers. Moreover, the proposed method achieves this by using about 20% of the synapses used by other spike algorithms. We also present results of applying our algorithm to classify the MNIST-DVS dataset collected from a real spike-based image sensor and show results comparable to the best reported ones (88.1% accuracy). For VLSI implementations, we show that the reduced synaptic memory can save upto 4X area compared to conventional crossbar topologies. Finally, we also present a biologically realistic spike-based version for calculating the correlations required by the structural learning rule and demonstrate the correspondence between the rate-based and spike-based methods of learning. PMID:27065782
Ensemble Sparse Classification of Alzheimer’s Disease
Liu, Manhua; Zhang, Daoqiang; Shen, Dinggang
2012-01-01
High-dimensional pattern classification methods, e.g., support vector machines (SVM), have been widely investigated for analysis of structural and functional brain images (such as magnetic resonance imaging (MRI)) to assist the diagnosis of Alzheimer's disease (AD) including its prodromal stage, i.e., mild cognitive impairment (MCI). Most existing classification methods extract features from neuroimaging data and then construct a single classifier to perform classification. However, due to noise and the small sample size of neuroimaging data, it is challenging to train a single global classifier that is robust enough to achieve good classification performance. In this paper, instead of building a single global classifier, we propose a local patch-based subspace ensemble method which builds multiple individual classifiers based on different subsets of local patches and then combines them for more accurate and robust classification. Specifically, to capture the local spatial consistency, each brain image is partitioned into a number of local patches and a subset of patches is randomly selected from the patch pool to build a weak classifier. Here, the sparse representation-based classification (SRC) method, which has been shown to be effective for classification of image data (e.g., faces), is used to construct each weak classifier. Then, multiple weak classifiers are combined to make the final decision. We evaluate our method on 652 subjects (including 198 AD patients, 225 MCI and 229 normal controls) from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database using MR images. The experimental results show that our method achieves an accuracy of 90.8% and an area under the ROC curve (AUC) of 94.86% for AD classification and an accuracy of 87.85% and an AUC of 92.90% for MCI classification, respectively, demonstrating a very promising performance of our method compared with the state-of-the-art methods for AD/MCI classification using MR images. PMID:22270352
Burlina, Philippe; Billings, Seth; Joshi, Neil
2017-01-01
Objective To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Methods Eighty subjects comprised of 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects were included in this study, where 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and “engineered” features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. Results The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). Conclusions This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification. PMID:28854220
Burlina, Philippe; Billings, Seth; Joshi, Neil; Albayda, Jemima
2017-01-01
To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Eighty subjects comprised of 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects were included in this study, where 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and "engineered" features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification.
Bromuri, Stefano; Zufferey, Damien; Hennebert, Jean; Schumacher, Michael
2014-10-01
This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series. We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision. Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches. The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density. Copyright © 2014 Elsevier Inc. All rights reserved.
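The bag-of-words quantization mentioned above can be sketched as follows: sliding windows of each multivariate series are clustered into a codebook, and each record becomes a histogram of codeword counts. Window length, codebook size, and k-means as the quantizer are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def bow_features(series_list, window=8, n_words=32):
    """Quantize sliding windows of each (time x variables) series into a bag-of-words histogram."""
    def windows(s):
        return np.array([s[i:i + window].ravel() for i in range(len(s) - window + 1)])

    codebook = KMeans(n_clusters=n_words, n_init=10, random_state=0)
    codebook.fit(np.vstack([windows(s) for s in series_list]))

    bags = []
    for s in series_list:
        words = codebook.predict(windows(s))
        bags.append(np.bincount(words, minlength=n_words) / len(words))
    return np.array(bags)                 # one histogram ("document") per patient series

series_list = [np.random.default_rng(i).random((100, 3)) for i in range(20)]   # toy records
X = bow_features(series_list)             # feed to a multi-label classifier
```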
Genetic programming and serial processing for time series classification.
Alfaro-Cid, Eva; Sharman, Ken; Esparcia-Alcázar, Anna I
2014-01-01
This work describes an approach devised by the authors for time series classification. In our approach, genetic programming is used in combination with serial processing of data, where the last output is the result of the classification. The use of genetic programming for classification, although still a field where more research is needed, is not new. However, the application of genetic programming to classification tasks is normally done by considering the input data as a feature vector. That is, to the best of our knowledge, there are no examples in the genetic programming literature of approaches where the time series data are processed serially and the last output is considered as the classification result. The serial processing approach presented here fills a gap in the existing literature. This approach was tested on three different problems. Two of them are real-world problems whose data were gathered for online or conference competitions. As there are published results for these two problems, this gives us the chance to compare the performance of our approach against top-performing methods. The serial processing of data in combination with genetic programming obtained competitive results in both competitions, showing its potential for solving time series classification problems. The main advantage of our serial processing approach is that it can easily handle very large datasets.
Automated classification of cell morphology by coherence-controlled holographic microscopy
NASA Astrophysics Data System (ADS)
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity.
A hybrid clustering and classification approach for predicting crash injury severity on rural roads.
Hasheminejad, Seyed Hessam-Allah; Zahedi, Mohsen; Hasheminejad, Seyed Mohammad Hossein
2018-03-01
As a threat to transportation systems, traffic crashes have a wide range of social consequences for governments. Traffic crashes are increasing in developing countries, and Iran, as a developing country, is not immune from this risk. There are several studies in the literature that predict traffic crash severity based on artificial neural networks (ANNs), support vector machines and decision trees. This paper investigates the crash injury severity of rural roads by using a hybrid clustering and classification approach to compare the performance of classification algorithms before and after applying the clustering. In this paper, a novel rule-based genetic algorithm (GA) is proposed to predict crash injury severity, which is evaluated by performance criteria in comparison with classification algorithms such as ANN. The results obtained from analysis of 13,673 crashes (5600 property damage, 778 fatal crashes, 4690 slight injuries and 2605 severe injuries) on rural roads in Tehran Province of Iran during 2011-2013 revealed that the proposed GA method outperforms other classification algorithms based on classification metrics like precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method has the highest level of interpretability, is easy to understand, and provides feedback to analysts.
Cell dynamic morphology classification using deep convolutional neural networks.
Li, Heng; Pang, Fengqian; Shi, Yonggang; Liu, Zhiwen
2018-05-15
Cell morphology is often used as a proxy measurement of cell status to understand cell physiology. Hence, interpretation of cell dynamic morphology is a meaningful task in biomedical research. Inspired by the recent success of deep learning, we here explore the application of convolutional neural networks (CNNs) to cell dynamic morphology classification. An innovative strategy for the implementation of CNNs is introduced in this study. Mouse lymphocytes were collected to observe the dynamic morphology, and two datasets were set up to investigate the performance of CNNs. To ease the implementation of deep learning, the classification problem was simplified from video data to image data, and was then solved by CNNs in a self-taught manner with the generated image data. CNNs were separately applied in three implementation scenarios and compared with existing methods. Experimental results demonstrated the potential of CNNs in cell dynamic morphology classification and validated the effectiveness of the proposed strategy. CNNs were successfully applied to the classification problem and outperformed the existing methods in classification accuracy. For the implementation of CNNs, transfer learning proved to be a promising scheme. © 2018 International Society for Advancement of Cytometry.
Automated classification of cell morphology by coherence-controlled holographic microscopy.
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Yaghoobi, Mohammad; Padol, Sara; Yuan, Yuhong; Hunt, Richard H
2010-05-01
The results of clinical trials with proton pump inhibitors (PPIs) are usually based on the Hetzel-Dent (HD), Savary-Miller (SM), or Los Angeles (LA) classifications to describe the severity and assess the healing of erosive oesophagitis. However, it is not known whether these classifications are comparable. The aim of this study was to review systematically the literature to compare the healing rates of erosive oesophagitis with PPIs in clinical trials assessed by the HD, SM, or LA classifications. A recursive, English language literature search in PubMed and Cochrane databases to December 2006 was performed. Double-blind randomized control trials comparing a PPI with another PPI, an H2-RA or placebo using endoscopic assessment of the healing of oesophagitis by the HD, SM or LA, or their modified classifications at 4 or 8 weeks, were included in the study. The healing rates on treatment with the same PPI(s), and same endoscopic grade(s) were pooled and compared between different classifications using Fisher's exact test or chi2 test where appropriate. Forty-seven studies from 965 potential citations met inclusion criteria. Seventy-eight PPI arms were identified, with 27 using HD, 29 using SM, and 22 using LA for five marketed PPIs. There was insufficient data for rabeprazole and esomeprazole (week 4 only) to compare because they were evaluated by only one classification. When data from all PPIs were pooled, regardless of baseline oesophagitis grades, the LA healing rate was significantly higher than SM and HD at both 4 and 8 weeks (74, 71, and 68% at 4 weeks and 89, 84, and 83% at 8 weeks, respectively). The distribution of different grades in study population was available only for pantoprazole where it was not significantly different between LA and SM subgroups. When analyzing data for PPI and dose, the LA classification showed a higher healing rate for omeprazole 20 mg/day and pantoprazole 40 mg/day (significant at 8 weeks), whereas healing by SM classification was significantly higher for omeprazole 40 mg/day (no data for LA) and lansoprazole 30 mg/day at 4 and 8 weeks. The healing rate by individual oesophagitis grade was not always available or robust enough for meaningful analysis. However, a difference between classifications remained. There is a significant, but not always consistent, difference in oesophagitis healing rates with the same PPI(s) reported by the LA, SM, or HD classifications. The possible difference between grading classifications should be considered when interpreting or comparing healing rates for oesophagitis from different studies.
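The statistical comparison of pooled healing rates can be reproduced in a few lines with SciPy. The counts below are hypothetical, chosen only to mirror the reported 89% versus 84% eight-week rates, not the review's actual data.

```python
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical pooled (healed, not healed) counts at 8 weeks for two grading classifications.
la_arm = [890, 110]      # ~89% healed under the LA classification
sm_arm = [840, 160]      # ~84% healed under the SM classification

table = [la_arm, sm_arm]
chi2, p_chi2, dof, _ = chi2_contingency(table)
odds_ratio, p_fisher = fisher_exact(table)
print(f"chi-square p = {p_chi2:.4f}, Fisher exact p = {p_fisher:.4f}")
```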
Nursing interventions for rehabilitation in Parkinson's disease: cross mapping of terms
Tosin, Michelle Hyczy de Siqueira; Campos, Débora Moraes; de Andrade, Leonardo Tadeu; de Oliveira, Beatriz Guitton Renaud Baptista; Santana, Rosimere Ferreira
2016-01-01
Objective: to perform a cross mapping of terms of the nursing language in patient records with the Nursing Interventions Classification system, in rehabilitation patients with Parkinson's disease. Method: a documentary research study to perform cross mapping. A probabilistic, simple random sample composed of 67 records of patients with Parkinson's disease who participated in a rehabilitation program between March 2009 and April 2013. The research was conducted in three stages, in which the nursing terms were mapped to natural language and crossed with the Nursing Interventions Classification. Results: a total of 1,077 standard interventions that, after crossing with the taxonomy and refinement performed by the experts, resulted in 32 interventions equivalent to the Nursing Interventions Classification (NIC) system. The NIC interventions "Education: the process of the disease", "Contract with the patient", and "Facilitation of Learning" were present in 100% of the records. For these interventions, 40 activities were described, representing 13 activities per intervention. Conclusion: the cross mapping allowed for the identification of terms corresponding to the nursing interventions used every day in rehabilitation nursing, and compared them to the Nursing Interventions Classification. PMID:27508903
Integrative analysis of environmental sequences using MEGAN4.
Huson, Daniel H; Mitra, Suparna; Ruscheweyh, Hans-Joachim; Weber, Nico; Schuster, Stephan C
2011-09-01
A major challenge in the analysis of environmental sequences is data integration. The question is how to analyze different types of data in a unified approach, addressing both the taxonomic and functional aspects. To facilitate such analyses, we have substantially extended MEGAN, a widely used taxonomic analysis program. The new program, MEGAN4, provides an integrated approach to the taxonomic and functional analysis of metagenomic, metatranscriptomic, metaproteomic, and rRNA data. While taxonomic analysis is performed based on the NCBI taxonomy, functional analysis is performed using the SEED classification of subsystems and functional roles or the KEGG classification of pathways and enzymes. A number of examples illustrate how such analyses can be performed, and show that one can also import and compare classification results obtained using others' tools. MEGAN4 is freely available for academic purposes, and installers for all three major operating systems can be downloaded from www-ab.informatik.uni-tuebingen.de/software/megan.
2013-01-01
Background Breast cancer is the leading cancer in both incidence and mortality in the female population. For this reason, much research effort has been devoted to developing Computer-Aided Detection (CAD) systems for early detection of breast cancers on mammograms. In this paper, we propose a new and novel dictionary configuration underpinning sparse representation based classification (SRC). The key idea of the proposed algorithm is to improve the sparsity in terms of mass margins for the purpose of improving classification performance in CAD systems. Methods The aim of the proposed SRC framework is to construct separate dictionaries according to the types of mass margins. The underlying idea behind our method is that the separated dictionaries can enhance the sparsity of the mass class (true positive), leading to an improved performance for differentiating mammographic masses from normal tissues (false positive). When a mass sample is given for classification, the sparse solutions based on the corresponding dictionaries are separately solved and combined at score level. Experiments have been performed on both the Digital Database for Screening Mammography (DDSM) and a clinical Full-Field Digital Mammogram (FFDM) database (DB). In our experiments, the sparsity concentration in the true class (SCTC) and the area under the receiver operating characteristic (ROC) curve (AUC) were measured to compare the proposed method with a conventional single-dictionary-based approach. In addition, a support vector machine (SVM) was used to compare our method with a state-of-the-art classifier extensively used for mass classification. Results Compared with the conventional single dictionary configuration, the proposed approach is able to improve the SCTC by up to 13.9% and 23.6% on the DDSM and FFDM DBs, respectively. Moreover, the proposed method is able to improve the AUC by 8.2% and 22.1% on the DDSM and FFDM DBs, respectively. Compared to the SVM classifier, the proposed method improves the AUC by 2.9% and 11.6% on the DDSM and FFDM DBs, respectively. Conclusions The proposed dictionary configuration is found to improve the sparsity of the dictionaries well, resulting in enhanced classification performance. Moreover, the results show that the proposed method is better than the conventional SVM classifier at classifying breast masses with various margins from normal tissues. PMID:24564973
Identification of terrain cover using the optimum polarimetric classifier
NASA Technical Reports Server (NTRS)
Kong, J. A.; Swartz, A. A.; Yueh, H. A.; Novak, L. M.; Shin, R. T.
1988-01-01
A systematic approach for the identification of terrain media such as vegetation canopy, forest, and snow-covered fields is developed using the optimum polarimetric classifier. The covariance matrices for various terrain cover are computed from theoretical models of random medium by evaluating the scattering matrix elements. The optimal classification scheme makes use of a quadratic distance measure and is applied to classify a vegetation canopy consisting of both trees and grass. Experimentally measured data are used to validate the classification scheme. Analytical and Monte Carlo simulated classification errors using the fully polarimetric feature vector are compared with classification based on single features which include the phase difference between the VV and HH polarization returns. It is shown that the full polarimetric results are optimal and provide better classification performance than single feature measurements.
A Taxonomy of Introductory Physics Concepts.
NASA Astrophysics Data System (ADS)
Mokaya, Fridah; Savkar, Amit; Valente, Diego
We have designed and implemented a hierarchical taxonomic classification of physics concepts for our introductory physics for engineers course sequence taught at the University of Connecticut. This classification can be used to provide a mechanism to measure student progress in learning at the level of individual concepts or clusters of concepts, and also as part of a tool to measure effectiveness of teaching pedagogy. We examine our pre- and post-test FCI results broken down by topics using Hestenes et al.'s taxonomy classification for the FCI, and compare these results with those found using our own taxonomy classification. In addition, we expand this taxonomic classification to measure performance in our other course exams, investigating possible correlations in results achieved across different assessments at the individual topic level. UCONN CLAS (College of Liberal Arts and Science).
Principles for classification of work load for women
NASA Technical Reports Server (NTRS)
Navakatikyan, A. O.; Okhrimenko, A. P.; Karakashyan, A. N.; Buzunov, V. A.
1980-01-01
In an attempt to develop guidelines for classification by degree of intensity of various kinds of physical work performed by women, the effects of different work loads on women as compared to men were studied under industrial and experimental conditions, including response of the cardiovascular and respiratory systems to specified physical exercises of increasing intensity. Physiological criteria for assessing female labor in terms of intensity are proposed.
Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer
2018-01-01
This contribution presents a novel methodology for myoelectric control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. The method is tested using data from the publicly released NinaPro database, which consists of sEMG recordings of 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is carried out via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.
The use of Landsat data to inventory cotton and soybean acreage in North Alabama
NASA Technical Reports Server (NTRS)
Downs, S. W., Jr.; Faust, N. L.
1980-01-01
This study was performed to determine if Landsat data could be used to improve the accuracy of the estimation of cotton acreage. A linear classification algorithm and a maximum likelihood algorithm were used for computer classification of the area, and the classification was compared with ground truth. The classification accuracy for some fields was greater than 90 percent; however, the overall accuracy was 71 percent for cotton and 56 percent for soybeans. The results of this research indicate that computer analysis of Landsat data has potential for improving upon the methods presently being used to determine cotton acreage; however, additional experiments and refinements are needed before the method can be used operationally.
Manifold regularized multitask learning for semi-supervised multilabel image classification.
Luo, Yong; Tao, Dacheng; Geng, Bo; Xu, Chao; Maybank, Stephen J
2013-02-01
It is a significant challenge to classify images with multiple labels by using only a small number of labeled samples. One option is to learn a binary classifier for each label and use manifold regularization to improve the classification performance by exploring the underlying geometric structure of the data distribution. However, such an approach does not perform well in practice when images from multiple concepts are represented by high-dimensional visual features. Thus, manifold regularization is insufficient to control the model complexity. In this paper, we propose a manifold regularized multitask learning (MRMTL) algorithm. MRMTL learns a discriminative subspace shared by multiple classification tasks by exploiting the common structure of these tasks. It effectively controls the model complexity because different tasks limit one another's search volume, and the manifold regularization ensures that the functions in the shared hypothesis space are smooth along the data manifold. We conduct extensive experiments, on the PASCAL VOC'07 dataset with 20 classes and the MIR dataset with 38 classes, by comparing MRMTL with popular image classification algorithms. The results suggest that MRMTL is effective for image classification.
Shin, Jaeyoung; Müller, Klaus-R; Hwang, Han-Jeong
2016-01-01
We propose a near-infrared spectroscopy (NIRS)-based brain-computer interface (BCI) that can be operated in the eyes-closed (EC) state. To evaluate the feasibility of NIRS-based EC BCIs, we compared the performance of an eyes-open (EO) BCI paradigm and an EC BCI paradigm with respect to hemodynamic response and classification accuracy. To this end, subjects performed either mental arithmetic or imagined vocalization of the English alphabet as a baseline task with very low cognitive loading. The performances of two linear classifiers were compared, showing an advantage for shrinkage linear discriminant analysis (LDA). The classification accuracy of the EC paradigm (75.6 ± 7.3%) was observed to be lower than that of the EO paradigm (77.0 ± 9.2%), but the difference was statistically insignificant (p = 0.5698). Subjects reported that they found it more comfortable (p = 0.057) and easier (p < 0.05) to perform the EC BCI tasks. The difference in task difficulty may be a cause of the slightly lower classification accuracy of the EC data. From the analysis results, we could confirm the feasibility of NIRS-based EC BCIs, which can be a BCI option that may ultimately be of use for patients who cannot keep their eyes open consistently. PMID:27824089
Shin, Jaeyoung; Müller, Klaus-R; Hwang, Han-Jeong
2016-11-08
We propose a near-infrared spectroscopy (NIRS)-based brain-computer interface (BCI) that can be operated in the eyes-closed (EC) state. To evaluate the feasibility of NIRS-based EC BCIs, we compared the performance of an eyes-open (EO) BCI paradigm and an EC BCI paradigm with respect to hemodynamic response and classification accuracy. To this end, subjects performed either mental arithmetic or imagined vocalization of the English alphabet as a baseline task with very low cognitive loading. The performances of two linear classifiers were compared, showing an advantage for shrinkage linear discriminant analysis (LDA). The classification accuracy of the EC paradigm (75.6 ± 7.3%) was observed to be lower than that of the EO paradigm (77.0 ± 9.2%), but the difference was statistically insignificant (p = 0.5698). Subjects reported that they found it more comfortable (p = 0.057) and easier (p < 0.05) to perform the EC BCI tasks. The difference in task difficulty may be a cause of the slightly lower classification accuracy of the EC data. From the analysis results, we could confirm the feasibility of NIRS-based EC BCIs, which can be a BCI option that may ultimately be of use for patients who cannot keep their eyes open consistently.
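A shrinkage-regularized LDA of the kind favored here is available directly in scikit-learn. The feature matrix below is synthetic, and the trial and feature counts are placeholders, not the study's NIRS features.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Illustrative NIRS feature matrix: trials x features (e.g. mean HbO/HbR changes per channel).
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 40))
y = rng.integers(0, 2, 60)               # mental arithmetic vs. baseline task (toy labels)

# Ledoit-Wolf shrinkage of the covariance estimate; requires the 'lsqr' or 'eigen' solver.
slda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
print(cross_val_score(slda, X, y, cv=5).mean())
```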
Sharland, Michael J; Waring, Stephen C; Johnson, Brian P; Taran, Allise M; Rusin, Travis A; Pattock, Andrew M; Palcher, Jeanette A
2018-01-01
Assessing test performance validity is a standard clinical practice and although studies have examined the utility of cognitive/memory measures, few have examined attention measures as indicators of performance validity beyond the Reliable Digit Span. The current study further investigates the classification probability of embedded Performance Validity Tests (PVTs) within the Brief Test of Attention (BTA) and the Conners' Continuous Performance Test (CPT-II), in a large clinical sample. This was a retrospective study of 615 patients consecutively referred for comprehensive outpatient neuropsychological evaluation. Non-credible performance was defined two ways: failure on one or more PVTs and failure on two or more PVTs. Classification probability of the BTA and CPT-II into non-credible groups was assessed. Sensitivity, specificity, positive predictive value, and negative predictive value were derived to identify clinically relevant cut-off scores. When using failure on two or more PVTs as the indicator for non-credible responding compared to failure on one or more PVTs, highest classification probability, or area under the curve (AUC), was achieved by the BTA (AUC = .87 vs. .79). CPT-II Omission, Commission, and Total Errors exhibited higher classification probability as well. Overall, these findings corroborate previous findings, extending them to a large clinical sample. BTA and CPT-II are useful embedded performance validity indicators within a clinical battery but should not be used in isolation without other performance validity indicators.
Comparisons and Selections of Features and Classifiers for Short Text Classification
NASA Astrophysics Data System (ADS)
Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi
2017-10-01
Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.
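The classifier comparison described above can be set up concisely with scikit-learn pipelines. The toy corpus and labels below are illustrative stand-ins for the stock-exchange announcements, and only the tf-idf feature variant is shown.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Toy corpus standing in for short announcement texts; labels are illustrative.
texts = ["quarterly earnings announcement", "board member resignation",
         "dividend payout declared", "share repurchase plan", "new factory investment"] * 20
labels = [0, 1, 0, 0, 1] * 20

classifiers = {"NB": MultinomialNB(), "LR": LogisticRegression(max_iter=1000),
               "SVM": LinearSVC(), "kNN": KNeighborsClassifier(), "DT": DecisionTreeClassifier()}

for name, clf in classifiers.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)       # tf-idf features feeding each classifier
    print(name, cross_val_score(pipe, texts, labels, cv=5).mean())
```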
Ghorai, Santanu; Mukherjee, Anirban; Dutta, Pranab K
2010-06-01
In this brief, we propose multiclass data classification by computationally inexpensive discriminant analysis through vector-valued regularized kernel function approximation (VVRKFA). VVRKFA, an extension of fast regularized kernel function approximation (FRKFA), provides the vector-valued response in a single step. VVRKFA finds a linear operator and a bias vector by using a reduced kernel that maps a pattern from the feature space into a low-dimensional label space. The classification of patterns is carried out in this low-dimensional label subspace. A test pattern is classified depending on its proximity to the class centroids. The effectiveness of the proposed method is experimentally verified and compared with multiclass support vector machines (SVM) on several benchmark data sets as well as on gene microarray data for multi-category cancer classification. The results indicate a significant improvement in both training and testing time compared to that of multiclass SVM, with comparable testing accuracy, principally on large data sets. Experiments in this brief also serve as a comparison of the performance of VVRKFA with stratified random sampling and sub-sampling.
Multilabel user classification using the community structure of online networks
Rizos, Georgios; Papadopoulos, Symeon; Kompatsiaris, Yiannis
2017-01-01
We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user’s graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score. PMID:28278242
Multilabel user classification using the community structure of online networks.
Rizos, Georgios; Papadopoulos, Symeon; Kompatsiaris, Yiannis
2017-01-01
We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user's graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score.
Younghak Shin; Balasingham, Ilangko
2017-07-01
Colonoscopy is a standard method for screening polyps by highly trained physicians. Polyps missed during colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and a support vector machine (SVM) is adopted for classification. For the CNN approach, a deep learning framework with three convolution and pooling layers is used for classification. The proposed framework is evaluated using three public polyp databases. From the experimental results, we show that the CNN-based deep learning framework achieves better classification performance than the hand-crafted feature based method. It achieves over 90% classification accuracy, sensitivity, specificity and precision.
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification, and on comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size for the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. Of the four feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935. Among all combinations of Sentinel-1A-derived features, the combination of all four feature types gave the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data were added to Hyperion images, the overall accuracy and kappa coefficient increased by 4.01% and 0.0519, respectively.
Korczowski, L; Congedo, M; Jutten, C
2015-08-01
The classification of electroencephalographic (EEG) data recorded from multiple users simultaneously is an important challenge in the field of Brain-Computer Interface (BCI). In this paper we compare different approaches for classification of single-trial Event-Related Potentials (ERPs) from two subjects playing a collaborative BCI game. The minimum distance to mean (MDM) classifier in a Riemannian framework is extended to use the diversity of the inter-subject spatio-temporal statistics (MDM-hyper) or to merge multiple classifiers (MDM-multi). We show that both of these classifiers significantly outperform the mean performance of the two users as well as analogous classifiers based on step-wise linear discriminant analysis. More importantly, the MDM-multi outperforms the best player within the pair.
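A minimal sketch of the core minimum-distance-to-mean (MDM) idea in a Riemannian framework is given below: each class is summarized by a mean covariance matrix, and a new trial is assigned to the class whose mean is closest under the affine-invariant distance. The log-Euclidean class mean is used here as a simpler stand-in for the Riemannian (Karcher) mean, and the simulated two-class trials are assumptions; the MDM-hyper and MDM-multi extensions are not reproduced.

```python
import numpy as np
from scipy.linalg import sqrtm, logm, expm

def riemann_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B."""
    Ais = np.real(np.linalg.inv(sqrtm(A)))
    return np.linalg.norm(np.real(logm(Ais @ B @ Ais)), "fro")

def log_euclidean_mean(covs):
    """Simple surrogate for the Riemannian mean of a set of SPD matrices."""
    return np.real(expm(np.mean([logm(C) for C in covs], axis=0)))

def trial_covariance(x, reg=1e-6):
    return x @ x.T / x.shape[1] + reg * np.eye(x.shape[0])

rng = np.random.default_rng(0)
trials = {0: [rng.normal(size=(8, 128)) for _ in range(40)],
          1: [rng.normal(scale=1.5, size=(8, 128)) for _ in range(40)]}
means = {c: log_euclidean_mean([trial_covariance(x) for x in xs]) for c, xs in trials.items()}

test = rng.normal(scale=1.5, size=(8, 128))                     # unseen class-1-like trial
C = trial_covariance(test)
pred = min(means, key=lambda c: riemann_distance(C, means[c]))  # nearest class mean wins
print("predicted class:", pred)
```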
NASA Astrophysics Data System (ADS)
de Oliveira, Helder C. R.; Moraes, Diego R.; Reche, Gustavo A.; Borges, Lucas R.; Catani, Juliana H.; de Barros, Nestor; Melo, Carlos F. E.; Gonzaga, Adilson; Vieira, Marcelo A. C.
2017-03-01
This paper presents a new local micro-pattern texture descriptor for the detection of Architectural Distortion (AD) in digital mammography images. AD is a subtle contraction of breast parenchyma that may represent an early sign of breast cancer. Due to its subtlety and variability, AD is more difficult to detect than microcalcifications and masses, and is commonly found in retrospective evaluations of false-negative mammograms. Several computer-based systems have been proposed for automatic detection of AD, but their performance is still unsatisfactory. The proposed descriptor, Local Mapped Pattern (LMP), is a generalization of the Local Binary Pattern (LBP), which is considered one of the most powerful feature descriptors for texture classification in digital images. Compared to LBP, the LMP descriptor captures the minor differences between local image pixels more effectively. Moreover, LMP is a parametric model which can be optimized for the desired application. In our work, the LMP performance was compared to the LBP and four of Haralick's texture descriptors for the classification of 400 regions of interest (ROIs) extracted from clinical mammograms. ROIs were selected and divided into four classes: AD, normal tissue, microcalcifications and masses. Feature vectors were used as input to a multilayer perceptron neural network with a single hidden layer. Results showed that LMP is a good descriptor to distinguish AD from other anomalies in digital mammography. LMP performance was slightly better than the LBP and comparable to Haralick's descriptors (mean classification accuracy = 83%).
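As background for the descriptor comparison, the sketch below computes the classic 8-neighbour Local Binary Pattern codes and their histogram for a region of interest; LMP generalizes this hard thresholding step with a parametric mapping, which is not reproduced here. The random ROI is an illustrative assumption.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbour LBP code for each interior pixel of a 2-D grayscale array."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    codes = np.zeros((h - 2, w - 2), dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes += (neighbour >= center).astype(np.int32) << bit
    return codes

def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes, usable as a texture feature vector."""
    hist = np.bincount(lbp_codes(img).ravel(), minlength=256).astype(float)
    return hist / hist.sum()

roi = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(lbp_histogram(roi)[:8])  # first few bins of the 256-bin texture signature
```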
Bhaduri, Aritra; Banerjee, Amitava; Roy, Subhrajit; Kar, Sougata; Basu, Arindam
2018-03-01
We present a neuromorphic current mode implementation of a spiking neural classifier with lumped square law dendritic nonlinearity. It has been shown previously in software simulations that such a system with binary synapses can be trained with structural plasticity algorithms to achieve comparable classification accuracy with fewer synaptic resources than conventional algorithms. We show that even in real analog systems with manufacturing imperfections (CV of 23.5% and 14.4% for dendritic branch gains and leaks respectively), this network is able to produce comparable results with fewer synaptic resources. The chip, fabricated in [Formula: see text]m complementary metal oxide semiconductor, has eight dendrites per cell and uses two opposing cells per class to cancel common-mode inputs. The chip can operate down to [Formula: see text] V and dissipates 19 nW of static power per neuronal cell and [Formula: see text] 125 pJ/spike. For two-class classification problems of high-dimensional rate-encoded binary patterns, the hardware achieves performance comparable to a software implementation of the same network, with only about a 0.5% reduction in accuracy. On two UCI data sets, the integrated circuit achieves classification accuracy comparable to standard machine learners such as support vector machines and extreme learning machines while using two to five times fewer binary synapses. We also show that the system can operate on mean rate encoded spike patterns, as well as short bursts of spikes. To the best of our knowledge, this is the first attempt in hardware to perform classification exploiting dendritic properties and binary synapses.
Dixon, Roger A.; de Frias, Cindy M.
2014-01-01
Objective Although recent theories of brain and cognitive aging distinguish among normal, exceptional, and impaired groups, further empirical evidence is required. We adapted and applied standard procedures for classifying groups of cognitively impaired (CI) and cognitively normal (CN) older adults to a third classification, cognitively healthy, exceptional, or elite (CE) aging. We then examined concurrent and two-wave longitudinal performance on composite variables of episodic, semantic, and working memory. Method We began with a two-wave source sample from the Victoria Longitudinal Study (VLS) (source n=570; baseline age=53–90 years). The goals were to: (a) apply standard and objective classification procedures to discriminate three cognitive status groups, (b) conduct baseline comparisons of memory performance, (c) develop two-wave status stability and change subgroups, and (d) compare stability subgroup differences in memory performance and change. Results As expected, the CE group performed best on all three memory composites. Similarly, expected status stability effects were observed: (a) the stable CE and CN groups performed memory tasks better than their unstable counterparts and (b) the stable (and chronic) CI group performed worse than its unstable (variable) counterpart. These stability group differences were maintained over two waves. Conclusion New data validate the expectations that (a) objective clinical classification procedures for cognitive impairment can be adapted for detecting cognitively advantaged older adults and (b) performance in three memory systems is predictably related to the tripartite classification. PMID:24742143
Optimal two-phase sampling design for comparing accuracies of two binary classification rules.
Xu, Huiping; Hui, Siu L; Grannis, Shaun
2014-02-10
In this paper, we consider the design for comparing the performance of two binary classification rules, for example, two record linkage algorithms or two screening tests. Statistical methods are well developed for comparing these accuracy measures when the gold standard is available for every unit in the sample, or in a two-phase study when the gold standard is ascertained only in the second phase in a subsample using a fixed sampling scheme. However, these methods do not attempt to optimize the sampling scheme to minimize the variance of the estimators of interest. In comparing the performance of two classification rules, the parameters of primary interest are the difference in sensitivities, specificities, and positive predictive values. We derived the analytic variance formulas for these parameter estimates and used them to obtain the optimal sampling design. The efficiency of the optimal sampling design is evaluated through an empirical investigation that compares the optimal sampling with simple random sampling and with proportional allocation. Results of the empirical study show that the optimal sampling design is similar for estimating the difference in sensitivities and in specificities, and both achieve a substantial amount of variance reduction with an over-sample of subjects with discordant results and under-sample of subjects with concordant results. A heuristic rule is recommended when there is no prior knowledge of individual sensitivities and specificities, or the prevalence of the true positive findings in the study population. The optimal sampling is applied to a real-world example in record linkage to evaluate the difference in classification accuracy of two matching algorithms. Copyright © 2013 John Wiley & Sons, Ltd.
Comparisons of neural networks to standard techniques for image classification and correlation
NASA Technical Reports Server (NTRS)
Paola, Justin D.; Schowengerdt, Robert A.
1994-01-01
Neural network techniques for multispectral image classification and spatial pattern detection are compared to the standard techniques of maximum-likelihood classification and spatial correlation. The neural network produced a more accurate classification than maximum-likelihood of a Landsat scene of Tucson, Arizona. Some of the errors in the maximum-likelihood classification are illustrated using decision region and class probability density plots. As expected, the main drawback to the neural network method is the long time required for the training stage. The network was trained using several different hidden layer sizes to optimize both the classification accuracy and training speed, and it was found that one node per class was optimal. The performance improved when 3x3 local windows of image data were entered into the net. This modification introduces texture into the classification without explicit calculation of a texture measure. Larger windows were successfully used for the detection of spatial features in Landsat and Magellan synthetic aperture radar imagery.
Ensemble methods with simple features for document zone classification
NASA Astrophysics Data System (ADS)
Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing
2012-01-01
Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via features extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classifications of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.
Abou Zeid, Elias; Rezazadeh Sereshkeh, Alborz; Schultz, Benjamin; Chau, Tom
2017-01-01
In recent years, the readiness potential (RP), a type of pre-movement neural activity, has been investigated for asynchronous electroencephalogram (EEG)-based brain-computer interfaces (BCIs). Since the RP is attenuated for involuntary movements, a BCI driven by RP alone could facilitate intentional control amid a plethora of unintentional movements. Previous studies have mainly attempted binary single-trial classification of RP. An RP-based BCI with three or more states would expand the options for functional control. Here, we propose a ternary BCI based on single-trial RPs. This BCI classifies amongst an idle state, a left hand and a right hand self-initiated fine movement. A pipeline of spatio-temporal filtering with per participant parameter optimization was used for feature extraction. The ternary classification was decomposed into binary classifications using a decision-directed acyclic graph (DDAG). For each class pair in the DDAG structure, an ordered diversified classifier system (ODCS-DDAG) was used to select the best among various classification algorithms or to combine the results of different classification algorithms. Using EEG data from 14 participants performing self-initiated left or right key presses, punctuated with rest periods, we compared the performance of ODCS-DDAG to a ternary classifier and four popular multiclass decomposition methods using only a single classification algorithm. ODCS-DDAG had the highest performance (0.769 Cohen's Kappa score) and was significantly better than the ternary classifier and two of the four multiclass decomposition methods. Our work supports further study of RP-based BCI for intuitive asynchronous environmental control or augmentative communication. PMID:28596725
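The decision-directed acyclic graph (DDAG) decomposition mentioned above reduces a ternary problem to pairwise decisions, eliminating one candidate class at each node. The sketch below shows that routing logic with plain logistic-regression pair classifiers on synthetic features; the base learner and data are assumptions, and the paper's ordered diversified classifier system (ODCS) is not reproduced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class ThreeClassDDAG:
    """Ternary classification via pairwise classifiers arranged as a DDAG."""
    def __init__(self):
        self.pairs = {}  # (i, j) with i < j -> classifier trained on classes i vs j

    def fit(self, X, y):
        for i, j in [(0, 1), (0, 2), (1, 2)]:
            mask = np.isin(y, [i, j])
            self.pairs[(i, j)] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
        return self

    def predict_one(self, x):
        remaining = [0, 1, 2]
        while len(remaining) > 1:
            i, j = remaining[0], remaining[-1]
            winner = self.pairs[(i, j)].predict(x.reshape(1, -1))[0]
            remaining.remove(j if winner == i else i)  # eliminate the losing class
        return remaining[0]

    def predict(self, X):
        return np.array([self.predict_one(x) for x in X])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(loc=c, size=(50, 4)) for c in (0.0, 2.0, 4.0)])
    y = np.repeat([0, 1, 2], 50)
    print("training accuracy:", (ThreeClassDDAG().fit(X, y).predict(X) == y).mean())
```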
Classification of brain tumours using short echo time 1H MR spectra
NASA Astrophysics Data System (ADS)
Devos, A.; Lukas, L.; Suykens, J. A. K.; Vanhamme, L.; Tate, A. R.; Howe, F. A.; Majós, C.; Moreno-Torres, A.; van der Graaf, M.; Arús, C.; Van Huffel, S.
2004-09-01
The purpose was to objectively compare the application of several techniques and the use of several input features for brain tumour classification using Magnetic Resonance Spectroscopy (MRS). Short echo time 1H MRS signals from patients with glioblastomas (n = 87), meningiomas (n = 57), metastases (n = 39), and astrocytomas grade II (n = 22) were provided by six centres in the European Union funded INTERPRET project. Linear discriminant analysis, least squares support vector machines (LS-SVM) with a linear kernel and LS-SVM with radial basis function kernel were applied and evaluated over 100 stratified random splittings of the dataset into training and test sets. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of binary classifiers, while the percentage of correct classifications was used to evaluate the multiclass classifiers. The influence of several factors on the classification performance has been tested: L2- vs. water normalization, magnitude vs. real spectra and baseline correction. The effect of input feature reduction was also investigated by using only the selected frequency regions containing the most discriminatory information, and peak integrated values. Using L2-normalized complete spectra the automated binary classifiers reached a mean test AUC of more than 0.95, except for glioblastomas vs. metastases. Similar results were obtained for all classification techniques and input features except for water normalized spectra, where classification performance was lower. This indicates that data acquisition and processing can be simplified for classification purposes, excluding the need for separate water signal acquisition, baseline correction or phasing.
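The evaluation protocol described above (test performance averaged over 100 stratified random splits) can be sketched as follows; an RBF-kernel SVC from scikit-learn stands in for the LS-SVM, and the synthetic two-class "spectra" are an assumption.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for two classes of spectra (60 samples each, 50 features).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 50)), rng.normal(0.4, 1.0, (60, 50))])
y = np.repeat([0, 1], 60)

# 100 stratified random train/test splits; test AUC recorded for each split.
splits = StratifiedShuffleSplit(n_splits=100, test_size=0.3, random_state=0)
aucs = []
for train, test in splits.split(X, y):
    clf = SVC(kernel="rbf").fit(X[train], y[train])
    aucs.append(roc_auc_score(y[test], clf.decision_function(X[test])))
print(f"mean test AUC over 100 splits: {np.mean(aucs):.3f}")
```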
Ahmad, Tariq; Desai, Nihar; Wilson, Francis; Schulte, Phillip; Dunning, Allison; Jacoby, Daniel; Allen, Larry; Fiuzat, Mona; Rogers, Joseph; Felker, G Michael; O'Connor, Christopher; Patel, Chetan B
2016-01-01
Classification of acute decompensated heart failure (ADHF) is based on subjective criteria that crudely capture disease heterogeneity. Improved phenotyping of the syndrome may help improve therapeutic strategies. Our objective was to derive cluster analysis-based groupings for patients hospitalized with ADHF and to compare their prognostic performance with hemodynamic classifications derived at the bedside. We performed a cluster analysis on baseline clinical variables and PAC measurements of 172 ADHF patients from the ESCAPE trial. Employing regression techniques, we examined associations between clusters and clinically determined hemodynamic profiles (warm/cold/wet/dry). We assessed association with clinical outcomes using Cox proportional hazards models. Likelihood ratio tests were used to compare the prognostic value of cluster data to that of hemodynamic data. We identified four advanced HF clusters: 1) male Caucasians with ischemic cardiomyopathy, multiple comorbidities, and the lowest B-type natriuretic peptide (BNP) levels; 2) females with non-ischemic cardiomyopathy, few comorbidities, and the most favorable hemodynamics; 3) young African American males with non-ischemic cardiomyopathy, the most adverse hemodynamics, and advanced disease; and 4) older Caucasians with ischemic cardiomyopathy, concomitant renal insufficiency, and the highest BNP levels. There was no association between clusters and bedside-derived hemodynamic profiles (p = 0.70). For all adverse clinical outcomes, Cluster 4 had the highest risk and Cluster 2 the lowest. Compared to Cluster 4, Clusters 1-3 had 45-70% lower risk of all-cause mortality. Clusters were significantly associated with clinical outcomes, whereas hemodynamic profiles were not. By clustering patients with similar objective variables, we identified four clinically relevant phenotypes of ADHF patients, with no discernable relationship to hemodynamic profiles, but distinct associations with adverse outcomes. Our analysis suggests that ADHF classification using simultaneous consideration of etiology, comorbid conditions, and biomarker levels may be superior to bedside classifications.
Gaze-independent ERP-BCIs: augmenting performance through location-congruent bimodal stimuli
Thurlings, Marieke E.; Brouwer, Anne-Marie; Van Erp, Jan B. F.; Werkhoven, Peter
2014-01-01
Gaze-independent event-related potential (ERP) based brain-computer interfaces (BCIs) yield relatively low BCI performance and traditionally employ unimodal stimuli. Bimodal ERP-BCIs may increase BCI performance due to multisensory integration or summation in the brain. An additional advantage of bimodal BCIs may be that the user can choose which modality or modalities to attend to. We studied bimodal, visual-tactile, gaze-independent BCIs and investigated whether or not ERP components’ tAUCs and subsequent classification accuracies are increased for (1) bimodal vs. unimodal stimuli; (2) location-congruent vs. location-incongruent bimodal stimuli; and (3) attending to both modalities vs. to either one modality. We observed an enhanced bimodal (compared to unimodal) P300 tAUC, which appeared to be positively affected by location-congruency (p = 0.056) and resulted in higher classification accuracies. Attending either to one or to both modalities of the bimodal location-congruent stimuli resulted in differences between ERP components, but not in classification performance. We conclude that location-congruent bimodal stimuli improve ERP-BCIs, and offer the user the possibility to switch the attended modality without losing performance. PMID:25249947
Plenis, Alina; Olędzka, Ilona; Bączek, Tomasz
2013-05-05
This paper focuses on a comparative study of the column classification system based on quantitative structure-retention relationships (the QSRR method) and column performance in real biomedical analysis. The assay was carried out for the LC separation of moclobemide and its metabolites in human plasma, using a set of 24 stationary phases. The QSRR models established for the studied stationary phases were compared with the column test performance results using two chemometric techniques: principal component analysis (PCA) and hierarchical clustering analysis (HCA). The study confirmed that the stationary phase classes found closely related by the QSRR approach yielded comparable separation for moclobemide and its metabolites. Therefore, the QSRR method could be considered supportive in the selection of a suitable column for biomedical analysis, offering the selection of similar or dissimilar columns with relatively higher certainty. Copyright © 2013 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, J; Nishikawa, R; Reiser, I
Purpose: Segmentation quality can affect quantitative image feature analysis. The objective of this study is to examine the relationship between computed tomography (CT) image quality, segmentation performance, and quantitative image feature analysis. Methods: A total of 90 pathology-proven breast lesions in 87 dedicated breast CT images were considered. An iterative image reconstruction (IIR) algorithm was used to obtain CT images with different quality. With different combinations of 4 variables in the algorithm, this study obtained a total of 28 different qualities of CT images. Two imaging tasks/objectives were considered: 1) segmentation and 2) classification of the lesion as benign or malignant. Twenty-three image features were extracted after segmentation using a semi-automated algorithm and 5 of them were selected via a feature selection technique. Logistic regression was trained and tested using leave-one-out cross-validation and its area under the ROC curve (AUC) was recorded. The standard deviation of a homogeneous portion and the gradient of a parenchymal portion of an example breast were used as estimates of image noise and sharpness. The DICE coefficient was computed using a radiologist's drawing on the lesion. Mean DICE and AUC were used as performance metrics for each of the 28 reconstructions. The relationship between segmentation and classification performance under different reconstructions was compared. Distributions (median, 95% confidence interval) of DICE and AUC for each reconstruction were also compared. Results: Moderate correlation (Pearson's rho = 0.43, p-value = 0.02) between DICE and AUC values was found. However, the variation between DICE and AUC values for each reconstruction increased as the image sharpness increased. There was a combination of IIR parameters that resulted in the best segmentation with the worst classification performance. Conclusion: There are certain images that yield better segmentation or classification performance. The best segmentation result does not necessarily lead to the best classification result. This work has been supported in part by grants from the NIH R21-EB015053. R Nishikawa receives royalties from Hologic, Inc.
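The DICE overlap used above as the segmentation metric is easy to state concretely; the sketch below scores a toy binary segmentation against a reference mask (for example, a radiologist's drawing). The tiny masks are illustrative.

```python
import numpy as np

def dice(seg, ref):
    """DICE coefficient: 2 * |A ∩ B| / (|A| + |B|) for two binary masks."""
    seg, ref = seg.astype(bool), ref.astype(bool)
    inter = np.logical_and(seg, ref).sum()
    return 2.0 * inter / (seg.sum() + ref.sum())

auto = np.zeros((8, 8), dtype=int);   auto[2:6, 2:6] = 1    # automated segmentation
manual = np.zeros((8, 8), dtype=int); manual[3:7, 3:7] = 1  # reference drawing
print(dice(auto, manual))  # 2 * 9 / (16 + 16) = 0.5625
```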
Classification of spontaneous EEG signals in migraine
NASA Astrophysics Data System (ADS)
Bellotti, R.; De Carlo, F.; de Tommaso, M.; Lucente, M.
2007-08-01
We set up a classification system able to detect patients affected by migraine without aura through the analysis of their spontaneous EEG patterns. First, the signals are characterized by means of wavelet-based features, then a supervised neural network is used to classify the multichannel data. For the feature extraction, scale-dependent and scale-independent methods are considered with a variety of wavelet functions. Both approaches provide very high and almost comparable classification performance. A complete separation of the two groups is obtained when the data are plotted in the plane spanned by two suitable neural outputs.
Evaluation of the performance of the reduced local lymph node assay for skin sensitization testing.
Ezendam, Janine; Muller, Andre; Hakkert, Betty C; van Loveren, Henk
2013-06-01
The local lymph node assay (LLNA) is the preferred method for classification of sensitizers within REACH. To reduce the number of mice needed for the identification of sensitizers, the reduced LLNA was proposed, which uses only the high dose group of the LLNA. To evaluate the performance of this method for classification, LLNA data from REACH registrations were used and classification based on all dose groups was compared to classification based on the high dose group. We confirmed previous examinations showing that the reduced LLNA is less sensitive than the full LLNA: it misclassified 3.3% of the sensitizers identified in the LLNA, misclassification occurred in all potency classes, and there was no clear association with irritant properties. It is therefore not possible to predict beforehand which substances might be misclassified. Another limitation of the reduced LLNA is that skin sensitizing potency cannot be assessed. For these reasons, it is not recommended to use the reduced LLNA as a stand-alone assay for skin sensitization testing within REACH. In the future, the reduced LLNA might be of added value in a weight-of-evidence approach to confirm negative results obtained with non-animal approaches. Copyright © 2013 Elsevier Inc. All rights reserved.
Cross-classification of musical and vocal emotions in the auditory cortex.
Paquette, Sébastien; Takerkart, Sylvain; Saget, Shinji; Peretz, Isabelle; Belin, Pascal
2018-05-09
Whether emotions carried by voice and music are processed by the brain using similar mechanisms has long been investigated. Yet neuroimaging studies do not provide a clear picture, mainly due to lack of control over stimuli. Here, we report a functional magnetic resonance imaging (fMRI) study using comparable stimulus material in the voice and music domains-the Montreal Affective Voices and the Musical Emotional Bursts-which include nonverbal short bursts of happiness, fear, sadness, and neutral expressions. We use a multivariate emotion-classification fMRI analysis involving cross-timbre classification as a means of comparing the neural mechanisms involved in processing emotional information in the two domains. We find, for affective stimuli in the violin, clarinet, or voice timbres, that local fMRI patterns in the bilateral auditory cortex and upper premotor regions support above-chance emotion classification when training and testing sets are performed within the same timbre category. More importantly, classifier performance generalized well across timbre in cross-classifying schemes, albeit with a slight accuracy drop when crossing the voice-music boundary, providing evidence for a shared neural code for processing musical and vocal emotions, with possibly a cost for the voice due to its evolutionary significance. © 2018 New York Academy of Sciences.
Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann
2003-01-01
Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
CLASSIFYING MEDICAL IMAGES USING MORPHOLOGICAL APPEARANCE MANIFOLDS.
Varol, Erdem; Gaonkar, Bilwaj; Davatzikos, Christos
2013-12-31
Input features for medical image classification algorithms are extracted from raw images using a series of preprocessing steps. One common preprocessing step in computational neuroanatomy and functional brain mapping is the nonlinear registration of raw images to a common template space. Typically, the registration methods used are parametric and their output varies greatly with changes in parameters. Most results reported previously perform registration using a fixed parameter setting and use the results as input to the subsequent classification step. The variation in registration results due to choice of parameters thus translates to variation of performance of the classifiers that depend on the registration step for input. Analogous issues have been investigated in the computer vision literature, where image appearance varies with pose and illumination, thereby making classification vulnerable to these confounding parameters. The proposed methodology addresses this issue by sampling image appearances as registration parameters vary, and shows that better classification accuracies can be obtained this way, compared to the conventional approach.
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P.; Parsey, Ramin V.; Laine, Andrew F.
2013-01-01
Multimodality classification of Alzheimer’s disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%). PMID:24576927
Mikhno, Arthur; Nuevo, Pablo Martinez; Devanand, Davangere P; Parsey, Ramin V; Laine, Andrew F
2012-01-01
Multimodality classification of Alzheimer's disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), is of interest to the medical community. We improve on prior classification frameworks by incorporating multiple features from MRI and PET data obtained with multiple radioligands, fluorodeoxyglucose (FDG) and Pittsburg compound B (PIB). We also introduce a new MRI feature, invariant shape descriptors based on 3D Zernike moments applied to the hippocampus region. Classification performance is evaluated on data from 17 healthy controls (CTR), 22 MCI, and 17 AD subjects. Zernike significantly outperforms volume, accuracy (Zernike to volume): CTR/AD (90.7% to 71.6%), CTR/MCI (76.2% to 60.0%), MCI/AD (84.3% to 65.5%). Zernike also provides comparable and complementary performance to PET. Optimal accuracy is achieved when Zernike and PET features are combined (accuracy, specificity, sensitivity), CTR/AD (98.8%, 99.5%, 98.1%), CTR/MCI (84.3%, 82.9%, 85.9%) and MCI/AD (93.3%, 93.6%, 93.3%).
Farquhar, J; Hill, N J
2013-04-01
Detecting event related potentials (ERPs) from single trials is critical to the operation of many stimulus-driven brain computer interface (BCI) systems. The low strength of the ERP signal compared to the noise (due to artifacts and BCI-irrelevant brain processes) makes this a challenging signal detection problem. Previous work has tended to focus on how best to detect a single ERP type (such as the visual oddball response). However, the underlying ERP detection problem is essentially the same regardless of stimulus modality (e.g., visual or tactile), ERP component (e.g., P300 oddball response, or the error-potential), measurement system or electrode layout. To investigate whether a single ERP detection method might work for a wider range of ERP BCIs, we compare detection performance over a large corpus of more than 50 ERP BCI datasets whilst systematically varying the electrode montage, spectral filter, spatial filter and classifier training methods. We identify an interesting interaction between spatial whitening and regularised classification which made detection performance independent of the choice of spectral filter low-pass frequency. Our results show that a pipeline consisting of spectral filtering, spatial whitening, and regularised classification gives near maximal performance in all cases. Importantly, this pipeline is simple to implement and completely automatic, with no expert feature selection or parameter tuning required. Thus, we recommend this combination as a "best-practice" method for ERP detection problems.
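The recommended pipeline shape (spatial whitening of the multichannel trials followed by a regularized linear classifier) can be sketched as below. The simulated ERP-like data, the covariance-based whitening estimator, and the ridge penalty value are assumptions, and spectral filtering is omitted, so this is only a sketch of the last two stages.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

def spatial_whitener(X):
    """X: (trials, channels, samples). Returns a channels x channels whitening matrix."""
    C = np.mean([x @ x.T / x.shape[1] for x in X], axis=0)   # average channel covariance
    evals, evecs = np.linalg.eigh(C)
    return evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-12))) @ evecs.T

rng = np.random.default_rng(1)
n_trials, n_ch, n_samp = 200, 8, 50
X = rng.normal(size=(n_trials, n_ch, n_samp))
y = rng.integers(0, 2, n_trials)
X[y == 1, 0, 20:30] += 0.8                 # crude "ERP" on one channel for class 1

W = spatial_whitener(X)                    # estimated on all trials for brevity
Xw = np.einsum("ij,njs->nis", W, X)        # apply the whitening matrix to every trial
Xf = Xw.reshape(n_trials, -1)              # flatten channels x time into one vector

clf = RidgeClassifier(alpha=10.0).fit(Xf[:150], y[:150])   # regularized linear classifier
print("held-out accuracy: %.2f" % clf.score(Xf[150:], y[150:]))
```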
Towards the use of similarity distances to music genre classification: A comparative study.
Goienetxea, Izaro; Martínez-Otzeta, José María; Sierra, Basilio; Mendialdua, Iñigo
2018-01-01
Music genre classification is a challenging research concept, for which open questions remain regarding the classification approach, music piece representation, distances between/within genres, and so on. In this paper an investigation on the classification of generated music pieces is performed, based on the idea that, after grouping closely related known pieces into different sets -or clusters- and then automatically generating a new song which is somehow "inspired" by each set, the new song would be more likely to be classified as belonging to the set which inspired it, using the same distance employed to separate the clusters. Different music piece representations and distances among pieces are used; the obtained results are promising, and indicate the appropriateness of the approach even in such a subjective area as music genre classification.
Towards the use of similarity distances to music genre classification: A comparative study
Martínez-Otzeta, José María; Sierra, Basilio; Mendialdua, Iñigo
2018-01-01
Music genre classification is a challenging research concept, for which open questions remain regarding the classification approach, music piece representation, distances between/within genres, and so on. In this paper an investigation on the classification of generated music pieces is performed, based on the idea that, after grouping closely related known pieces into different sets –or clusters– and then automatically generating a new song which is somehow “inspired” by each set, the new song would be more likely to be classified as belonging to the set which inspired it, using the same distance employed to separate the clusters. Different music piece representations and distances among pieces are used; the obtained results are promising, and indicate the appropriateness of the approach even in such a subjective area as music genre classification. PMID:29444160
Falk, Joakim; Björvell, Catrin
2012-01-01
The Swedish health care system is facing the implementation of standardized language. The first classification of nursing diagnoses translated into Swedish, the NANDA, was released in January 2011. The aim of the present study was to examine whether the usage of the NANDA classification affected nursing students’ choice of nursing interventions. Thirty-three nursing students in a clinical setting were divided into two groups. The intervention group had access to the NANDA classification text book, while the comparison group did not. In total 78 nursing assessments were performed and 218 nursing interventions initiated. The principal findings show that there were no statistically significant differences between the groups regarding the number, quality or category of nursing interventions when using the NANDA classification compared to free-text format nursing diagnoses. PMID:24199065
Chatterjee, Sankhadeep; Dey, Nilanjan; Shi, Fuqian; Ashour, Amira S; Fong, Simon James; Sen, Soumya
2018-04-01
Dengue fever detection and classification have a vital role due to the recent outbreaks of different kinds of dengue fever. Recent advancements in microarray technology can be employed for such a classification process. Several studies have established that the gene selection phase takes a significant role in classifier performance. Subsequently, the current study focused on detecting two different variations, namely, dengue fever (DF) and dengue hemorrhagic fever (DHF). A modified bag-of-features method has been proposed to select the most promising genes in the classification process. Afterward, a modified cuckoo search optimization algorithm has been engaged to support the artificial neural network (ANN-MCS) in classifying the unknown subjects into three different classes, namely DF, DHF, and a third class containing convalescent and normal cases. The proposed method has been compared with three other well-known classifiers, namely, a multilayer perceptron feed-forward network (MLP-FFN), an artificial neural network (ANN) trained with cuckoo search (ANN-CS), and an ANN trained with PSO (ANN-PSO). Experiments have been carried out with different numbers of clusters for the initial bag-of-features-based feature selection phase. After obtaining the reduced dataset, the hybrid ANN-MCS model has been employed for the classification process. The results have been compared in terms of confusion matrix-based performance measuring metrics. The experimental results indicated a highly statistically significant improvement with the proposed classifier over the traditional ANN-CS model.
Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso
2017-01-01
Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood. PMID:28294963
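The per-voxel shape features alluded to above are commonly derived from the eigenvalues of the local covariance of the points. The sketch below computes one standard set of such descriptors (linearity, planarity, scattering) for the points falling in a voxel; these ratios are a common formulation used here for illustration and do not reproduce the paper's five feature-vector definitions.

```python
import numpy as np

def voxel_shape_features(points):
    """points: (N, 3) array of 3-D points assigned to one voxel."""
    cov = np.cov(points.T)
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]        # λ1 >= λ2 >= λ3 >= 0
    l1, l2, l3 = np.maximum(evals, 1e-12)
    return {
        "linearity":  (l1 - l2) / l1,   # high for tubular shapes (poles, trunks)
        "planarity":  (l2 - l3) / l1,   # high for planar shapes (walls, ground)
        "scattering": l3 / l1,          # high for volumetric scatter (vegetation)
    }

rng = np.random.default_rng(0)
wall = rng.normal(size=(500, 3)) * np.array([2.0, 2.0, 0.02])   # thin planar cloud
print({k: round(v, 2) for k, v in voxel_shape_features(wall).items()})
```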
SCOWLP classification: Structural comparison and analysis of protein binding regions
Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa
2008-01-01
Background Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at the atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular, discontinuous and can share a wide range of interacting residues among them. The criteria to define an individual binding region can often be arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for protein families with conflicting binding regions. The hierarchical classification of PBRs is implemented into the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. In addition, 22% of the regions form complexes with more than one different protein family. Conclusion The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at the atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at . PMID:18182098
Classification of footwear outsole patterns using Fourier transform and local interest points.
Richetelli, Nicole; Lee, Mackenzie C; Lasky, Carleen A; Gump, Madison E; Speir, Jacqueline A
2017-06-01
Successful classification of questioned footwear has tremendous evidentiary value; the result can minimize the potential suspect pool and link a suspect to a victim, a crime scene, or even multiple crime scenes to each other. With this in mind, several different automated and semi-automated classification models have been applied to the forensic footwear recognition problem, with superior performance commonly associated with two different approaches: correlation of image power (magnitude) or phase, and the use of local interest points transformed using the Scale Invariant Feature Transform (SIFT) and compared using Random Sample Consensus (RANSAC). Despite the distinction associated with each of these methods, all three have not been cross-compared using a single dataset, of limited quality (i.e., characteristic of crime scene-like imagery), and created using a wide combination of image inputs. To address this question, the research presented here examines the classification performance of the Fourier-Mellin transform (FMT), phase-only correlation (POC), and local interest points (transformed using SIFT and compared using RANSAC), as a function of inputs that include mixed media (blood and dust), transfer mechanisms (gel lifters), enhancement techniques (digital and chemical) and variations in print substrate (ceramic tiles, vinyl tiles and paper). Results indicate that POC outperforms both FMT and SIFT+RANSAC, regardless of image input (type, quality and totality), and that the difference in stochastic dominance detected for POC is significant across all image comparison scenarios evaluated in this study. Copyright © 2017 Elsevier B.V. All rights reserved.
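Phase-only correlation, the best performer in the study above, correlates images using only the phase of their cross-power spectrum. The sketch below applies it to two toy images that differ by a circular shift; real footwear comparisons would add the preprocessing and scoring discussed in the paper, so this is only a sketch of the core transform.

```python
import numpy as np

def phase_only_correlation(f, g, eps=1e-8):
    """POC surface of two equally sized grayscale images."""
    F, G = np.fft.fft2(f), np.fft.fft2(g)
    R = F * np.conj(G)
    R /= np.maximum(np.abs(R), eps)        # keep phase, discard magnitude
    return np.real(np.fft.ifft2(R))

a = np.zeros((64, 64)); a[20:30, 20:30] = 1.0
b = np.roll(np.roll(a, 5, axis=0), 3, axis=1)    # same pattern, circularly shifted
poc = phase_only_correlation(a, b)
peak = np.unravel_index(np.argmax(poc), poc.shape)
# A sharp dominant peak indicates a match; its location encodes the relative shift.
print("peak value %.2f at %s" % (poc.max(), peak))
```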
Kesharaju, Manasa; Nagarajah, Romesh
2015-09-01
The motivation for this research stems from a need to provide a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make possible the checking of each ceramic component and immediately alert the operator about the presence of defects. Generally, in many classification problems the choice of features or dimensionality reduction is significant and simultaneously very difficult, as a substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms is used to optimize the feature subset used in the classification of various defects in reaction-sintered silicon carbide ceramic components. Initially, wavelet based feature extraction is implemented from the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of the optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to improved performance, with a classification accuracy of 96% compared to 94% for the genetic algorithm based selection. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Makra, László; Puskás, János; Matyasovszky, István; Csépe, Zoltán; Lelovics, Enikő; Bálint, Beatrix; Tusnády, Gábor
2015-09-01
Weather classification approaches may be useful tools in modelling the occurrence of respiratory diseases. The aim of the study is to compare the performance of an objectively defined weather classification and the Spatial Synoptic Classification (SSC) in classifying emergency department (ED) visits for acute asthma depending on weather, air pollutant, and airborne pollen variables for Szeged, Hungary, for the 9-year period 1999-2007. The research is performed for three different pollen-related periods of the year and the annual data set. According to age and gender, nine patient categories, eight meteorological variables, seven chemical air pollutants, and two pollen categories were used. In general, partly dry and cold air and partly warm and humid air substantially aggravate the symptoms of asthmatics. Our major findings are consistent with this observation. Namely, for the objectively defined weather types, favourable conditions for asthma ED visits occur when an anticyclonic ridge weather situation occurs with near-extreme temperature and humidity parameters. Accordingly, the SSC weather types facilitate aggravating asthmatic conditions if warm or cool weather occurs with high humidity in both cases. Favourable conditions for asthma attacks are confirmed in the extreme seasons, when atmospheric stability contributes to the enrichment of air pollutants. The total efficiency of the two classification approaches is similar, in spite of the fact that the methodology for deriving the individual types within the two classification approaches is completely different.
A comparative study of nonparametric methods for pattern recognition
NASA Technical Reports Server (NTRS)
Hahn, S. F.; Nelson, G. D.
1972-01-01
The applied research discussed in this report determines and compares the correct classification percentage of the nonparametric sign test, Wilcoxon's signed rank test, and the K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct classification percentage is shown graphically for differences in modes and/or means of the probability density functions for four, eight and sixteen samples. The K-class classifier performed very well with respect to the other classifiers used. Since the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even though it may not be. The K-class classifier has the advantage over the Bayes classifier in that it works well with non-Gaussian data without having to determine the probability density function of the data. It should be noted that the data in this experiment were always unimodal.
A hybrid approach to select features and classify diseases based on medical data
NASA Astrophysics Data System (ADS)
AbdelLatif, Hisham; Luo, Jiawei
2018-03-01
Feature selection is a popular problem in the classification of diseases in clinical medicine. Here, we develop a hybrid methodology to classify diseases based on three medical datasets: the Arrhythmia, Breast cancer, and Hepatitis datasets. This methodology, called k-means ANOVA Support Vector Machine (K-ANOVA-SVM), uses k-means clustering with the ANOVA statistic to preprocess the data and select the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we chose three classification algorithms, decision tree, Naïve Bayes, and Support Vector Machines, and applied the medical datasets directly to these algorithms. Our methodology achieved much better classification accuracy, 98% on the Arrhythmia dataset, 92% on the Breast cancer dataset and 88% on the Hepatitis dataset, than applying the medical data directly to decision tree, Naïve Bayes, and Support Vector Machines. The ROC curve and precision results obtained with K-ANOVA-SVM were also better than those of the other algorithms.
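The "ANOVA-based feature selection followed by an SVM" part of this kind of hybrid can be sketched with standard scikit-learn components as below; the k-means preprocessing step of K-ANOVA-SVM is omitted, the number of kept features is arbitrary, and the built-in breast-cancer dataset merely stands in for the authors' data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Keep the 10 features with the highest ANOVA F-scores, then classify with an RBF SVM.
pipe = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=10),
    SVC(kernel="rbf", C=1.0),
)
X, y = load_breast_cancer(return_X_y=True)
print("5-fold accuracy: %.3f" % cross_val_score(pipe, X, y, cv=5).mean())
```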
An unbalanced spectra classification method based on entropy
NASA Astrophysics Data System (ADS)
Liu, Zhong-bao; Zhao, Wen-juan
2017-05-01
How to distinguish the minority spectra from the majority of the spectra is an important problem in astronomy. In view of this, an unbalanced spectra classification method based on entropy (USCM) is proposed in this paper to deal with the unbalanced spectra classification problem. USCM greatly improves the performance of traditional classifiers in distinguishing the minority spectra, as it takes the data distribution into consideration in the process of classification. However, its time complexity is exponential in the training size, and therefore it can only deal with small- and medium-scale classification problems; solving the large-scale classification problem is thus quite important for USCM. It can be shown by straightforward computation that the dual form of USCM is equivalent to a minimum enclosing ball (MEB) problem, so the core vector machine (CVM) is introduced and USCM based on CVM is proposed to deal with the large-scale classification problem. Several comparative experiments on the 4 subclasses of K-type spectra, 3 subclasses of F-type spectra and 3 subclasses of G-type spectra from the Sloan Digital Sky Survey (SDSS) verify that USCM and USCM based on CVM perform better than kNN (k nearest neighbor) and SVM (support vector machine) in dealing with the problem of rare spectra mining on the small- and medium-scale datasets and the large-scale datasets, respectively.
Kayani, Babar; Konan, Sujith; Pietrzak, Jurek R T; Haddad, Fares S
2018-03-27
The objective of this study was to compare macroscopic bone and soft tissue injury between robotic-arm assisted total knee arthroplasty (RA-TKA) and conventional jig-based total knee arthroplasty (CJ-TKA) and create a validated classification system for reporting iatrogenic bone and periarticular soft tissue injury after TKA. This study included 30 consecutive CJ-TKAs followed by 30 consecutive RA-TKAs performed by a single surgeon. Intraoperative photographs of the femur, tibia, and periarticular soft tissues were taken before implantation of prostheses. Using these outcomes, the macroscopic soft tissue injury (MASTI) classification system was developed to grade iatrogenic bone and soft tissue injuries. Interobserver and intraobserver validity of the proposed classification system was assessed. Patients undergoing RA-TKA had reduced medial soft tissue injury in both passively correctible (P < .05) and noncorrectible varus deformities (P < .05); more pristine femoral (P < .05) and tibial (P < .05) bone resection cuts; and improved MASTI scores compared to CJ-TKA (P < .05). There was high interobserver (intraclass correlation coefficient 0.92 [95% confidence interval: 0.88-0.96], P < .05) and intraobserver agreement (intraclass correlation coefficient 0.94 [95% confidence interval: 0.92-0.97], P < .05) of the proposed MASTI classification system. There is reduced bone and periarticular soft tissue injury in patients undergoing RA-TKA compared to CJ-TKA. The proposed MASTI classification system is a reproducible grading scheme for describing iatrogenic bone and soft tissue injury in TKA. RA-TKA is associated with reduced bone and soft tissue injury compared with conventional jig-based TKA. The proposed MASTI classification may facilitate further research correlating macroscopic soft tissue injury during TKA to long-term clinical and functional outcomes. Copyright © 2018 Elsevier Inc. All rights reserved.
Yuan, Yuan; Lin, Jianzhe; Wang, Qi
2016-12-01
Hyperspectral image (HSI) classification is a crucial issue in remote sensing. Accurate classification benefits a large number of applications such as land use analysis and marine resource utilization. But high data correlation brings difficulty to reliable classification, especially for HSI with abundant spectral information. Furthermore, the traditional methods often fail to well consider the spatial coherency of HSI, which also limits the classification performance. To address these inherent obstacles, a novel spectral-spatial classification scheme is proposed in this paper. The proposed method mainly focuses on multitask joint sparse representation (MJSR) and a stepwise Markov random field framework, which constitute the two main contributions of this work. First, the MJSR not only reduces the spectral redundancy, but also retains the necessary correlation in the spectral field during classification. Second, the stepwise optimization further explores the spatial correlation that significantly enhances the classification accuracy and robustness. As far as several universal quality evaluation indexes are concerned, the experimental results on Indian Pines and Pavia University demonstrate the superiority of our method compared with the state-of-the-art competitors.
E. Freeman; G. Moisen; J. Coulston; B. Wilson
2014-01-01
Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...
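As a generic illustration of such a comparison (not the NLCD workflow itself), the scikit-learn sketch below cross-validates a random forest and a gradient-boosting model on placeholder canopy-cover data; all data, tuning values and the R^2 scoring choice are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X = np.random.rand(500, 8)       # placeholder predictor layers
y = np.random.rand(500) * 100    # placeholder percent canopy cover

models = {
    "RF":  RandomForestRegressor(n_estimators=500, max_features="sqrt", random_state=0),
    "SGB": GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                     subsample=0.5, random_state=0),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.3f}")
```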
Choosing the Most Effective Pattern Classification Model under Learning-Time Constraint.
Saito, Priscila T M; Nakamura, Rodrigo Y M; Amorim, Willian P; Papa, João P; de Rezende, Pedro J; Falcão, Alexandre X
2015-01-01
Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.
Piccinonna, Sara; Ragone, Rosa; Stocchero, Matteo; Del Coco, Laura; De Pascali, Sandra Angelica; Schena, Francesco Paolo; Fanizzi, Francesco Paolo
2016-05-15
Nuclear Magnetic Resonance (NMR) spectroscopy is emerging as a powerful technique in olive oil fingerprinting, but its analytical robustness has to be proved. Here, we report a comparative study between two laboratories on olive oil ¹H NMR fingerprinting, aiming to demonstrate the robustness of NMR-based metabolomics in generating comparable data sets for cultivar classification. Sample preparation and data acquisition were performed independently in two laboratories, equipped with different resolution spectrometers (400 and 500 MHz), using two identical sets of mono-varietal olive oils. Partial Least Squares (PLS)-based techniques were applied to compare the data sets produced by the two laboratories. Despite differences in spectrum baseline, and in intensity and shape of peaks, the amount of shared information was significant (almost 70%) and related to cultivar (same metabolites discriminated between cultivars). In conclusion, regardless of the variability due to operator and machine, the data sets from the two participating units were comparable for the purpose of classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex. PMID:27500640
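The RFE-SVM selection step can be illustrated with scikit-learn as below; a linear SVM also stands in for the final classifier because H-ELM is not available there, and the feature matrix, label coding and parameter values are placeholders rather than the study's configuration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X = np.random.rand(159, 300)        # placeholder cortical features (thickness, area, volume, ...)
y = np.random.randint(0, 3, 159)    # 0 = TDC, 1 = ADHD-I, 2 = ADHD-C

# A linear SVM supplies the weights that RFE prunes recursively (10% of features per step);
# here a linear SVM also serves as the final classifier in place of H-ELM.
clf = make_pipeline(
    StandardScaler(),
    RFE(SVC(kernel="linear"), n_features_to_select=30, step=0.1),
    SVC(kernel="linear"),
)
print(cross_val_score(clf, X, y, cv=5).mean())
```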
Hierarchical trie packet classification algorithm based on expectation-maximization clustering.
Bi, Xia-An; Zhao, Junxia
2017-01-01
With the growth of computer network bandwidth, packet classification algorithms that can handle large-scale rule sets are urgently needed. Among the existing algorithms, research on packet classification algorithms based on the hierarchical trie has become an important branch of packet classification because of their wide practical use. Although the hierarchical trie helps save large amounts of storage space, it has several shortcomings, such as backtracking and empty nodes. This paper proposes a new packet classification algorithm, the Hierarchical Trie Algorithm Based on Expectation-Maximization Clustering (HTEMC). Firstly, this paper uses a formalization method to deal with the packet classification problem by mapping the rules and data packets into a two-dimensional space. Secondly, this paper uses the expectation-maximization algorithm to cluster the rules based on their aggregate characteristics, thereby forming diversified clusters. Thirdly, this paper proposes a hierarchical trie based on the results of the expectation-maximization clustering. Finally, this paper conducts simulation experiments and real-environment experiments to compare the performance of our algorithm with other typical algorithms, and analyzes the results of the experiments. The hierarchical trie structure in our algorithm not only adopts trie path compression to eliminate backtracking, but also solves the problem of low efficiency of trie updates, which greatly improves the performance of the algorithm.
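The clustering stage can be pictured with scikit-learn's GaussianMixture, which fits a mixture model by expectation-maximization; the two-dimensional rule representation below is a stand-in for the paper's mapping of rules into a 2-D space, and the subsequent per-cluster trie construction is not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder rule set: each rule reduced to a point in a 2-D space
# (standing in for the paper's formalized rule representation).
rules_2d = np.random.rand(10000, 2)

# EM clustering groups rules with similar aggregate characteristics;
# one compressed hierarchical trie would then be built per cluster.
gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
cluster_ids = gmm.fit_predict(rules_2d)
for k in range(4):
    print(f"cluster {k}: {(cluster_ids == k).sum()} rules")
```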
Qureshi, Muhammad Naveed Iqbal; Min, Beomjun; Jo, Hang Joon; Lee, Boreom
2016-01-01
The classification of neuroimaging data for the diagnosis of certain brain diseases is one of the main research goals of the neuroscience and clinical communities. In this study, we performed multiclass classification using a hierarchical extreme learning machine (H-ELM) classifier. We compared the performance of this classifier with that of a support vector machine (SVM) and basic extreme learning machine (ELM) for cortical MRI data from attention deficit/hyperactivity disorder (ADHD) patients. We used 159 structural MRI images of children from the publicly available ADHD-200 MRI dataset. The data consisted of three types, namely, typically developing (TDC), ADHD-inattentive (ADHD-I), and ADHD-combined (ADHD-C). We carried out feature selection by using standard SVM-based recursive feature elimination (RFE-SVM) that enabled us to achieve good classification accuracy (60.78%). In this study, we found the RFE-SVM feature selection approach in combination with H-ELM to effectively enable the acquisition of high multiclass classification accuracy rates for structural neuroimaging data. In addition, we found that the most important features for classification were the surface area of the superior frontal lobe, and the cortical thickness, volume, and mean surface area of the whole cortex.
Using statistical text classification to identify health information technology incidents
Chai, Kevin E K; Anthony, Stephen; Coiera, Enrico; Magrabi, Farah
2013-01-01
Objective To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the USA Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. Design We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both ‘balanced’ (50% HIT) and ‘stratified’ (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. Measurements κ statistic, F1 score, precision and recall. Results Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). Conclusions Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation. PMID:23666777
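As a rough sketch of this kind of pipeline (not the authors' exact preprocessing), the snippet below combines TF-IDF word features with an L2-regularized logistic regression in scikit-learn; the report texts and labels are placeholders, and class_weight="balanced" only loosely mimics training on a rebalanced dataset.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder incident narratives and labels (1 = HIT incident, 0 = other device incident).
reports = ["system froze during order entry", "infusion pump battery failed", "lead fracture detected"]
labels = [1, 0, 0]

# TF-IDF word features followed by an L2-regularized logistic regression;
# class_weight="balanced" re-weights the rare positive class instead of resampling.
clf = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(penalty="l2", C=1.0, class_weight="balanced", max_iter=1000),
)
clf.fit(reports, labels)
print(clf.predict(["order entry screen froze"]))
```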
Vishnyakova, Dina; Pasche, Emilie; Ruch, Patrick
2012-01-01
We report on the original integration of an automatic text categorization pipeline, so-called ToxiCat (Toxicogenomic Categorizer), that we developed to perform biomedical documents classification and prioritization in order to speed up the curation of the Comparative Toxicogenomics Database (CTD). The task can be basically described as a binary classification task, where a scoring function is used to rank a selected set of articles. Then components of a question-answering system are used to extract CTD-specific annotations from the ranked list of articles. The ranking function is generated using a Support Vector Machine, which combines three main modules: an information retrieval engine for MEDLINE (EAGLi), a gene normalization service (NormaGene) developed for a previous BioCreative campaign and finally, a set of answering components and entity recognizer for diseases and chemicals. The main components of the pipeline are publicly available both as web application and web services. The specific integration performed for the BioCreative competition is available via a web user interface at http://pingu.unige.ch:8080/Toxicat.
Wang, Shuaiqun; Aorigele; Kong, Wei; Zeng, Weiming; Hong, Xiaomin
2016-01-01
Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features from the large number of gene expression features. Lately, many researchers have devoted themselves to feature selection using diverse computational intelligence methods. However, in the process of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm HICATS incorporating imperialist competition algorithm (ICA) which performs global search and tabu search (TS) that conducts fine-tuned search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works including the conventional version of the binary optimization algorithm in terms of classification accuracy and the number of selected genes.
Aorigele; Zeng, Weiming; Hong, Xiaomin
2016-01-01
Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features from the large number of gene expression features. Lately, many researchers have devoted themselves to feature selection using diverse computational intelligence methods. However, in the process of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm HICATS incorporating imperialist competition algorithm (ICA) which performs global search and tabu search (TS) that conducts fine-tuned search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works including the conventional version of the binary optimization algorithm in terms of classification accuracy and the number of selected genes. PMID:27579323
New Features for Neuron Classification.
Hernández-Pérez, Leonardo A; Delgado-Castillo, Duniel; Martín-Pérez, Rainer; Orozco-Morales, Rubén; Lorenzo-Ginori, Juan V
2018-04-28
This paper addresses the problem of obtaining new neuron features capable of improving the results of neuron classification. Most studies on neuron classification using morphological features have been based on Euclidean geometry. Here, three one-dimensional (1D) time series are instead derived from the three-dimensional (3D) structure of the neuron, and afterwards a spatial time series is constructed from which the features are calculated. Digitally reconstructed neurons were separated into control and pathological sets, which are related to three categories of alterations caused by epilepsy, Alzheimer's disease (long and local projections), and ischemia. These neuron sets were then subjected to supervised classification and the results were compared considering three sets of features: morphological features, features obtained from the time series, and a combination of both. The best results were obtained using features from the time series, which outperformed the classification using only morphological features, showing higher correct classification rates with differences of 5.15%, 3.75%, and 5.33% for epilepsy and Alzheimer's disease (long and local projections), respectively. The morphological features were better for the ischemia set, with a difference of 3.05%. Features such as variance, Spearman auto-correlation, partial auto-correlation, mutual information, and local minima and maxima, all related to the time series, exhibited the best performance. We also compared different evaluators, among which ReliefF was the best ranked.
Comparing Features for Classification of MEG Responses to Motor Imagery
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Background Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. Methods MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio—spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. Results The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. Conclusions We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system. PMID:27992574
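Among the listed features, CSP has a compact closed form: the spatial filters are extreme generalized eigenvectors of the two class covariance matrices, and the classifier input is the log-variance of the filtered trials. The numpy/scipy sketch below shows that generic computation under assumed trial shapes; it is not the authors' pipeline.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_pairs=3):
    """Common spatial patterns for two classes of trials, each trial shaped
    (n_channels, n_samples). Returns 2*n_pairs spatial filters (rows)."""
    mean_cov = lambda trials: np.mean([np.cov(t) for t in trials], axis=0)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # Generalized eigenvalue problem Ca w = lambda (Ca + Cb) w.
    evals, evecs = eigh(Ca, Ca + Cb)
    order = np.argsort(evals)
    picks = np.r_[order[:n_pairs], order[-n_pairs:]]   # the two discriminative extremes
    return evecs[:, picks].T

def csp_features(trial, filters):
    """Normalized log-variance of the spatially filtered trial."""
    z = filters @ trial
    var = z.var(axis=1)
    return np.log(var / var.sum())
```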
Halldin, Cara N; Petsonk, Edward L; Laney, A Scott
2014-03-01
Chest radiographs are recommended for prevention and detection of pneumoconiosis. In 2011, the International Labour Office (ILO) released a revision of the International Classification of Radiographs of Pneumoconioses that included a digitized standard images set. The present study compared results of classifications of digital chest images performed using the new ILO 2011 digitized standard images to classification approaches used in the past. Underground coal miners (N = 172) were examined using both digital and film-screen radiography (FSR) on the same day. Seven National Institute for Occupational Safety and Health-certified B Readers independently classified all 172 digital radiographs, once using the ILO 2011 digitized standard images (DRILO2011-D) and once using digitized standard images used in the previous research (DRRES). The same seven B Readers classified all the miners' chest films using the ILO film-based standards. Agreement between classifications of FSR and digital radiography was identical, using a standard image set (either DRILO2011-D or DRRES). The overall weighted κ value was 0.58. Some specific differences in the results were seen and noted. However, intrareader variability in this study was similar to the published values and did not appear to be affected by the use of the new ILO 2011 digitized standard images. These findings validate the use of the ILO digitized standard images for classification of small pneumoconiotic opacities. When digital chest radiographs are obtained and displayed appropriately, results of pneumoconiosis classifications using the 2011 ILO digitized standards are comparable to film-based ILO classifications and to classifications using earlier research standards. Published by Elsevier Inc.
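The agreement statistic quoted here, a weighted kappa, can be computed for example with scikit-learn; the category codes below are invented for illustration and are not the study's readings.

```python
from sklearn.metrics import cohen_kappa_score

# Invented ILO profusion categories (0-3) from two readings of the same radiographs;
# linear weights penalize larger disagreements more heavily.
film_reading    = [0, 1, 2, 1, 0, 3, 2, 1]
digital_reading = [0, 1, 1, 1, 0, 3, 2, 2]
print(cohen_kappa_score(film_reading, digital_reading, weights="linear"))
```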
2012-01-01
Background Electromyography (EMG) pattern-recognition based control strategies for multifunctional myoelectric prosthesis systems have been studied commonly in a controlled laboratory setting. Before these myoelectric prosthesis systems are clinically viable, it will be necessary to assess the effect of some disparities between the ideal laboratory setting and practical use on the control performance. One important obstacle is the impact of arm position variation that causes changes in the EMG pattern when performing identical motions in different arm positions. This study aimed to investigate the impacts of arm position variation on EMG pattern-recognition based motion classification in upper-limb amputees and the solutions for reducing these impacts. Methods With five unilateral transradial (TR) amputees, the EMG signals and tri-axial accelerometer mechanomyography (ACC-MMG) signals were simultaneously collected from both amputated and intact arms when performing six classes of arm and hand movements in each of five arm positions that were considered in the study. The effect of the arm position changes was estimated in terms of motion classification error and compared between amputated and intact arms. Then the performance of three proposed methods in attenuating the impact of arm positions was evaluated. Results With EMG signals, the average intra-position and inter-position classification errors across all five arm positions and five subjects were around 7.3% and 29.9% from amputated arms, respectively, about 1.0% and 10% lower than those from intact arms. While ACC-MMG signals could yield an intra-position classification error (9.9%) similar to that of EMG, they had much higher inter-position classification error with an average value of 81.1% over the arm positions and the subjects. When the EMG data from all five arm positions were involved in the training set, the average classification error reached a value of around 10.8% for amputated arms. Using a two-stage cascade classifier, the average classification error was around 9.0% over all five arm positions. Reducing ACC-MMG channels from 8 to 2 only increased the average position classification error across all five arm positions from 0.7% to 1.0% in amputated arms. Conclusions The performance of the EMG pattern-recognition based method in classifying movements strongly depends on arm positions. This dependency is a little stronger in the intact arm than in the amputated arm, which suggests that investigations associated with practical use of a myoelectric prosthesis should use limb amputees as subjects instead of able-bodied subjects. The two-stage cascade classifier mode with ACC-MMG for limb position identification and EMG for limb motion classification may be a promising way to reduce the effect of limb position variation on classification performance. PMID:23036049
Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization
NASA Astrophysics Data System (ADS)
Jamaluddin; Siringoringo, Rimbun
2017-12-01
Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification problems. The main drawback of FkNN is that its parameters, the number of neighbors (k) and the fuzzy strength (m), are difficult to determine. Both parameters are very sensitive, and no theory or guideline can deduce what proper values of 'm' and 'k' should be, which makes FkNN difficult to control. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best values of 'k' and 'm'. MPSO is based on the constriction factor method, an improvement of PSO designed to avoid getting trapped in local optima. The proposed model was tested on the German Credit Dataset, a standard benchmark from the UCI Machine Learning Repository that is widely used for classification problems. The application of MPSO to the determination of the FkNN parameters is expected to increase classification performance. The experiments indicate that the model offered in this research yields better classification performance than the plain FkNN model: the proposed model reaches an accuracy of 81%, whereas the plain FkNN model reaches 70%. Finally, the proposed model is compared with two other classification models, Naive Bayes and Decision Tree; it again performs better, with Naive Bayes reaching an accuracy of 75% and the decision tree model 70%.
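A minimal sketch of PSO with Clerc's constriction factor is given below; the fitness function that would score a candidate (k, m) pair by FkNN cross-validation error is left as a placeholder, and all parameter values are generic assumptions rather than the study's settings.

```python
import numpy as np

def constriction_pso(fitness, bounds, n_particles=20, n_iter=50, c1=2.05, c2=2.05, seed=0):
    """Minimal PSO with Clerc's constriction factor. Here 'fitness' would score a
    candidate (k, m) pair, e.g. by FkNN cross-validation error (not implemented)."""
    rng = np.random.default_rng(seed)
    phi = c1 + c2                                                 # must be > 4
    chi = 2.0 / abs(2.0 - phi - np.sqrt(phi**2 - 4.0 * phi))      # ~0.729 for phi = 4.1
    lo, hi = np.asarray(bounds, dtype=float).T
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.array([fitness(p) for p in x])
    gbest = pbest[pbest_f.argmin()]
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = chi * (v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x))
        x = np.clip(x + v, lo, hi)
        f = np.array([fitness(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[pbest_f.argmin()]
    return gbest

# Toy usage with a quadratic stand-in for the FkNN error surface over (k, m):
print(constriction_pso(lambda p: (p[0] - 5) ** 2 + (p[1] - 2) ** 2, [(1, 30), (1, 5)]))
```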
Pre-operative prediction of surgical morbidity in children: comparison of five statistical models.
Cooper, Jennifer N; Wei, Lai; Fernandez, Soledad A; Minneci, Peter C; Deans, Katherine J
2015-02-01
The accurate prediction of surgical risk is important to patients and physicians. Logistic regression (LR) models are typically used to estimate these risks. However, in the fields of data mining and machine learning, many alternative classification and prediction algorithms have been developed. This study aimed to compare the performance of LR to several data mining algorithms for predicting 30-day surgical morbidity in children. We used the 2012 National Surgical Quality Improvement Program-Pediatric dataset to compare the performance of (1) an LR model that assumed linearity and additivity (simple LR model), (2) an LR model incorporating restricted cubic splines and interactions (flexible LR model), (3) a support vector machine, (4) a random forest, and (5) boosted classification trees for predicting surgical morbidity. The ensemble-based methods showed significantly higher accuracy, sensitivity, specificity, PPV, and NPV than the simple LR model. However, none of the models performed better than the flexible LR model in terms of the aforementioned measures or in model calibration or discrimination. Support vector machines, random forests, and boosted classification trees do not show better performance than LR for predicting pediatric surgical morbidity. After further validation, the flexible LR model derived in this study could be used to assist with clinical decision-making based on patient-specific surgical risks. Copyright © 2014 Elsevier Ltd. All rights reserved.
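A comparison of this general shape can be sketched in scikit-learn as below; note that SplineTransformer yields B-spline expansions rather than the restricted cubic splines with interactions used in the study, and the predictors, outcome and hyperparameters are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import SplineTransformer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X = np.random.rand(1000, 6)           # placeholder pre-operative predictors
y = np.random.randint(0, 2, 1000)     # 1 = 30-day morbidity

models = {
    "simple LR":     LogisticRegression(max_iter=1000),
    "flexible LR":   make_pipeline(SplineTransformer(n_knots=5, degree=3),
                                   LogisticRegression(max_iter=1000)),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUC = {auc:.3f}")
```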
Enhancing Breast Cancer Recurrence Algorithms Through Selective Use of Medical Record Data.
Kroenke, Candyce H; Chubak, Jessica; Johnson, Lisa; Castillo, Adrienne; Weltzien, Erin; Caan, Bette J
2016-03-01
The utility of data-based algorithms in research has been questioned because of errors in identification of cancer recurrences. We adapted previously published breast cancer recurrence algorithms, selectively using medical record (MR) data to improve classification. We evaluated second breast cancer event (SBCE) and recurrence-specific algorithms previously published by Chubak and colleagues in 1535 women from the Life After Cancer Epidemiology (LACE) and 225 women from the Women's Health Initiative cohorts and compared classification statistics to published values. We also sought to improve classification with minimal MR examination. We selected pairs of algorithms-one with high sensitivity/high positive predictive value (PPV) and another with high specificity/high PPV-using MR information to resolve discrepancies between algorithms, properly classifying events based on review; we called this "triangulation." Finally, in LACE, we compared associations between breast cancer survival risk factors and recurrence using MR data, single Chubak algorithms, and triangulation. The SBCE algorithms performed well in identifying SBCE and recurrences. Recurrence-specific algorithms performed more poorly than published except for the high-specificity/high-PPV algorithm, which performed well. The triangulation method (sensitivity = 81.3%, specificity = 99.7%, PPV = 98.1%, NPV = 96.5%) improved recurrence classification over two single algorithms (sensitivity = 57.1%, specificity = 95.5%, PPV = 71.3%, NPV = 91.9%; and sensitivity = 74.6%, specificity = 97.3%, PPV = 84.7%, NPV = 95.1%), with 10.6% MR review. Triangulation performed well in survival risk factor analyses vs analyses using MR-identified recurrences. Use of multiple recurrence algorithms in administrative data, in combination with selective examination of MR data, may improve recurrence data quality and reduce research costs. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Enhancing Breast Cancer Recurrence Algorithms Through Selective Use of Medical Record Data
Chubak, Jessica; Johnson, Lisa; Castillo, Adrienne; Weltzien, Erin; Caan, Bette J.
2016-01-01
Abstract Background: The utility of data-based algorithms in research has been questioned because of errors in identification of cancer recurrences. We adapted previously published breast cancer recurrence algorithms, selectively using medical record (MR) data to improve classification. Methods: We evaluated second breast cancer event (SBCE) and recurrence-specific algorithms previously published by Chubak and colleagues in 1535 women from the Life After Cancer Epidemiology (LACE) and 225 women from the Women’s Health Initiative cohorts and compared classification statistics to published values. We also sought to improve classification with minimal MR examination. We selected pairs of algorithms—one with high sensitivity/high positive predictive value (PPV) and another with high specificity/high PPV—using MR information to resolve discrepancies between algorithms, properly classifying events based on review; we called this “triangulation.” Finally, in LACE, we compared associations between breast cancer survival risk factors and recurrence using MR data, single Chubak algorithms, and triangulation. Results: The SBCE algorithms performed well in identifying SBCE and recurrences. Recurrence-specific algorithms performed more poorly than published except for the high-specificity/high-PPV algorithm, which performed well. The triangulation method (sensitivity = 81.3%, specificity = 99.7%, PPV = 98.1%, NPV = 96.5%) improved recurrence classification over two single algorithms (sensitivity = 57.1%, specificity = 95.5%, PPV = 71.3%, NPV = 91.9%; and sensitivity = 74.6%, specificity = 97.3%, PPV = 84.7%, NPV = 95.1%), with 10.6% MR review. Triangulation performed well in survival risk factor analyses vs analyses using MR-identified recurrences. Conclusions: Use of multiple recurrence algorithms in administrative data, in combination with selective examination of MR data, may improve recurrence data quality and reduce research costs. PMID:26582243
Spatial-temporal discriminant analysis for ERP-based brain-computer interface.
Zhang, Yu; Zhou, Guoxu; Zhao, Qibin; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
2013-03-01
Linear discriminant analysis (LDA) has been widely adopted to classify event-related potentials (ERP) in brain-computer interfaces (BCI). Good classification performance of an ERP-based BCI usually requires sufficient data recordings for effective training of the LDA classifier, and hence a long system calibration time, which may reduce the system's practicability and cause user resistance to the BCI. In this study, we introduce a spatial-temporal discriminant analysis (STDA) for ERP classification. As a multiway extension of LDA, the STDA method tries to maximize the discriminant information between target and nontarget classes by collaboratively finding two projection matrices in the spatial and temporal dimensions, which effectively reduces the feature dimensionality in the discriminant analysis and hence significantly decreases the number of required training samples. The proposed STDA method was validated on dataset II of the BCI Competition III and on a dataset recorded in our own experiments, and compared to state-of-the-art algorithms for ERP classification. Online experiments were additionally implemented for the validation. The superior classification performance with few training samples shows that STDA is effective in reducing the system calibration time and improving the classification accuracy, thereby enhancing the practicability of ERP-based BCI.
Scheme, Erik J; Englehart, Kevin B
2013-07-01
When controlling a powered upper limb prosthesis it is important not only to know how to move the device, but also when not to move. A novel approach to pattern recognition control, using a selective multiclass one-versus-one classification scheme has been shown to be capable of rejecting unintended motions. This method was shown to outperform other popular classification schemes when presented with muscle contractions that did not correspond to desired actions. In this work, a 3-D Fitts' Law test is proposed as a suitable alternative to using virtual limb environments for evaluating real-time myoelectric control performance. The test is used to compare the selective approach to a state-of-the-art linear discriminant analysis classification based scheme. The framework is shown to obey Fitts' Law for both control schemes, producing linear regression fittings with high coefficients of determination (R(2) > 0.936). Additional performance metrics focused on quality of control are discussed and incorporated in the evaluation. Using this framework the selective classification based scheme is shown to produce significantly higher efficiency and completion rates, and significantly lower overshoot and stopping distances, with no significant difference in throughput.
Ensemble Pruning for Glaucoma Detection in an Unbalanced Data Set.
Adler, Werner; Gefeller, Olaf; Gul, Asma; Horn, Folkert K; Khan, Zardad; Lausen, Berthold
2016-12-07
Random forests are successful classifier ensemble methods consisting of typically 100 to 1000 classification trees. Ensemble pruning techniques reduce the computational cost, especially the memory demand, of random forests by reducing the number of trees without relevant loss of performance or even with increased performance of the sub-ensemble. The application to the problem of an early detection of glaucoma, a severe eye disease with low prevalence, based on topographical measurements of the eye background faces specific challenges. We examine the performance of ensemble pruning strategies for glaucoma detection in an unbalanced data situation. The data set consists of 102 topographical features of the eye background of 254 healthy controls and 55 glaucoma patients. We compare the area under the receiver operating characteristic curve (AUC), and the Brier score on the total data set, in the majority class, and in the minority class of pruned random forest ensembles obtained with strategies based on the prediction accuracy of greedily grown sub-ensembles, the uncertainty weighted accuracy, and the similarity between single trees. To validate the findings and to examine the influence of the prevalence of glaucoma in the data set, we additionally perform a simulation study with lower prevalences of glaucoma. In glaucoma classification all three pruning strategies lead to improved AUC and smaller Brier scores on the total data set with sub-ensembles as small as 30 to 80 trees compared to the classification results obtained with the full ensemble consisting of 1000 trees. In the simulation study, we were able to show that the prevalence of glaucoma is a critical factor and lower prevalence decreases the performance of our pruning strategies. The memory demand for glaucoma classification in an unbalanced data situation based on random forests could effectively be reduced by the application of pruning strategies without loss of performance in a population with increased risk of glaucoma.
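One of the pruning ideas described here, greedily growing a sub-ensemble by validation performance, can be sketched as follows; the AUC-driven greedy loop, the validation split and the target size are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def greedy_prune(forest, X_val, y_val, target_size=50):
    """Greedily grow a sub-ensemble: at each step add the tree that maximizes
    validation AUC of the averaged class-1 probability."""
    remaining = list(forest.estimators_)
    probs = [tree.predict_proba(X_val)[:, 1] for tree in remaining]
    chosen, running = [], np.zeros(len(y_val))
    while len(chosen) < target_size and remaining:
        scores = [roc_auc_score(y_val, (running + p) / (len(chosen) + 1)) for p in probs]
        best = int(np.argmax(scores))
        running += probs.pop(best)
        chosen.append(remaining.pop(best))
    return chosen

# Assumed usage with a fitted 1000-tree forest and a held-out validation split:
# sub_ensemble = greedy_prune(rf_fitted, X_val, y_val, target_size=50)
```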
A novel deep learning approach for classification of EEG motor imagery signals.
Tabar, Yousef Rezaei; Halici, Ugur
2017-02-01
Signal classification is an important issue in brain-computer interface (BCI) systems. Deep learning approaches have been used successfully in many recent studies to learn features and classify different types of data. However, the number of studies that employ these approaches in BCI applications is very limited. In this study we aim to use deep learning methods to improve the classification performance of EEG motor imagery signals. We investigate convolutional neural networks (CNN) and stacked autoencoders (SAE) to classify EEG motor imagery signals. A new form of input is introduced that combines time, frequency and location information extracted from the EEG signal, and it is used in a CNN with one 1D convolutional layer and one max-pooling layer. We also propose a new deep network that combines the CNN and the SAE: the features extracted by the CNN are classified through the SAE. The classification performance obtained by the proposed method on BCI competition IV dataset 2b in terms of kappa value is 0.547. Our approach yields a 9% improvement over the winning algorithm of the competition. Our results show that deep learning methods provide better classification performance compared to other state-of-the-art approaches. These methods can be applied successfully to BCI systems where the amount of data is large due to daily recording.
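A minimal PyTorch sketch in the spirit of the described network, one 1D convolutional layer and one max-pooling layer feeding a linear read-out, is shown below; the input shape, channel count and kernel size are assumptions, and the SAE stage is omitted.

```python
import torch
import torch.nn as nn

class MICNN(nn.Module):
    """One 1D convolutional layer and one max-pooling layer feeding a linear read-out."""
    def __init__(self, n_inputs=3, n_samples=256, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_inputs, 16, kernel_size=11, padding=5),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=4),
        )
        self.classifier = nn.Linear(16 * (n_samples // 4), n_classes)

    def forward(self, x):            # x: (batch, n_inputs, n_samples)
        return self.classifier(self.features(x).flatten(1))

model = MICNN()
dummy = torch.randn(8, 3, 256)       # 8 trials, 3 input maps, 256 time points
print(model(dummy).shape)            # torch.Size([8, 2])
```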
An assessment of the effectiveness of a random forest classifier for land-cover classification
NASA Astrophysics Data System (ADS)
Rodriguez-Galiano, V. F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. P.
2012-01-01
Land cover monitoring using remotely sensed data requires robust classification methods which allow for the accurate mapping of complex land cover and land use categories. Random forest (RF) is a powerful machine learning classifier that is relatively unknown in land remote sensing and has not been evaluated thoroughly by the remote sensing community compared to more conventional pattern recognition techniques. Key advantages of RF include its non-parametric nature, high classification accuracy, and capability to determine variable importance. However, the split rules for classification are unknown; therefore, RF can be considered a black-box type of classifier. RF provides an algorithm for estimating missing values and the flexibility to perform several types of data analysis, including regression, classification, survival analysis, and unsupervised learning. In this paper, the performance of the RF classifier for land cover classification of a complex area is explored. Evaluation was based on several criteria: mapping accuracy, sensitivity to data set size and noise. Landsat-5 Thematic Mapper data captured in European spring and summer were used with auxiliary variables derived from a digital terrain model to classify 14 different land categories in the south of Spain. Results show that the RF algorithm yields accurate land cover classifications, with 92% overall accuracy and a Kappa index of 0.92. RF is robust to training data reduction and noise because significant differences in kappa values were only observed for data reduction and noise addition values greater than 50 and 20%, respectively. Additionally, variables that RF identified as most important for classifying land cover coincided with expectations. A McNemar test indicates an overall better performance of the random forest model over a single decision tree at the 0.00001 significance level.
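The McNemar comparison mentioned at the end can be reproduced with statsmodels from a 2x2 table of agreement/disagreement counts between the two classifiers on the validation pixels; the counts below are invented for illustration.

```python
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Invented 2x2 table over validation pixels:
# rows = RF correct / RF wrong, columns = single tree correct / single tree wrong.
table = np.array([[850, 60],
                  [ 15, 75]])
result = mcnemar(table, exact=False, correction=True)   # chi-square version
print(result.statistic, result.pvalue)
```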
Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine
NASA Astrophysics Data System (ADS)
Zhong-bao, Liu; Wen-ai, Song; Jing, Zhang; Wen-juan, Zhao
2017-06-01
Classification is one of the important tasks in astronomy, especially in spectral analysis. The Support Vector Machine (SVM) is a typical classification method that is widely used in spectra classification. Although it performs well in practice, its classification accuracy cannot be greatly improved because of two limitations: it does not take the distribution of the classes into consideration, and it is sensitive to noise. In order to solve these problems, a fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper, inspired by Fisher's Discriminant Analysis (FDA) and the SVM separability constraints. In FMWSVM, the distribution of the classes is reflected by the within-class scatter of FDA, and a fuzzy membership function is introduced to decrease the influence of noise. Comparative experiments with SVM on SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.
Classification of standard-like heterotic-string vacua
NASA Astrophysics Data System (ADS)
Faraggi, Alon E.; Rizos, John; Sonmez, Hasan
2018-02-01
We extend the free fermionic classification methodology to the class of standard-like heterotic-string vacua, in which the SO(10) GUT symmetry is broken at the string level to SU(3) × SU(2) × U(1)². The space of GGSO free phase configurations in this case is vastly enlarged compared to the corresponding SO(6) × SO(4) and SU(5) × U(1) vacua. Extracting substantial numbers of phenomenologically viable models therefore requires a modification of the classification methods. This is achieved by identifying conditions on the GGSO projection coefficients, which are satisfied at the SO(10) level by random phase configurations, and that lead to three-generation models with the SO(10) symmetry broken to the SU(3) × SU(2) × U(1)² subgroup. Around each of these fertile SO(10) configurations, we perform a complete classification of standard-like models, by adding the SO(10) symmetry-breaking basis vectors and scanning all the associated GGSO phases. Following this methodology we are able to generate some 10⁷ three-generation Standard-like Models. We present the results of the classification and one exemplary model with distinct phenomenological properties, compared to previous SLM constructions.
A comparison of fitness-case sampling methods for genetic programming
NASA Astrophysics Data System (ADS)
Martínez, Yuliana; Naredo, Enrique; Trujillo, Leonardo; Legrand, Pierrick; López, Uriel
2017-11-01
Genetic programming (GP) is an evolutionary computation paradigm for automatic program induction. GP has produced impressive results but it still needs to overcome some practical limitations, particularly its high computational cost, overfitting and excessive code growth. Recently, many researchers have proposed fitness-case sampling methods to overcome some of these problems, with mixed results in several limited tests. This paper presents an extensive comparative study of four fitness-case sampling methods, namely: Interleaved Sampling, Random Interleaved Sampling, Lexicase Selection and Keep-Worst Interleaved Sampling. The algorithms are compared on 11 symbolic regression problems and 11 supervised classification problems, using 10 synthetic benchmarks and 12 real-world data-sets. They are evaluated based on test performance, overfitting and average program size, comparing them with a standard GP search. Comparisons are carried out using non-parametric multigroup tests and post hoc pairwise statistical tests. The experimental results suggest that fitness-case sampling methods are particularly useful for difficult real-world symbolic regression problems, improving performance, reducing overfitting and limiting code growth. On the other hand, it seems that fitness-case sampling cannot improve upon GP performance when considering supervised binary classification.
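Of the four sampling methods compared, lexicase selection is the easiest to state compactly; the sketch below is a generic implementation of that selection step (individuals and their per-case errors are left abstract), not the authors' GP system.

```python
import random

def lexicase_select(population, errors, rng=random):
    """Lexicase selection: stream the fitness cases in random order and keep,
    at each case, only the individuals with the lowest error on that case.
    errors[i][c] is the error of individual i on fitness case c."""
    candidates = list(range(len(population)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)
    for c in cases:
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] == best]
        if len(candidates) == 1:
            break
    return population[rng.choice(candidates)]

# Toy usage: 4 individuals evaluated on 3 fitness cases.
pop = ["prog0", "prog1", "prog2", "prog3"]
errs = [[0, 1, 2], [0, 0, 3], [1, 0, 0], [2, 2, 2]]
print(lexicase_select(pop, errs))
```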
Improving the performance of extreme learning machine for hyperspectral image classification
NASA Astrophysics Data System (ADS)
Li, Jiaojiao; Du, Qian; Li, Wei; Li, Yunsong
2015-05-01
Extreme learning machine (ELM) and kernel ELM (KELM) can offer performance comparable to the standard powerful classifier, the support vector machine (SVM), but with much lower computational cost due to an extremely simple training step. However, their performance may be sensitive to several parameters, such as the number of hidden neurons. An empirical linear relationship between the number of training samples and the number of hidden neurons is proposed. Such a relationship can be easily estimated with two small training sets and extended to large training sets so as to greatly reduce computational cost. Other parameters, such as the steepness parameter in the sigmoidal activation function and the regularization parameter in the KELM, are also investigated. The experimental results show that classification performance is sensitive to these parameters; fortunately, simple selections will result in suboptimal performance.
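For reference, the "extremely simple training step" of a basic ELM amounts to a random hidden layer followed by a least-squares solve for the output weights, as in the numpy sketch below; the hidden-layer size, sigmoid activation and one-hot coding are generic choices, not the parameter rules proposed in the paper.

```python
import numpy as np

def train_elm(X, y, n_hidden=200, seed=0):
    """Basic ELM: random input weights, sigmoid hidden layer, and output weights
    obtained in closed form by a least-squares (pseudoinverse) solve."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))     # hidden-layer activations
    T = np.eye(int(y.max()) + 1)[y]            # one-hot class targets
    beta = np.linalg.pinv(H) @ T               # output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta).argmax(axis=1)

# X: (n_pixels, n_bands) hyperspectral samples, y: integer class labels starting at 0
```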
A Support Vector Machine-Based Gender Identification Using Speech Signal
NASA Astrophysics Data System (ADS)
Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk
We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding a nonlinear decision boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel approach incorporating a feature fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed feature fusion technique is applied.
Comparison of Classification Methods for Detecting Emotion from Mandarin Speech
NASA Astrophysics Data System (ADS)
Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng
It is said that technology comes from humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. If computers are able to perceive and respond to human emotion, human-computer interaction will become more natural. Several classifiers are adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions, including anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results still show that the proposed WD-MKNN outperforms others.
Plenis, Alina; Rekowska, Natalia; Bączek, Tomasz
2016-01-01
This article focuses on correlating the column classification obtained from the method created at the Katholieke Universiteit Leuven (KUL), with the chromatographic resolution attained in biomedical separation. In the KUL system, each column is described with four parameters, which enables estimation of the FKUL value characterising similarity of those parameters to the selected reference stationary phase. Thus, a ranking list based on the FKUL value can be calculated for the chosen reference column, then correlated with the results of the column performance test. In this study, the column performance test was based on analysis of moclobemide and its two metabolites in human plasma by liquid chromatography (LC), using 18 columns. The comparative study was performed using traditional correlation of the FKUL values with the retention parameters of the analytes describing the column performance test. In order to deepen the comparative assessment of both data sets, factor analysis (FA) was also used. The obtained results indicated that the stationary phase classes, closely related according to the KUL method, yielded comparable separation for the target substances. Therefore, the column ranking system based on the FKUL-values could be considered supportive in the choice of the appropriate column for biomedical analysis. PMID:26805819
Plenis, Alina; Rekowska, Natalia; Bączek, Tomasz
2016-01-21
This article focuses on correlating the column classification obtained from the method created at the Katholieke Universiteit Leuven (KUL), with the chromatographic resolution attained in biomedical separation. In the KUL system, each column is described with four parameters, which enables estimation of the FKUL value characterising similarity of those parameters to the selected reference stationary phase. Thus, a ranking list based on the FKUL value can be calculated for the chosen reference column, then correlated with the results of the column performance test. In this study, the column performance test was based on analysis of moclobemide and its two metabolites in human plasma by liquid chromatography (LC), using 18 columns. The comparative study was performed using traditional correlation of the FKUL values with the retention parameters of the analytes describing the column performance test. In order to deepen the comparative assessment of both data sets, factor analysis (FA) was also used. The obtained results indicated that the stationary phase classes, closely related according to the KUL method, yielded comparable separation for the target substances. Therefore, the column ranking system based on the FKUL-values could be considered supportive in the choice of the appropriate column for biomedical analysis.
NASA Astrophysics Data System (ADS)
Fonseca, Pablo; Mendoza, Julio; Wainer, Jacques; Ferrer, Jose; Pinto, Joseph; Guerrero, Jorge; Castaneda, Benjamin
2015-03-01
Breast parenchymal density is considered a strong indicator of breast cancer risk and therefore useful for preventive tasks. Measurement of breast density is often qualitative and requires the subjective judgment of radiologists. Here we explore an automatic breast composition classification workflow based on convolutional neural networks for feature extraction in combination with a support vector machines classifier. This is compared to the assessments of seven experienced radiologists. The experiments yielded an average kappa value of 0.58 when using the mode of the radiologists' classifications as ground truth. Individual radiologist performance against this ground truth yielded kappa values between 0.56 and 0.79.
NASA Astrophysics Data System (ADS)
Diamant, Idit; Shalhon, Moran; Goldberger, Jacob; Greenspan, Hayit
2016-03-01
Classification of clustered breast microcalcifications into benign and malignant categories is an extremely challenging task for computerized algorithms and expert radiologists alike. In this paper we present a novel method for feature selection based on mutual information (MI) criterion for automatic classification of microcalcifications. We explored the MI based feature selection for various texture features. The proposed method was evaluated on a standardized digital database for screening mammography (DDSM). Experimental results demonstrate the effectiveness and the advantage of using the MI-based feature selection to obtain the most relevant features for the task and thus to provide for improved performance as compared to using all features.
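A mutual-information-based selection step of this general kind can be written with scikit-learn as below; the texture-feature matrix, the value of k and the downstream SVM are placeholders rather than the authors' configuration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X = np.random.rand(200, 120)          # placeholder texture features per lesion
y = np.random.randint(0, 2, 200)      # 0 = benign, 1 = malignant

# Keep the k features with the highest estimated mutual information with the label,
# then classify the reduced representation.
clf = make_pipeline(SelectKBest(mutual_info_classif, k=20), SVC(kernel="rbf", C=1.0))
print(cross_val_score(clf, X, y, cv=5).mean())
```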
Mitry, Danny; Peto, Tunde; Hayat, Shabina; Morgan, James E; Khaw, Kay-Tee; Foster, Paul J
2013-01-01
Crowdsourcing is the process of outsourcing numerous tasks to many untrained individuals. Our aim was to assess the performance and repeatability of crowdsourcing for the classification of retinal fundus photography. One hundred retinal fundus photograph images with pre-determined disease criteria were selected by experts from a large cohort study. After reading brief instructions and an example classification, we requested that knowledge workers (KWs) from a crowdsourcing platform classify each image as normal or abnormal with grades of severity. Each image was classified 20 times by different KWs. Four study designs were examined to assess the effect of varying incentive and KW experience on classification accuracy. All study designs were conducted twice to examine repeatability. Performance was assessed by comparing the sensitivity, specificity and area under the receiver operating characteristic curve (AUC). Without restriction on eligible participants, two thousand classifications of 100 images were received in under 24 hours at minimal cost. In trial 1 all study designs had an AUC (95%CI) of 0.701 (0.680-0.721) or greater for classification of normal/abnormal. In trial 1, the highest AUC (95%CI) for normal/abnormal classification was 0.757 (0.738-0.776) for KWs with moderate experience. Comparable results were observed in trial 2. In trial 1, between 64% and 86% of abnormal images were correctly classified by over half of all KWs. In trial 2, this ranged between 74% and 97%. Sensitivity was ≥ 96% for normal versus severely abnormal detections across all trials. Sensitivity for normal versus mildly abnormal varied between 61% and 79% across trials. With minimal training, crowdsourcing represents an accurate, rapid and cost-effective method of retinal image analysis which demonstrates good repeatability. Larger studies with more comprehensive participant training are needed to explore the utility of this compelling technique in large scale medical image analysis.
Robust spike classification based on frequency domain neural waveform features.
Yang, Chenhui; Yuan, Yuan; Si, Jennie
2013-12-01
We introduce a new spike classification algorithm based on frequency domain features of the spike snippets. The goal for the algorithm is to provide high classification accuracy, low false misclassification, ease of implementation, robustness to signal degradation, and objectivity in classification outcomes. In this paper, we propose a spike classification algorithm based on frequency domain features (CFDF). It makes use of the frequency domain contents of the recorded neural waveforms for spike classification. The self-organizing map (SOM) is used as a tool to determine the cluster number intuitively and directly by viewing the SOM output map. After that, spike classification can be easily performed using clustering algorithms such as k-means. In conjunction with our previously developed multiscale correlation of wavelet coefficient (MCWC) spike detection algorithm, we show that the MCWC and CFDF detection and classification system is robust when tested on several sets of artificial and real neural waveforms. The CFDF is comparable to or outperforms some popular automatic spike classification algorithms with artificial and real neural data. The detection and classification of neural action potentials or neural spikes is an important step in single-unit-based neuroscientific studies and applications. After the detection of neural snippets potentially containing neural spikes, a robust classification algorithm is applied for the analysis of the snippets to (1) group similar waveforms into one class so that they can be considered as coming from one unit, and (2) remove noise snippets that do not contain any features of an action potential. Usually, a snippet is a small 2 or 3 ms segment of the recorded waveform, and differences in neural action potentials can be subtle from one unit to another. Therefore, a robust, high performance classification system like the CFDF is necessary. In addition, the proposed algorithm does not require any assumptions on statistical properties of the noise and proves to be robust under noise contamination.
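The core of the described approach, clustering snippets on frequency-domain features, can be sketched as follows; here the cluster count is passed in directly and k-means is used, whereas the paper chooses the number of clusters by inspecting a SOM, so the function is a simplified stand-in.

```python
import numpy as np
from sklearn.cluster import KMeans

def classify_spikes(snippets, n_clusters, n_coeffs=16):
    """Cluster spike snippets on frequency-domain features: the magnitudes of the
    first n_coeffs rFFT coefficients of each snippet, normalized to unit sum."""
    snippets = np.asarray(snippets, dtype=float)
    feats = np.abs(np.fft.rfft(snippets, axis=1))[:, :n_coeffs]
    feats /= feats.sum(axis=1, keepdims=True)
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(feats)

# snippets: (n_snippets, n_samples) array of 2-3 ms waveform segments;
# labels = classify_spikes(snippets, n_clusters=3)
```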
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas, and the four cover type classes were pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster dataset was created comprising four raster layers, each showing the percentage of one cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification, for which a traditional error matrix was used. The sub-pixel classification using the aerial photo for both training and reference data had the highest overall accuracy (65%) of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10-percent intervals to five classes with 20-percent intervals. When compared to the supervised classification, which had a satisfactory overall accuracy of 90%, none of the sub-pixel classifications achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% acceptance threshold for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
Evaluation of rock classifications at B. C. Rail tumbler ridge tunnels
NASA Astrophysics Data System (ADS)
Kaiser, Peter K.; Mackay, C.; Gale, A. D.
1986-10-01
Construction of four single track railway tunnels through sedimentary rocks in central British Columbia, Canada, provided an excellent opportunity to compare various rock mass classification systems and to evaluate their applicability to the local geology. The tunnels were excavated by conventional drilling and blasting techniques and supported primarily with rock bolts and shotcrete, and with steel sets in some sections. After a brief project description including tunnel construction techniques, local geology and groundwater conditions, the data collection and field mapping procedure is reviewed. Four rock mass classification systems (RQD, RSR, RMR, Q) for empirical tunnel design are reviewed and relevant factors for the data interpretation are discussed. In comparing and evaluating the performance of these classification systems, three aspects received special attention. The tunnel support predicted by the various systems was compared to the support installed, a unique correlation between the two most useful and most frequently applied classifications, the RMR and Q systems, was established and assessed, and finally, the non-support limit and size effect were evaluated. It is concluded that the Q-system best predicted the required tunnel support and that the RMR system was adequate only after adjustment for the influence of opening size. Correction equations for opening size effects are presented for the RMR system. The RSR and RQD systems are not recommended for empirical tunnel design.
NASA Astrophysics Data System (ADS)
Wilson, Machelle; Ustin, Susan L.; Rocke, David
2003-03-01
Remote sensing technologies with high spatial and spectral resolution show a great deal of promise in addressing critical environmental monitoring issues, but the ability to analyze and interpret the data lags behind the technology. Robust analytical methods are required before the wealth of data available through remote sensing can be applied to a wide range of environmental problems for which remote detection is the best method. In this study we compare the classification effectiveness of two relatively new techniques on data consisting of leaf-level reflectance from plants that have been exposed to varying levels of heavy metal toxicity. If these methodologies work well on leaf-level data, then there is some hope that they will also work well on data from airborne and space-borne platforms. The classification methods compared were support vector machine classification of exposed and non-exposed plants based on the reflectance data, and partial least squares compression of the reflectance data followed by classification using logistic discrimination (PLS/LD). PLS/LD was performed in two ways: we used the continuous concentration data as the response during compression and then used the binary response required during logistic discrimination, and we also used a binary response during compression followed by logistic discrimination. The statistic we used to compare the effectiveness of the methodologies was the leave-one-out cross-validation estimate of the prediction error.
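A minimal sketch of the two approaches being compared, assuming leaf-level reflectance spectra in X and a binary exposure label in y; PLSRegression stands in for the partial least squares compression, LogisticRegression for the logistic discrimination, and leave-one-out cross-validation supplies the prediction-error estimate. Data shapes and component counts are placeholders.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

# X: (n_plants, n_wavelengths) leaf-level reflectance; y: 0 = non-exposed, 1 = exposed.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
y = rng.integers(0, 2, size=60)

# PLS/LD: compress the reflectance with PLS (binary response used here), then apply
# logistic discrimination in the score space; error estimated by leave-one-out CV.
errors = []
for train, test in LeaveOneOut().split(X):
    pls = PLSRegression(n_components=5).fit(X[train], y[train])
    ld = LogisticRegression(max_iter=1000).fit(pls.transform(X[train]), y[train])
    errors.append(ld.predict(pls.transform(X[test]))[0] != y[test][0])
print("PLS/LD leave-one-out error:", np.mean(errors))

# SVM on the raw reflectance, same cross-validation scheme.
svm_err = 1.0 - cross_val_score(SVC(kernel="rbf", gamma="scale"), X, y, cv=LeaveOneOut()).mean()
print("SVM leave-one-out error:", svm_err)
```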
The Cluster Sensitivity Index: A Basic Measure of Classification Robustness
ERIC Educational Resources Information Center
Hom, Willard C.
2010-01-01
Analysts of institutional performance have occasionally used a peer grouping approach in which they compared institutions only to other institutions with similar characteristics. Because analysts historically have used cluster analysis to define peer groups (i.e., the group of comparable institutions), the author proposes and demonstrates with…
Objective automated quantification of fluorescence signal in histological sections of rat lens.
Talebizadeh, Nooshin; Hagström, Nanna Zhou; Yu, Zhaohua; Kronschläger, Martin; Söderberg, Per; Wählby, Carolina
2017-08-01
Visual quantification and classification of fluorescent signals is the gold standard in microscopy. The purpose of this study was to develop an automated method to delineate cells and to quantify expression of fluorescent signal of biomarkers in each nucleus and cytoplasm of lens epithelial cells in a histological section. A region of interest representing the lens epithelium was manually demarcated in each input image. Thereafter, individual cell nuclei within the region of interest were automatically delineated based on watershed segmentation and thresholding with an algorithm developed in Matlab™. Fluorescence signal was quantified within nuclei, cytoplasms and juxtaposed backgrounds. The classification of cells as labelled or not labelled was based on comparison of the fluorescence signal within cells with the local background. The classification rule was thereafter optimized as compared with visual classification of a limited dataset. The performance of the automated classification was evaluated by asking 11 independent blinded observers to classify all cells (n = 395) in one lens image. Time consumed by the automatic algorithm and visual classification of cells was recorded. On average, 77% of the cells were correctly classified as compared with the majority vote of the visual observers. The average agreement among visual observers was 83%. However, variation among visual observers was high, and agreement between two visual observers was as low as 71% in the worst case. Automated classification was on average 10 times faster than visual scoring. The presented method enables objective and fast detection of lens epithelial cells and quantification of expression of fluorescent signal with an accuracy comparable with the variability among visual observers. © 2017 International Society for Advancement of Cytometry.
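A rough outline of such a segmentation-and-scoring step using scikit-image rather than Matlab, assuming a single-channel fluorescence image and a manually drawn region-of-interest mask; the Otsu threshold, minimum peak distance and labelling ratio below are placeholders, not the values optimized in the study.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.feature import peak_local_max
from skimage.segmentation import watershed
from skimage.measure import regionprops

def classify_nuclei(image, roi_mask, ratio_threshold=1.5):
    """Delineate nuclei inside roi_mask and flag those brighter than the local background."""
    signal = image * roi_mask
    nucleus_mask = signal > threshold_otsu(signal[roi_mask > 0])
    # Watershed on the distance transform splits touching nuclei.
    distance = ndi.distance_transform_edt(nucleus_mask)
    coords = peak_local_max(distance, min_distance=5, labels=nucleus_mask)
    markers = np.zeros_like(distance, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    labels = watershed(-distance, markers, mask=nucleus_mask)

    # Score each nucleus against the median intensity of the surrounding ROI background.
    background = np.median(image[(roi_mask > 0) & (labels == 0)])
    results = []
    for region in regionprops(labels, intensity_image=image):
        labelled = region.mean_intensity > ratio_threshold * background
        results.append((region.label, region.mean_intensity, labelled))
    return labels, results
```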
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cavanaugh, J.E.; McQuarrie, A.D.; Shumway, R.H.
Conventional methods for discriminating between earthquakes and explosions at regional distances have concentrated on extracting specific features such as amplitude and spectral ratios from the waveforms of the P and S phases. We consider here an optimum nonparametric classification procedure derived from the classical approach to discriminating between two Gaussian processes with unequal spectra. Two robust variations based on the minimum discrimination information statistic and Renyi's entropy are also considered. We compare the optimum classification procedure with various amplitude and spectral ratio discriminants and show that its performance is superior when applied to a small population of 8 land-based earthquakes and 8 mining explosions recorded in Scandinavia. Several parametric characterizations of the notion of complexity based on modeling earthquakes and explosions as autoregressive or modulated autoregressive processes are also proposed and their performance compared with the nonparametric and feature extraction approaches.
NASA Astrophysics Data System (ADS)
Rahmadani, S.; Dongoran, A.; Zarlis, M.; Zakarias
2018-03-01
This paper discusses feature selection using genetic algorithms (GA) on datasets for classification problems. The classification models used are the decision tree (DT) and Naive Bayes. We discuss how the Naive Bayes and decision tree models handle the classification problem when the dataset features are selected using GA, and then compare the performance of both models to determine whether there is an increase in accuracy. The results show an increase in accuracy when feature selection with GA is applied. The proposed models are referred to as GADT (GA-Decision Tree) and GANB (GA-Naive Bayes). The datasets tested in this paper are taken from the UCI Machine Learning repository.
Mexican Hat Wavelet Kernel ELM for Multiclass Classification.
Wang, Jie; Song, Yi-Fan; Ma, Tian-Lei
2017-01-01
Kernel extreme learning machine (KELM) is a novel feedforward neural network, which is widely used in classification problems. To some extent, it solves the existing problems of the invalid nodes and the large computational complexity in ELM. However, the traditional KELM classifier usually has a low test accuracy when it faces multiclass classification problems. In order to solve the above problem, a new classifier, Mexican Hat wavelet KELM classifier, is proposed in this paper. The proposed classifier successfully improves the training accuracy and reduces the training time in the multiclass classification problems. Moreover, the validity of the Mexican Hat wavelet as a kernel function of ELM is rigorously proved. Experimental results on different data sets show that the performance of the proposed classifier is significantly superior to the compared classifiers.
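The core idea can be sketched in a few lines of NumPy: a Mexican hat (Ricker) wavelet applied coordinate-wise defines the kernel matrix, and the usual KELM closed-form solution gives the output weights. The exact kernel form and the regularization constant C are generic choices, not necessarily those of the paper.

```python
import numpy as np

def mexican_hat_kernel(X, Z, a=1.0):
    """Wavelet kernel: product over dimensions of the Ricker wavelet of (x_i - z_i)/a."""
    diff = (X[:, None, :] - Z[None, :, :]) / a
    psi = (1.0 - diff ** 2) * np.exp(-0.5 * diff ** 2)
    return np.prod(psi, axis=2)

def kelm_train(X, y, n_classes, C=10.0, a=1.0):
    """Kernel ELM: solve (K + I/C) beta = T for one-hot targets T."""
    T = np.eye(n_classes)[y]
    K = mexican_hat_kernel(X, X, a)
    return np.linalg.solve(K + np.eye(len(X)) / C, T)

def kelm_predict(X_train, beta, X_test, a=1.0):
    """Class with the largest kernel output wins."""
    return np.argmax(mexican_hat_kernel(X_test, X_train, a) @ beta, axis=1)

# Toy multiclass example.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))
y = rng.integers(0, 3, size=90)
beta = kelm_train(X, y, n_classes=3)
pred = kelm_predict(X, beta, X)
```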
NASA Astrophysics Data System (ADS)
Gundreddy, Rohith Reddy; Tan, Maxine; Qui, Yuchen; Zheng, Bin
2015-03-01
The purpose of this study is to develop and test a new content-based image retrieval (CBIR) scheme that achieves higher reproducibility when implemented in an interactive computer-aided diagnosis (CAD) system without significantly reducing lesion classification performance. This is a new Fourier transform based CBIR algorithm that determines the image similarity of two regions of interest (ROI) based on the difference of the average regional image pixel value distributions in the two Fourier transform mapped images under comparison. A reference image database involving 227 ROIs depicting verified soft-tissue breast lesions was used. For each testing ROI, the queried lesion center was systematically shifted from 10 to 50 pixels to simulate inter-user variation in querying a suspicious lesion center when using an interactive CAD system. The lesion classification performance and reproducibility as the queried lesion center shifted were assessed and compared among three CBIR schemes based on the Fourier transform, mutual information and Pearson correlation. Each CBIR scheme retrieved the 10 most similar reference ROIs and computed a likelihood score of the queried ROI depicting a malignant lesion. The experimental results showed that the three CBIR schemes yielded very comparable lesion classification performance as measured by the areas under ROC curves, with p-values greater than 0.498. However, the CBIR scheme using the Fourier transform yielded the highest invariance to both queried lesion center shift and lesion size change. This study demonstrated the feasibility of improving the robustness of interactive CAD systems by adding a new Fourier transform based image feature to CBIR schemes.
Douglas, P K; Harris, Sam; Yuille, Alan; Cohen, Mark S
2011-05-15
Machine learning (ML) has become a popular tool for mining functional neuroimaging data, and there are now hopes of performing such analyses efficiently in real-time. Towards this goal, we compared accuracy of six different ML algorithms applied to neuroimaging data of persons engaged in a bivariate task, asserting their belief or disbelief of a variety of propositional statements. We performed unsupervised dimension reduction and automated feature extraction using independent component (IC) analysis and extracted IC time courses. Optimization of classification hyperparameters across each classifier occurred prior to assessment. Maximum accuracy was achieved at 92% for Random Forest, followed by 91% for AdaBoost, 89% for Naïve Bayes, 87% for a J48 decision tree, 86% for K*, and 84% for support vector machine. For real-time decoding applications, finding a parsimonious subset of diagnostic ICs might be useful. We used a forward search technique to sequentially add ranked ICs to the feature subspace. For the current data set, we determined that approximately six ICs represented a meaningful basis set for classification. We then projected these six IC spatial maps forward onto a later scanning session within subject. We then applied the optimized ML algorithms to these new data instances, and found that classification accuracy results were reproducible. Additionally, we compared our classification method to our previously published general linear model results on this same data set. The highest ranked IC spatial maps show similarity to brain regions associated with contrasts for belief > disbelief, and disbelief < belief. Copyright © 2010 Elsevier Inc. All rights reserved.
Li, Ying; Donohue, Kyna S; Robbins, Christopher B; Pennock, Andrew T; Ellis, Henry B; Nepple, Jeffrey J; Pandya, Nirav; Spence, David D; Willimon, Samuel Clifton; Heyworth, Benton E
2017-09-01
There is a recent trend toward increased surgical treatment of displaced midshaft clavicle fractures in adolescents. The primary purpose of this study was to evaluate the intrarater and interrater reliability of clavicle fracture classification systems and measurements of displacement, shortening, and angulation in adolescents. The secondary purpose was to compare 2 different measurement methods for fracture shortening. This study was performed by a multicenter study group conducting a prospective, comparative, observational cohort study of adolescent clavicle fractures. Eight raters evaluated 24 deidentified anteroposterior clavicle radiographs selected from patients 10-18 years of age with midshaft clavicle fractures. Two clavicle fracture classification systems were used, and 2 measurements for shortening, 1 measurement for superior-inferior displacement, and 2 measurements for fracture angulation were performed. A minimum of 2 weeks after the first round, the process was repeated. Intraclass correlation coefficients were calculated. Good to excellent intrarater and interrater agreement was achieved for the descriptive classification system of fracture displacement, direction of angulation, presence of comminution, and all continuous variables, including both measurements of shortening, superior-inferior displacement, and degrees of angulation. Moderate agreement was achieved for the Arbeitsgemeinschaft für Osteosynthesefragen classification system overall. Mean shortening values obtained by the 2 different methods were significantly different from each other (P < 0.0001). Most radiographic measurements performed by investigators in a multicenter, prospective cohort study of adolescent clavicle fractures demonstrated good-to-excellent intrarater and interrater reliability. Future consensus on the most accurate and clinically appropriate measurement method for fracture shortening is critical.
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data were added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
Morris, Alan; Burgon, Nathan; McGann, Christopher; MacLeod, Robert; Cates, Joshua
2013-01-01
Radiofrequency ablation is a promising procedure for treating atrial fibrillation (AF) that relies on accurate lesion delivery in the left atrial (LA) wall for success. Late Gadolinium Enhancement MRI (LGE MRI) at three months post-ablation has proven effective for noninvasive assessment of the location and extent of scar formation, which are important factors for predicting patient outcome and planning of redo ablation procedures. We have developed an algorithm for automatic classification in LGE MRI of scar tissue in the LA wall and have evaluated accuracy and consistency compared to manual scar classifications by expert observers. Our approach clusters voxels based on normalized intensity and was chosen through a systematic comparison of the performance of multivariate clustering on many combinations of image texture. Algorithm performance was determined by overlap with ground truth, using multiple overlap measures, and the accuracy of the estimation of the total amount of scar in the LA. Ground truth was determined using the STAPLE algorithm, which produces a probabilistic estimate of the true scar classification from multiple expert manual segmentations. Evaluation of the ground truth data set was based on both inter- and intra-observer agreement, with variation among expert classifiers indicating the difficulty of scar classification for a given dataset. Our proposed automatic scar classification algorithm performs well for both scar localization and estimation of scar volume: for ground truth datasets considered easy, variability from the ground truth was low; for those considered difficult, variability from ground truth was on par with the variability across experts. PMID:24236224
Fuzzy support vector machine for microarray imbalanced data classification
NASA Astrophysics Data System (ADS)
Ladayya, Faroh; Purnami, Santi Wulan; Irhamah
2017-11-01
DNA microarrays are data containing gene expression with small sample sizes and a high number of features. Furthermore, class imbalance is a common problem in microarray data. This occurs when a dataset is dominated by a class which has significantly more instances than the other minority classes. Therefore, a classification method is needed that solves the problems of high-dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods that is capable of handling large or small samples, nonlinearity, high dimensionality, over-learning and local minimum issues. SVM has been widely applied to DNA microarray data classification, and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data are a problem because SVM treats all samples with the same importance, so the results are biased against the minority class. To overcome the imbalanced data, Fuzzy SVM (FSVM) is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy memberships, so FSVM can pay more attention to the samples with larger fuzzy membership. Given that DNA microarray data are high dimensional with a very large number of features, it is necessary to do feature selection first using the Fast Correlation based Filter (FCBF). In this study, SVM and FSVM are analyzed both with and without FCBF feature selection, and their classification performance is compared. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.
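One simple way to mimic the fuzzy-membership idea with off-the-shelf tools is to weight each training sample by the rarity of its class and pass the weights to an SVM; scikit-learn's SVC accepts per-sample weights in fit. This is only an approximation of FSVM, which formally reweights the slack variables in the optimization, and a generic univariate filter stands in for FCBF here.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 500))                      # microarray-like: many features, few samples
y = np.r_[np.zeros(100, int), np.ones(20, int)]      # imbalanced classes

# Stand-in for FCBF: keep the k features most informative about the class label.
X_sel = SelectKBest(mutual_info_classif, k=30).fit_transform(X, y)

# Membership-like weights: minority-class samples receive larger weights.
class_freq = np.bincount(y) / len(y)
weights = 1.0 / class_freq[y]

clf = SVC(kernel="rbf", gamma="scale", C=1.0)
clf.fit(X_sel, y, sample_weight=weights)
```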
NASA Astrophysics Data System (ADS)
Perry, Daniel; Morris, Alan; Burgon, Nathan; McGann, Christopher; MacLeod, Robert; Cates, Joshua
2012-03-01
Radiofrequency ablation is a promising procedure for treating atrial fibrillation (AF) that relies on accurate lesion delivery in the left atrial (LA) wall for success. Late Gadolinium Enhancement MRI (LGE MRI) at three months post-ablation has proven effective for noninvasive assessment of the location and extent of scar formation, which are important factors for predicting patient outcome and planning of redo ablation procedures. We have developed an algorithm for automatic classification in LGE MRI of scar tissue in the LA wall and have evaluated accuracy and consistency compared to manual scar classifications by expert observers. Our approach clusters voxels based on normalized intensity and was chosen through a systematic comparison of the performance of multivariate clustering on many combinations of image texture. Algorithm performance was determined by overlap with ground truth, using multiple overlap measures, and the accuracy of the estimation of the total amount of scar in the LA. Ground truth was determined using the STAPLE algorithm, which produces a probabilistic estimate of the true scar classification from multiple expert manual segmentations. Evaluation of the ground truth data set was based on both inter- and intra-observer agreement, with variation among expert classifiers indicating the difficulty of scar classification for a given dataset. Our proposed automatic scar classification algorithm performs well for both scar localization and estimation of scar volume: for ground truth datasets considered easy, variability from the ground truth was low; for those considered difficult, variability from ground truth was on par with the variability across experts.
Automatic analysis and classification of surface electromyography.
Abou-Chadi, F E; Nashar, A; Saad, M
2001-01-01
In this paper, parametric modeling of surface electromyography (SEMG) signals, which facilitates automatic SEMG feature extraction, is combined with artificial neural networks (ANN) to provide an integrated system for the automatic analysis and diagnosis of myopathic disorders. Three ANN paradigms were investigated: the multilayer backpropagation algorithm, the self-organizing feature map algorithm and a probabilistic neural network model. The performance of the three classifiers was compared with that of the traditional Fisher linear discriminant (FLD) classifier. The results show that the three ANN models give higher performance, with the percentage of correct classification reaching 90%. Poorer diagnostic performance was obtained from the FLD classifier. The system presented here indicates that surface EMG, when properly processed, can be used to provide the physician with a diagnostic assist device.
2006-08-01
Nikolas Avouris. Evaluation of classifiers for an uneven class distribution problem. Applied Artificial Intelligence, pages 1-24, 2006. Draft manuscript...data by a hybrid artificial neural network so we may evaluate the classification capabilities of the baseline GRLVQ and our improved GRLVQI. Chapter 4...performance of GRLVQ(I), we compare the results against a baseline classification of the 23-class problem with a hybrid artificial neural network (ANN
Alshamlan, Hala; Badr, Ghada; Alohali, Yousef
2015-01-01
An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on these datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028
Alshamlan, Hala; Badr, Ghada; Alohali, Yousef
2015-01-01
An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on these datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.
NASA Astrophysics Data System (ADS)
Lesniak, J. M.; Hupse, R.; Blanc, R.; Karssemeijer, N.; Székely, G.
2012-08-01
False positive (FP) marks represent an obstacle for effective use of computer-aided detection (CADe) of breast masses in mammography. Typically, the problem can be approached either by developing more discriminative features or by employing different classifier designs. In this paper, the use of support vector machine (SVM) classification for FP reduction in CADe is investigated, presenting a systematic quantitative evaluation against neural networks, k-nearest neighbor classification, linear discriminant analysis and random forests. A large database of 2516 film mammography examinations and 73 input features was used to train the classifiers and evaluate their performance on correctly diagnosed exams as well as false negatives. Further, classifier robustness was investigated using varying training data and feature sets as input. The evaluation was based on the mean exam sensitivity in the range of 0.05-1 FPs on normal exams on the free-response receiver operating characteristic curve (FROC), incorporated into a tenfold cross-validation framework. It was found that SVM classification using a Gaussian kernel offered significantly increased detection performance (P = 0.0002) compared to the reference methods. Varying training data and input features, SVMs showed improved exploitation of large feature sets. It is concluded that with the SVM-based CADe a significant reduction of FPs is possible, outperforming other state-of-the-art approaches for breast mass CADe.
Adebileje, Sikiru Afolabi; Ghasemi, Keyvan; Aiyelabegan, Hammed Tanimowo; Saligheh Rad, Hamidreza
2017-04-01
Proton magnetic resonance spectroscopy is a powerful noninvasive technique that complements the structural images of cMRI, which aids biomedical and clinical research by identifying and visualizing the compositions of various metabolites within the tissues of interest. However, accurate classification of proton magnetic resonance spectroscopy is still a challenging issue in clinics due to the low signal-to-noise ratio, overlapping peaks of metabolites, and the presence of background macromolecules. This paper evaluates the performance of a discriminative dictionary learning classifier based on the projective dictionary pair learning method for the brain glioma proton magnetic resonance spectroscopy spectra classification task, and the results were compared with those of sub-dictionary learning methods. The proton magnetic resonance spectroscopy data contain a total of 150 spectra (74 healthy, 23 grade II, 23 grade III, and 30 grade IV) from two databases. The datasets from both databases were first coupled together, followed by column normalization. The Kennard-Stone algorithm was used to split the datasets into training and test sets. Performance comparison based on the overall accuracy, sensitivity, specificity, and precision was conducted. Based on the overall accuracy of our classification scheme, the dictionary pair learning method was found to outperform the sub-dictionary learning methods, 97.78% compared with 68.89%, respectively. Copyright © 2016 John Wiley & Sons, Ltd.
Park, Myoung-Ok
2017-02-01
[Purpose] The purpose of this study was to determine effects of Gross Motor Function Classification System and Manual Ability Classification System levels on performance-based motor skills of children with spastic cerebral palsy. [Subjects and Methods] Twenty-three children with cerebral palsy were included. The Assessment of Motor and Process Skills was used to evaluate performance-based motor skills in daily life. Gross motor function was assessed using Gross Motor Function Classification Systems, and manual function was measured using the Manual Ability Classification System. [Results] Motor skills in daily activities were significantly different on Gross Motor Function Classification System level and Manual Ability Classification System level. According to the results of multiple regression analysis, children categorized as Gross Motor Function Classification System level III scored lower in terms of performance based motor skills than Gross Motor Function Classification System level I children. Also, when analyzed with respect to Manual Ability Classification System level, level II was lower than level I, and level III was lower than level II in terms of performance based motor skills. [Conclusion] The results of this study indicate that performance-based motor skills differ among children categorized based on Gross Motor Function Classification System and Manual Ability Classification System levels of cerebral palsy.
A new pre-classification method based on associative matching method
NASA Astrophysics Data System (ADS)
Katsuyama, Yutaka; Minagawa, Akihiro; Hotta, Yoshinobu; Omachi, Shinichiro; Kato, Nei
2010-01-01
Reducing the time complexity of character matching is critical to the development of efficient Japanese Optical Character Recognition (OCR) systems. To shorten processing time, recognition is usually split into separate pre-classification and recognition stages. For high overall recognition performance, the pre-classification stage must both have very high classification accuracy and return only a small number of putative character categories for further processing. Furthermore, for any practical system, the speed of the pre-classification stage is also critical. The associative matching (AM) method has often been used for fast pre-classification, because its use of a hash table and reliance solely on logical bit operations to select categories makes it highly efficient. However, a certain level of redundancy exists in the hash table because it is constructed using only the minimum and maximum values of the data on each axis and therefore does not take account of the distribution of the data. We propose a modified associative matching method that satisfies the performance criteria described above but in a fraction of the time by modifying the hash table to reflect the underlying distribution of training characters. Furthermore, we show that our approach outperforms pre-classification by clustering, ANN and conventional AM in terms of classification accuracy, discriminative power and speed. Compared to conventional associative matching, the proposed approach results in a 47% reduction in total processing time across an evaluation test set comprising 116,528 Japanese character images.
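The associative-matching idea, and the redundancy it suffers from, can be illustrated with a toy hash table: on every feature axis a training category occupies all buckets between its minimum and maximum value, and a query intersects the candidate sets of its own buckets. The bucket count and feature ranges below are arbitrary illustrative choices, not the structure used in the paper.

```python
import numpy as np

N_BUCKETS = 16  # buckets per feature axis

def build_tables(features_by_category, lo=0.0, hi=1.0):
    """tables[axis][bucket] -> set of candidate category ids."""
    n_axes = len(next(iter(features_by_category.values()))[0])
    tables = [[set() for _ in range(N_BUCKETS)] for _ in range(n_axes)]
    scale = N_BUCKETS / (hi - lo)
    for cat, samples in features_by_category.items():
        samples = np.asarray(samples)
        for axis in range(n_axes):
            b_min = int((samples[:, axis].min() - lo) * scale)
            b_max = int((samples[:, axis].max() - lo) * scale)
            # Only min/max are used, so every bucket in between is filled (the redundancy).
            for b in range(max(b_min, 0), min(b_max, N_BUCKETS - 1) + 1):
                tables[axis][b].add(cat)
    return tables

def preclassify(tables, x, lo=0.0, hi=1.0):
    """Intersect per-axis candidate sets; the result goes to the fine recognition stage."""
    scale = N_BUCKETS / (hi - lo)
    candidates = None
    for axis, value in enumerate(x):
        b = min(max(int((value - lo) * scale), 0), N_BUCKETS - 1)
        bucket = tables[axis][b]
        candidates = bucket if candidates is None else candidates & bucket
    return candidates

# Toy example with two categories described by 2-D features in [0, 1].
tables = build_tables({"A": [[0.1, 0.2], [0.2, 0.3]], "B": [[0.7, 0.8], [0.9, 0.6]]})
print(preclassify(tables, [0.15, 0.25]))   # -> {'A'}
```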
Estimating local scaling properties for the classification of interstitial lung disease patterns
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel
2011-03-01
Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
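For the baseline GLCM features mentioned above, scikit-image provides ready-made co-occurrence routines; the sketch below extracts a few GLCM properties from a lung ROI and feeds them to a k-NN classifier. Distances, angles and the property list are illustrative, and the study's 'absolute value' feature has no direct scikit-image equivalent, so it is omitted.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # spelled 'greycomatrix' in older scikit-image
from sklearn.neighbors import KNeighborsClassifier

def glcm_features(roi_uint8):
    """GLCM property vector for a 2-D uint8 patch from an HRCT slice."""
    glcm = graycomatrix(roi_uint8, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

def train_knn(rois, labels, k=5):
    """rois: list of uint8 patches; labels: 0 = healthy, 1 = honeycombing."""
    X = np.vstack([glcm_features(r) for r in rois])
    return KNeighborsClassifier(n_neighbors=k).fit(X, labels)
```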
Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.
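Overlapped partitioning can be pictured as follows: the training set is cut into partitions that share a fraction of their samples, one LDA is fitted per partition, and the ensemble decision averages the individual decision scores. The partition count and overlap ratio below are arbitrary, and plain LDA stands in for the stepwise variant used in the study.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def overlapped_partitions(n_samples, n_parts=5, overlap=0.5, seed=0):
    """Yield index arrays of overlapping training partitions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    size = int(np.ceil(n_samples / n_parts * (1 + overlap)))
    step = max(1, int(np.ceil((n_samples - size) / max(n_parts - 1, 1))))
    for start in range(0, n_parts * step, step):
        yield idx[start:start + size]

def fit_ensemble(X, y, **kwargs):
    return [LinearDiscriminantAnalysis().fit(X[p], y[p])
            for p in overlapped_partitions(len(X), **kwargs)]

def ensemble_score(models, X):
    """Average LDA decision scores across the ensemble."""
    return np.mean([m.decision_function(X) for m in models], axis=0)

# Toy binary P300-style example with 900 training epochs of 20 features each.
rng = np.random.default_rng(1)
X = rng.normal(size=(900, 20))
y = rng.integers(0, 2, size=900)
models = fit_ensemble(X, y, n_parts=5, overlap=0.5)
pred = (ensemble_score(models, X) > 0).astype(int)
```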
T-ray relevant frequencies for osteosarcoma classification
NASA Astrophysics Data System (ADS)
Withayachumnankul, W.; Ferguson, B.; Rainsford, T.; Findlay, D.; Mickan, S. P.; Abbott, D.
2006-01-01
We investigate the classification of the T-ray response of normal human bone cells and human osteosarcoma cells grown in culture. Given the magnitude and phase responses within a reliable spectral range as features for the input vectors, a trained support vector machine can correctly classify the two cell types to some extent. The performance of the support vector machine is degraded by the curse of dimensionality, resulting from the comparatively large number of features in the input vectors. Feature subset selection methods are used to select only an optimal number of relevant features for the inputs. As a result, an improvement in generalization performance is attainable, and the selected frequencies can be used to further describe the different mechanisms by which the cells respond to T-rays. We demonstrate a consistent classification accuracy of 89.6%, while only one fifth of the original features are retained in the data set.
Mai, Xiaofeng; Liu, Jie; Wu, Xiong; Zhang, Qun; Guo, Changjian; Yang, Yanfu; Li, Zhaohui
2017-02-06
A Stokes-space modulation format classification (MFC) technique is proposed for coherent optical receivers by using a non-iterative clustering algorithm. In the clustering algorithm, two simple parameters are calculated to help find the density peaks of the data points in Stokes space and no iteration is required. Correct MFC can be realized in numerical simulations among PM-QPSK, PM-8QAM, PM-16QAM, PM-32QAM and PM-64QAM signals within practical optical signal-to-noise ratio (OSNR) ranges. The performance of the proposed MFC algorithm is also compared with those of other schemes based on clustering algorithms. The simulation results show that good classification performance can be achieved using the proposed MFC scheme with moderate time complexity. Proof-of-concept experiments are finally implemented to demonstrate MFC among PM-QPSK/16QAM/64QAM signals, which confirm the feasibility of our proposed MFC scheme.
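In the density-peak formulation of Rodriguez and Laio, the two quantities computed per point are its local density rho and its distance delta to the nearest point of higher density; constellation cluster centres stand out as points where both are large. A NumPy sketch of these two parameters for Stokes-space samples follows, with random points standing in for real signal data and the cut-off distance dc treated as a tuning choice; this is an illustration of the clustering principle, not the proposed MFC scheme itself.

```python
import numpy as np

def density_peak_parameters(points, dc=0.3):
    """Return (rho, delta) for each point: local density and distance to a denser point."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    rho = (d < dc).sum(axis=1) - 1            # neighbours within the cut-off distance
    delta = np.empty(len(points))
    for i in range(len(points)):
        denser = np.where(rho > rho[i])[0]
        delta[i] = d[i, denser].min() if denser.size else d[i].max()
    return rho, delta

# Points with a large product rho*delta are candidate cluster centres; their count in
# Stokes space distinguishes, e.g., PM-QPSK from PM-16QAM constellations.
stokes = np.random.default_rng(0).normal(size=(500, 3))
rho, delta = density_peak_parameters(stokes)
centres = np.argsort(rho * delta)[-8:]
```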
Handwritten digits recognition based on immune network
NASA Astrophysics Data System (ADS)
Li, Yangyang; Wu, Yunhui; Jiao, Lc; Wu, Jianshe
2011-11-01
With the development of society, handwritten digit recognition techniques have been widely applied in production and daily life, yet handwritten digit recognition remains a difficult task in the field of pattern recognition. In this paper, a new method is presented for handwritten digit recognition. The digit samples are first preprocessed and their features extracted. Based on these features, a novel immune network classification algorithm is designed and applied to handwritten digit recognition. The proposed algorithm is developed from Jerne's immune network model for feature selection and the KNN method for classification. Its characteristic is a novel network with parallel computation and learning. The performance of the proposed method is evaluated on the MNIST handwritten digit dataset and compared with other recognition algorithms: KNN, ANN and SVM. The results show that the novel immune-network-based classification algorithm gives promising performance and stable behavior for handwritten digit recognition.
Segmentation and classification of cell cycle phases in fluorescence imaging.
Ersoy, Ilker; Bunyak, Filiz; Chagin, Vadim; Cardoso, M Christina; Palaniappan, Kannappan
2009-01-01
Current chemical biology methods for studying spatiotemporal correlation between biochemical networks and cell cycle phase progression in live-cells typically use fluorescence-based imaging of fusion proteins. Stable cell lines expressing fluorescently tagged protein GFP-PCNA produce rich, dynamically varying sub-cellular foci patterns characterizing the cell cycle phases, including the progress during the S-phase. Variable fluorescence patterns, drastic changes in SNR, shape and position changes and abundance of touching cells require sophisticated algorithms for reliable automatic segmentation and cell cycle classification. We extend the recently proposed graph partitioning active contours (GPAC) for fluorescence-based nucleus segmentation using regional density functions and dramatically improve its efficiency, making it scalable for high content microscopy imaging. We utilize surface shape properties of GFP-PCNA intensity field to obtain descriptors of foci patterns and perform automated cell cycle phase classification, and give quantitative performance by comparing our results to manually labeled data.
Nagarajan, Mahesh B.; Huber, Markus B.; Schlossbauer, Thomas; Leinsinger, Gerda; Krol, Andrzej; Wismüller, Axel
2014-01-01
Objective While dimension reduction has been previously explored in computer aided diagnosis (CADx) as an alternative to feature selection, previous implementations of its integration into CADx do not ensure strict separation between training and test data required for the machine learning task. This compromises the integrity of the independent test set, which serves as the basis for evaluating classifier performance. Methods and Materials We propose, implement and evaluate an improved CADx methodology where strict separation is maintained. This is achieved by subjecting the training data alone to dimension reduction; the test data is subsequently processed with out-of-sample extension methods. Our approach is demonstrated in the research context of classifying small diagnostically challenging lesions annotated on dynamic breast magnetic resonance imaging (MRI) studies. The lesions were dynamically characterized through topological feature vectors derived from Minkowski functionals. These feature vectors were then subject to dimension reduction with different linear and non-linear algorithms applied in conjunction with out-of-sample extension techniques. This was followed by classification through supervised learning with support vector regression. Area under the receiver-operating characteristic curve (AUC) was evaluated as the metric of classifier performance. Results Of the feature vectors investigated, the best performance was observed with Minkowski functional ’perimeter’ while comparable performance was observed with ’area’. Of the dimension reduction algorithms tested with ’perimeter’, the best performance was observed with Sammon’s mapping (0.84 ± 0.10) while comparable performance was achieved with exploratory observation machine (0.82 ± 0.09) and principal component analysis (0.80 ± 0.10). Conclusions The results reported in this study with the proposed CADx methodology present a significant improvement over previous results reported with such small lesions on dynamic breast MRI. In particular, non-linear algorithms for dimension reduction exhibited better classification performance than linear approaches, when integrated into our CADx methodology. We also note that while dimension reduction techniques may not necessarily provide an improvement in classification performance over feature selection, they do allow for a higher degree of feature compaction. PMID:24355697
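The central methodological point, fitting the dimension-reduction step on the training folds only and mapping test folds through an out-of-sample extension, is exactly what a scikit-learn pipeline enforces when a linear method such as PCA is combined with support vector regression inside cross-validation. The sketch below shows that pattern on synthetic feature vectors; it is not the authors' Minkowski-functional feature extraction or their non-linear embeddings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 64))            # stand-in topological feature vectors per lesion
y = rng.integers(0, 2, size=120)          # 0 = benign, 1 = malignant

# PCA is (re)fitted on the training folds only; test folds are projected with transform(),
# i.e. the out-of-sample extension for a linear method, so no test data leaks into training.
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVR(kernel="rbf"))
scores = cross_val_predict(model, X, y, cv=StratifiedKFold(5, shuffle=True, random_state=0))
print("AUC:", roc_auc_score(y, scores))
```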
NASA Technical Reports Server (NTRS)
Brewin, Robert J.W.; Sathyendranath, Shubha; Muller, Dagmar; Brockmann, Carsten; Deschamps, Pierre-Yves; Devred, Emmanuel; Doerffer, Roland; Fomferra, Norman; Franz, Bryan; Grant, Mike;
2013-01-01
Satellite-derived remote-sensing reflectance (Rrs) can be used for mapping biogeochemically relevant variables, such as the chlorophyll concentration and the Inherent Optical Properties (IOPs) of the water, at global scale for use in climate-change studies. Prior to generating such products, suitable algorithms have to be selected that are appropriate for the purpose. Algorithm selection needs to account for both qualitative and quantitative requirements. In this paper we develop an objective methodology designed to rank the quantitative performance of a suite of bio-optical models. The objective classification is applied using the NASA bio-Optical Marine Algorithm Dataset (NOMAD). Using in situ Rrs as input to the models, the performance of eleven semianalytical models, as well as five empirical chlorophyll algorithms and an empirical diffuse attenuation coefficient algorithm, is ranked for spectrally-resolved IOPs, chlorophyll concentration and the diffuse attenuation coefficient at 489 nm. The sensitivity of the objective classification and the uncertainty in the ranking are tested using a Monte-Carlo approach (bootstrapping). Results indicate that the performance of the semi-analytical models varies depending on the product and wavelength of interest. For chlorophyll retrieval, empirical algorithms perform better than semi-analytical models, in general. The performance of these empirical models reflects either their immunity to scale errors or instrument noise in Rrs data, or simply that the data used for model parameterisation were not independent of NOMAD. Nonetheless, uncertainty in the classification suggests that the performance of some semi-analytical algorithms at retrieving chlorophyll is comparable with the empirical algorithms. For phytoplankton absorption at 443 nm, some semi-analytical models also perform with similar accuracy to an empirical model. We discuss the potential biases, limitations and uncertainty in the approach, as well as additional qualitative considerations for algorithm selection for climate-change studies. Our classification has the potential to be routinely implemented, such that the performance of emerging algorithms can be compared with existing algorithms as they become available. In the long-term, such an approach will further aid algorithm development for ocean-colour studies.
NASA Astrophysics Data System (ADS)
Ghaffarian, S.; Ghaffarian, S.
2014-08-01
This paper presents a novel approach to building detection that automates the training area collection stage of supervised classification. The method is based on the fact that a 3D building structure should cast a shadow under suitable imaging conditions. Therefore, the methodology begins with detecting and masking out the shadow areas using the luminance component of the LAB color space, which indicates the lightness of the image, and a novel double thresholding technique. Further, the training areas for supervised classification are selected by automatically determining a buffer zone on each building whose shadow is detected, using the shadow shape and the sun illumination direction. Thereafter, by calculating statistics for each buffer zone collected from the building areas, an improved parallelepiped supervised classification is executed to detect the buildings. Standard deviation thresholding is applied to the parallelepiped classification method to improve its accuracy. Finally, simple morphological operations are conducted to remove noise and increase the accuracy of the results. The experiments were performed on a set of high-resolution Google Earth images. The performance of the proposed approach was assessed by comparing its results with reference data using well-known quality measures (precision, recall and F1-score) to evaluate the pixel-based and object-based performance of the proposed approach. Evaluation of the results illustrates that buildings detected from dense and suburban districts with diverse characteristics and color combinations using our proposed method have 88.4% and 85.3% overall pixel-based and object-based precision, respectively.
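The first stage, isolating shadows from the lightness channel of the LAB colour space with two thresholds, might look roughly like the following scikit-image sketch; the threshold values and the morphological clean-up are placeholders rather than the paper's double-thresholding technique.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.color import rgb2lab
from skimage.morphology import binary_opening, remove_small_objects

def shadow_mask(rgb_image, low=20.0, high=35.0):
    """Boolean shadow mask from an RGB image, using the LAB lightness channel."""
    L = rgb2lab(rgb_image)[:, :, 0]       # lightness in [0, 100]
    strong = L < low                      # definite shadow
    weak = L < high                       # possible shadow
    # Keep weak-shadow regions only where they touch a strong-shadow pixel
    # (a simple stand-in for a double-thresholding rule).
    labels, _ = ndi.label(weak)
    keep = np.unique(labels[strong])
    mask = np.isin(labels, keep[keep > 0])
    return remove_small_objects(binary_opening(mask), min_size=50)
```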
NASA Astrophysics Data System (ADS)
Gibril, Mohamed Barakat A.; Idrees, Mohammed Oludare; Yao, Kouame; Shafri, Helmi Zulhaidi Mohd
2018-01-01
The growing use of optimization for geographic object-based image analysis and the possibility of deriving a wide range of information about the image in textual form make machine learning (data mining) a versatile tool for information extraction from multiple data sources. This paper presents an application of data mining for land-cover classification by fusing SPOT-6, RADARSAT-2, and derived datasets. First, the images and other derived indices (normalized difference vegetation index, normalized difference water index, and soil adjusted vegetation index) were combined and subjected to a segmentation process with optimal segmentation parameters obtained using a combination of spatial and Taguchi statistical optimization. The image objects, which carry all the attributes of the input datasets, were extracted and related to the target land-cover classes through data mining algorithms (decision tree) for classification. To evaluate the performance, the result was compared with two nonparametric classifiers: support vector machine (SVM) and random forest (RF). Furthermore, the decision tree classification result was evaluated against six unoptimized trials segmented using arbitrary parameter combinations. The results show that the optimized process produces better land-use land-cover classification, with an overall classification accuracy of 91.79% for the optimized decision tree compared with 87.25% and 88.69% for SVM and RF, respectively, while the six unoptimized classifications yield overall accuracies between 84.44% and 88.08%. The higher accuracy of the optimized data mining classification approach compared to the unoptimized results indicates that the optimization process has a significant impact on the classification quality.
Laughlin-Tommaso, Shannon K; Hesley, Gina K; Hopkins, Matthew R; Brandt, Kathleen R; Zhu, Yunxiao; Stewart, Elizabeth A
2017-11-01
To determine the reproducibility of classifying uterine fibroids using the 2011 International Federation of Gynecology and Obstetrics (FIGO) staging system. The present retrospective cohort study included patients presenting for the treatment of symptomatic uterine fibroids at the Gynecology Fibroid Clinic at Mayo Clinic, Rochester, USA, between April 1, 2013 and April 1, 2014. Magnetic resonance imaging of fibroid uteri was performed and the images were independently reviewed by two academic gynecologists and two radiologists specializing in fibroid care. Fibroid classifications assigned by each physician were compared and the significance of the variations was graded by whether they would affect surgical planning. There were 42 fibroids from 23 patients; only 6 (14%) fibroids had unanimous classification agreement. The majority (36 [86%]) had at least two unique answers and 4 (10%) fibroids had four unique classifications. Variations in classification were not associated with physician specialty. More than one-third of the classification discrepancies would have impacted surgical planning. FIGO fibroid classification was not consistent among four fibroid specialists. The variation was clinically significant for 36% of the fibroids. Additional validation of the FIGO fibroid classification system is needed. © 2017 International Federation of Gynecology and Obstetrics.
Complex extreme learning machine applications in terahertz pulsed signals feature sets.
Yin, X-X; Hadjiloucas, S; Zhang, Y
2014-11-01
This paper presents a novel approach to the automatic classification of very large data sets composed of terahertz pulse transient signals, highlighting their potential use in biochemical, biomedical, pharmaceutical and security applications. Two different types of THz spectra are considered in the classification process. Firstly a binary classification study of poly-A and poly-C ribonucleic acid samples is performed. This is then contrasted with a difficult multi-class classification problem of spectra from six different powder samples that although have fairly indistinguishable features in the optical spectrum, they also possess a few discernable spectral features in the terahertz part of the spectrum. Classification is performed using a complex-valued extreme learning machine algorithm that takes into account features in both the amplitude as well as the phase of the recorded spectra. Classification speed and accuracy are contrasted with that achieved using a support vector machine classifier. The study systematically compares the classifier performance achieved after adopting different Gaussian kernels when separating amplitude and phase signatures. The two signatures are presented as feature vectors for both training and testing purposes. The study confirms the utility of complex-valued extreme learning machine algorithms for classification of the very large data sets generated with current terahertz imaging spectrometers. The classifier can take into consideration heterogeneous layers within an object as would be required within a tomographic setting and is sufficiently robust to detect patterns hidden inside noisy terahertz data sets. The proposed study opens up the opportunity for the establishment of complex-valued extreme learning machine algorithms as new chemometric tools that will assist the wider proliferation of terahertz sensing technology for chemical sensing, quality control, security screening and clinic diagnosis. Furthermore, the proposed algorithm should also be very useful in other applications requiring the classification of very large datasets. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Inferring Human Activity Recognition with Ambient Sound on Wireless Sensor Nodes.
Salomons, Etto L; Havinga, Paul J M; van Leeuwen, Henk
2016-09-27
A wireless sensor network that consists of nodes with a sound sensor can be used to obtain context awareness in home environments. However, the limited processing power of wireless nodes offers a challenge when extracting features from the signal, and subsequently, classifying the source. Although multiple papers can be found on different methods of sound classification, none of these are aimed at limited hardware or take the efficiency of the algorithms into account. In this paper, we compare and evaluate several classification methods on a real sensor platform using different feature types and classifiers, in order to find an approach that results in a good classifier that can run on limited hardware. To be as realistic as possible, we trained our classifiers using sound waves from many different sources. We conclude that despite the fact that the classifiers are often of low quality due to the highly restricted hardware resources, sufficient performance can be achieved when (1) the window length for our classifiers is increased, and (2) if we apply a two-step approach that uses a refined classification after a global classification has been performed.
NASA Astrophysics Data System (ADS)
Adi Putra, Januar
2018-04-01
In this paper, we propose a new mammogram classification scheme to classify breast tissue as normal or abnormal. A feature matrix is generated by applying the Local Binary Pattern to all the detail coefficients from the 2D-DWT of the region of interest (ROI) of a mammogram. Feature selection is done by selecting the relevant features that affect the classification; it is used to reduce the dimensionality of the data and remove features that are not relevant. In this paper, the F-test and T-test are applied to the extracted features to select the relevant ones. The best features are used in a neural network classifier for classification. In this research we use the MIAS and DDSM databases. In addition to the suggested scheme, competing schemes are also simulated for comparative analysis. It is observed that the proposed scheme performs better with respect to accuracy, specificity and sensitivity. Based on the experiments, the proposed scheme can achieve an accuracy as high as 92.71%, while the lowest accuracy obtained is 77.08%.
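A compact illustration of the feature-extraction path described above, combining a 2D discrete wavelet transform (PyWavelets) with local binary patterns (scikit-image) on each detail sub-band; the wavelet choice, LBP parameters and histogram size are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
import pywt
from skimage.feature import local_binary_pattern

def dwt_lbp_features(roi, wavelet="db4", radius=1, n_points=8):
    """LBP histograms of the three detail sub-bands of a mammogram ROI."""
    _, (cH, cV, cD) = pywt.dwt2(roi.astype(float), wavelet)
    feats = []
    for band in (cH, cV, cD):
        lbp = local_binary_pattern(band, n_points, radius, method="uniform")
        hist, _ = np.histogram(lbp, bins=n_points + 2, range=(0, n_points + 2), density=True)
        feats.append(hist)
    return np.concatenate(feats)

# Example: a 128x128 ROI; the resulting vector would then go through F-test/T-test
# selection and a neural-network classifier as described above.
roi = np.random.rand(128, 128)
features = dwt_lbp_features(roi)   # length 3 * (n_points + 2) = 30
```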
Classifying machinery condition using oil samples and binary logistic regression
NASA Astrophysics Data System (ADS)
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding, and hence some trust, in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on the predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
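A minimal sketch of the interpretable logistic-regression classifier described above; the oil-analysis features, labels and column meanings are hypothetical stand-ins for the mining-truck data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical oil-analysis features (e.g. wear-metal concentrations in ppm)
# and binary engine condition labels (1 = not healthy). Column meanings are
# illustrative only.
rng = np.random.default_rng(2)
X = rng.gamma(2.0, 5.0, size=(200, 4))            # e.g. Fe, Cu, Si, viscosity change
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 3, 200) > 15).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression())
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())

# Interpretability: fitted coefficients translate directly into odds ratios.
model.fit(X, y)
print("odds ratios:", np.exp(model.named_steps['logisticregression'].coef_))
```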
NASA Astrophysics Data System (ADS)
Hu, Yan-Yan; Li, Dong-Sheng
2016-01-01
Hyperspectral images (HSI) consist of many closely spaced bands that carry most of the object information. Due to their high dimensionality and volume, however, it is hard to obtain satisfactory classification performance. To reduce the dimensionality of HSI data in preparation for high classification accuracy, it is proposed to combine a band selection method based on artificial immune systems (AIS) with a hybrid-kernel support vector machine (SVM-HK) algorithm. After comparing different kernels for hyperspectral analysis, the approach mixes the radial basis function kernel (RBF-K) with the sigmoid kernel (Sig-K) and applies the optimized hybrid kernels in SVM classifiers. The SVM-HK algorithm is then used to guide the band selection of an improved version of AIS. The AIS is composed of clonal selection and elite antibody mutation, including an evaluation process with an optional index factor (OIF). Experiments on a San Diego Naval Base hyperspectral dataset acquired by AVIRIS show that the method is able to efficiently remove band redundancy while outperforming the traditional SVM classifier.
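The hybrid-kernel idea (mixing an RBF kernel with a sigmoid kernel inside an SVM) can be sketched as follows; the mixing weight, kernel parameters and data are illustrative assumptions, and the AIS band-selection step is not reproduced here. Note that a sigmoid component can make the combined kernel indefinite, which a full implementation would need to handle.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, sigmoid_kernel

# A convex combination of RBF and sigmoid kernels; the mixing weight and
# kernel parameters are illustrative, not the values optimized in the paper.
def hybrid_kernel(X, Y, w=0.7, gamma=0.5):
    return w * rbf_kernel(X, Y, gamma=gamma) + (1 - w) * sigmoid_kernel(X, Y, gamma=0.01, coef0=0.0)

rng = np.random.default_rng(3)
X = rng.random((100, 30))                  # e.g. pixels x selected bands
y = rng.integers(0, 3, 100)                # hypothetical class labels

clf = SVC(kernel=hybrid_kernel).fit(X, y)
print(clf.score(X, y))
```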
Voting strategy for artifact reduction in digital breast tomosynthesis.
Wu, Tao; Moore, Richard H; Kopans, Daniel B
2006-07-01
Artifacts are observed in digital breast tomosynthesis (DBT) reconstructions due to the small number of projections and the narrow angular range that are typically employed in tomosynthesis imaging. In this work, we investigate the reconstruction artifacts that are caused by high-attenuation features in breast and develop several artifact reduction methods based on a "voting strategy." The voting strategy identifies the projection(s) that would introduce artifacts to a voxel and rejects the projection(s) when reconstructing the voxel. Four approaches to the voting strategy were compared, including projection segmentation, maximum contribution deduction, one-step classification, and iterative classification. The projection segmentation method, based on segmentation of high-attenuation features from the projections, effectively reduces artifacts caused by metal and large calcifications that can be reliably detected and segmented from projections. The other three methods are based on the observation that contributions from artifact-inducing projections have higher value than those from normal projections. These methods attempt to identify the projection(s) that would cause artifacts by comparing contributions from different projections. Among the three methods, the iterative classification method provides the best artifact reduction; however, it can generate many false positive classifications that degrade the image quality. The maximum contribution deduction method and one-step classification method both reduce artifacts well from small calcifications, although the performance of artifact reduction is slightly better with the one-step classification. The combination of one-step classification and projection segmentation removes artifacts from both large and small calcifications.
Hierarchical trie packet classification algorithm based on expectation-maximization clustering
Bi, Xia-an; Zhao, Junxia
2017-01-01
With the growth of computer network bandwidth, packet classification algorithms which are able to deal with large-scale rule sets are in urgent need. Among the existing algorithms, research on packet classification algorithms based on hierarchical tries has become an important branch of packet classification research because of its wide practical use. Although the hierarchical trie is beneficial for saving large amounts of storage space, it has several shortcomings such as backtracking and empty nodes. This paper proposes a new packet classification algorithm, the Hierarchical Trie Algorithm Based on Expectation-Maximization Clustering (HTEMC). Firstly, this paper uses a formalization method to deal with the packet classification problem by mapping the rules and data packets into a two-dimensional space. Secondly, this paper uses the expectation-maximization algorithm to cluster the rules based on their aggregate characteristics, thereby forming diversified clusters. Thirdly, this paper proposes a hierarchical trie based on the results of the expectation-maximization clustering. Finally, this paper conducts simulation experiments and real-environment experiments to compare the performance of our algorithm with other typical algorithms, and analyzes the results of the experiments. The hierarchical trie structure in our algorithm not only adopts trie path compression to eliminate backtracking, but also solves the problem of low efficiency of trie updates, which greatly improves the performance of the algorithm.
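The clustering step of the approach (EM clustering of rules mapped into a two-dimensional space) could look roughly like the sketch below, where the 2-D rule mapping is a simplified stand-in for the paper's formalization and scikit-learn's GaussianMixture provides the expectation-maximization fitting.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Rules mapped to a two-dimensional space (e.g. the midpoints of their source
# and destination address ranges, scaled to [0, 1]); this mapping is a
# simplified stand-in for the paper's formalization.
rng = np.random.default_rng(4)
rules_2d = np.vstack([rng.normal(c, 0.05, size=(50, 2))
                      for c in ((0.2, 0.2), (0.5, 0.8), (0.8, 0.4))])

em = GaussianMixture(n_components=3, random_state=0).fit(rules_2d)
cluster_ids = em.predict(rules_2d)

# Each cluster would then be built into its own (path-compressed) hierarchical trie.
for k in range(3):
    print(f"cluster {k}: {np.sum(cluster_ids == k)} rules")
```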
Influence of nuclei segmentation on breast cancer malignancy classification
NASA Astrophysics Data System (ADS)
Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam
2009-02-01
Breast Cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss a role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to a good feature extraction with a lowest average error rate of 6.51% over four different classifiers. The best performance was recorded for multilayer perceptron with an error rate of 3.07% using fuzzy c-means segmentation.
NASA Astrophysics Data System (ADS)
Javidnia, Katayoun; Parish, Maryam; Karimi, Sadegh; Hemmateenejad, Bahram
2013-03-01
By using FT-IR spectroscopy, many researchers from different disciplines enrich the experimental complexity of their research to obtain more precise information. Moreover, chemometric techniques have boosted the use of IR instruments. In the present study we aimed to emphasize the power of FT-IR spectroscopy for discrimination between different oil samples (especially fat from vegetable oils). Our data were also used to compare the performance of different classification methods. FT-IR transmittance spectra of oil samples (Corn, Canola, Sunflower, Soya, Olive, and Butter) were measured in the wave-number interval of 450-4000 cm-1. Classification analysis was performed utilizing PLS-DA, interval PLS-DA, extended canonical variate analysis (ECVA) and interval ECVA methods. The effect of data preprocessing by extended multiplicative signal correction was investigated. While all employed methods could distinguish butter from vegetable oils, iECVA resulted in the best performance for calibration and the external test set, with 100% sensitivity and specificity.
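A PLS-DA sketch in the spirit of the analysis above: PLS regression on one-hot class indicators, with the predicted class taken as the largest response. The spectra, class labels and number of latent variables are synthetic placeholders, and the interval-PLS-DA and ECVA variants are not reproduced.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Synthetic stand-ins for FT-IR transmittance spectra and oil classes.
rng = np.random.default_rng(5)
X = rng.random((60, 500))                     # 60 spectra x 500 wavenumbers
y = rng.integers(0, 3, 60)                    # e.g. corn / olive / butter
Y = np.eye(3)[y]                              # one-hot class indicators

pls = PLSRegression(n_components=8).fit(X, Y) # number of latent variables is illustrative
pred = np.argmax(pls.predict(X), axis=1)
print("training accuracy:", (pred == y).mean())
```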
An attention-based effective neural model for drug-drug interactions extraction.
Zheng, Wei; Lin, Hongfei; Luo, Ling; Zhao, Zhehuan; Li, Zhengguang; Zhang, Yijia; Yang, Zhihao; Wang, Jian
2017-10-10
Drug-drug interactions (DDIs) often bring unexpected side effects. The clinical recognition of DDIs is a crucial issue for both patient safety and healthcare cost control. However, although text-mining-based systems explore various methods to classify DDIs, the classification performance with regard to DDIs in long and complex sentences is still unsatisfactory. In this study, we propose an effective model that classifies DDIs from the literature by combining an attention mechanism and a recurrent neural network with long short-term memory (LSTM) units. In our approach, first, a candidate-drug-oriented input attention acting on word-embedding vectors automatically learns which words are more influential for a given drug pair. Next, the inputs merging the position- and POS-embedding vectors are passed to a bidirectional LSTM layer whose outputs at the last time step represent the high-level semantic information of the whole sentence. Finally, a softmax layer performs DDI classification. Experimental results from the DDIExtraction 2013 corpus show that our system performs the best with respect to detection and classification (84.0% and 77.3%, respectively) compared with other state-of-the-art methods. In particular, for the Medline-2013 dataset with long and complex sentences, our F-score far exceeds those of top-ranking systems by 12.6%. Our approach effectively improves the performance of DDI classification tasks. Experimental analysis demonstrates that our model performs better with respect to recognizing not only close-range but also long-range patterns among words, especially for long, complex and compound sentences.
Multi-label spacecraft electrical signal classification method based on DBN and random forest
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
Spacecraft electrical signal characteristic data contain a large amount of data with high-dimensional features, high computational complexity, and low identification rates, which causes great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method based on deep belief networks (DBN) and a classification method based on the random forest (RF) algorithm. The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then classification is applied. Firstly, wavelet denoising is used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the classification rate for the electrical characteristics data. Finally, the random forest algorithm is used to classify the data and is compared with other algorithms. The experimental results show that, compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data.
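A shallow stand-in for the reduce-then-classify pipeline described above, with a single restricted Boltzmann machine layer in place of the deep belief network and a random forest as the final classifier; the data and hyperparameters are placeholder assumptions.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic placeholders for (denoised) electrical signal features and fault labels.
rng = np.random.default_rng(6)
X, y = rng.random((300, 128)), rng.integers(0, 4, 300)

model = make_pipeline(
    MinMaxScaler(),                                  # RBMs expect inputs in [0, 1]
    BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(X, y)
print(model.score(X, y))
```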
Using reconstructed IVUS images for coronary plaque classification.
Caballero, Karla L; Barajas, Joel; Pujol, Oriol; Rodriguez, Oriol; Radeva, Petia
2007-01-01
Coronary plaque rupture is one of the principal causes of sudden death in western societies. Reliable diagnosis of the different plaque types is of great interest to the medical community for predicting their evolution and applying an effective treatment. To achieve this, a tissue classification must be performed. Intravascular Ultrasound (IVUS) represents a technique to explore the vessel walls and to observe their histological properties. In this paper, a method to reconstruct IVUS images from the raw Radio Frequency (RF) data coming from the ultrasound catheter is proposed. This framework offers a normalization scheme to compare different patient studies accurately. The automatic tissue classification is based on texture analysis and the Adaptive Boosting (AdaBoost) learning technique combined with Error Correcting Output Codes (ECOC). In this study, 9 in-vivo cases are reconstructed with 7 different parameter sets. This method improves the classification rate based on images, yielding 91% well-detected tissue using the best parameter set. It also reduces the inter-patient variability compared with the analysis of DICOM images obtained from the commercial equipment.
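The boosting-plus-ECOC classification stage can be sketched with scikit-learn as below; the texture features and tissue labels are random placeholders for those computed from the reconstructed IVUS images.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.multiclass import OutputCodeClassifier

# AdaBoost combined with Error Correcting Output Codes for multi-class tissue
# labelling; features are random placeholders for IVUS texture descriptors.
rng = np.random.default_rng(7)
X, y = rng.random((150, 40)), rng.integers(0, 3, 150)   # 3 hypothetical tissue types

ecoc = OutputCodeClassifier(AdaBoostClassifier(n_estimators=100),
                            code_size=2.0, random_state=0)
ecoc.fit(X, y)
print(ecoc.score(X, y))
```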
CW-SSIM kernel based random forest for image classification
NASA Astrophysics Data System (ADS)
Fan, Guangzhe; Wang, Zhou; Wang, Jiheng
2010-07-01
The complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use hand-written digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and inner product kernels. Empirical evidence shows that the proposed method is superior in its classification power. We also compared our proposed approach with the direct random forest method without a kernel and the popular kernel-learning method, the support vector machine. Our test results based on both simulated and real-world data suggest that the proposed approach is superior to traditional methods without the feature selection procedure.
A Comparison of Two Scoring Methods for an Automated Speech Scoring System
ERIC Educational Resources Information Center
Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David
2012-01-01
This paper compares two alternative scoring methods--multiple regression and classification trees--for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models…
Betthauser, Joseph L; Hunt, Christopher L; Osborn, Luke E; Masters, Matthew R; Levay, Gyorgy; Kaliki, Rahul R; Thakor, Nitish V
2018-04-01
Myoelectric signals can be used to predict the intended movements of an amputee for prosthesis control. However, untrained effects like limb position changes influence myoelectric signal characteristics, hindering the ability of pattern recognition algorithms to discriminate among motion classes. Despite frequent and long training sessions, these deleterious conditional influences may result in poor performance and device abandonment. We present a robust sparsity-based adaptive classification method that is significantly less sensitive to signal deviations resulting from untrained conditions. We compare this approach in the offline and online contexts of untrained upper-limb positions for amputee and able-bodied subjects to demonstrate its robustness against other myoelectric classification methods. We report significant performance improvements in untrained limb positions across all subject groups. The robustness of our suggested approach helps to ensure better untrained-condition performance from fewer training conditions. This method of prosthesis control has the potential to deliver real-world clinical benefits to amputees: better condition-tolerant performance, reduced training burden in terms of frequency and duration, and increased adoption of myoelectric prostheses.
NASA Technical Reports Server (NTRS)
Matic, Roy M.; Mosley, Judith I.
1994-01-01
Future space-based, remote sensing systems will have data transmission requirements that exceed available downlinks, necessitating the use of lossy compression techniques for multispectral data. In this paper, we describe several algorithms for lossy compression of multispectral data which combine spectral decorrelation techniques with an adaptive, wavelet-based, image compression algorithm to exploit both spectral and spatial correlation. We compare the performance of several different spectral decorrelation techniques including wavelet transformation in the spectral dimension. The performance of each technique is evaluated at compression ratios ranging from 4:1 to 16:1. Performance measures used are visual examination, conventional distortion measures, and multispectral classification results. We also introduce a family of distortion metrics that are designed to quantify and predict the effect of compression artifacts on multispectral classification of the reconstructed data.
Computer-aided interpretation approach for optical tomographic images
NASA Astrophysics Data System (ADS)
Klose, Christian D.; Klose, Alexander D.; Netz, Uwe J.; Scheel, Alexander K.; Beuthan, Jürgen; Hielscher, Andreas H.
2010-11-01
A computer-aided interpretation approach is proposed to detect rheumatoid arthritis (RA) in human finger joints using optical tomographic images. The image interpretation method employs a classification algorithm that makes use of a so-called self-organizing mapping scheme to classify fingers as either affected or unaffected by RA. Unlike in previous studies, this allows for combining multiple image features, such as minimum and maximum values of the absorption coefficient, for identifying affected and unaffected joints. Classification performances obtained by the proposed method were evaluated in terms of sensitivity, specificity, Youden index, and mutual information. Different methods (i.e., clinical diagnostics, ultrasound imaging, magnetic resonance imaging, and inspection of optical tomographic images) were used to produce ground truth benchmarks to determine the performance of image interpretations. Using data from 100 finger joints, findings suggest that some parameter combinations lead to higher sensitivities, while others lead to higher specificities when compared to single parameter classifications employed in previous studies. Maximum performances are reached when combining the minimum/maximum ratio of the absorption coefficient and image variance. In this case, sensitivities and specificities over 0.9 can be achieved. These values are much higher than values obtained when only single parameter classifications were used, where sensitivities and specificities remained well below 0.8.
Al-Sahaf, Harith; Zhang, Mengjie; Johnston, Mark
2016-01-01
In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.
Deployment and Performance of the NASA D3R During the GPM OLYMPEx Field Campaign
NASA Technical Reports Server (NTRS)
Chandrasekar, V.; Beauchamp, Robert M.; Chen, Haonan; Vega, Manuel; Schwaller, Mathew; Willie, Delbert; Dabrowski, Aaron; Kumar, Mohit; Petersen, Walter; Wolff, David
2016-01-01
The NASA D3R was successfully deployed and operated throughout the NASA OLYMPEx field campaign. A differential phase based attenuation correction technique has been implemented for D3R observations. Hydrometeor classification has been demonstrated for five distinct classes using Ku-band observations of both convection and stratiform rain. The stratiform rain hydrometeor classification is compared against LDR observations and shows good agreement in identification of mixed-phase hydrometeors in the melting layer.
Question analysis for Indonesian comparative question
NASA Astrophysics Data System (ADS)
Saelan, A.; Purwarianti, A.; Widyantoro, D. H.
2017-01-01
Information seeking is one of today's human needs. Comparing things using a search engine surely takes more time than searching for a single thing. In this paper, we analyze comparative questions for a comparative question answering system. A comparative question is a question that compares two or more entities. We grouped comparative questions into 5 types: selection between mentioned entities, selection between unmentioned entities, selection between any entity, comparison, and yes-or-no questions. We then extracted 4 types of information from comparative questions: entity, aspect, comparison, and constraint. We built classifiers for the classification task and the information extraction task. The features used for the classification task are bag of words, whereas for information extraction we used the lexical form of the word, the lexical forms of the 2 previous and following words, and the previous label as features. We tried 2 scenarios: classification first and extraction first. For classification first, we used the classification result as a feature for extraction. Conversely, for extraction first, we used the extraction results as features for classification. We found that the result is better if we do extraction first, before classification. For the extraction task, SMO gave the best result (88.78%), while for classification it is better to use naïve Bayes (82.35%).
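The classification stage (bag-of-words features with a naïve Bayes classifier) might be sketched as follows; the toy English questions and five type labels are placeholders for the Indonesian data, and the information-extraction stage is not shown.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy comparative questions and hypothetical type labels.
questions = [
    "which is better, laptop A or laptop B",
    "compare the camera of phone X and phone Y",
    "is city A hotter than city B",
    "what is the cheapest phone under 2 million",
    "which laptop has the longest battery life",
    "is laptop A lighter than laptop B",
]
labels = ["selection_mentioned", "comparison", "yes_no",
          "selection_unmentioned", "selection_any", "yes_no"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())   # bag of words + naive Bayes
clf.fit(questions, labels)
print(clf.predict(["which phone is better, X or Y"]))
```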
Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa
2015-11-03
Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority of one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect of sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction on real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, whereas the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.
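A sketch of the LR-versus-LDA comparison using classification error on mixed categorical and continuous predictors; the data are synthetic and the B and Q indices used in the study are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Synthetic mixed predictors: continuous covariates plus binary covariates.
rng = np.random.default_rng(8)
X_cont = rng.normal(size=(243, 3))
X_cat = rng.integers(0, 2, size=(243, 2))
X = np.hstack([X_cont, X_cat])
y = (X[:, 0] + X[:, 3] + rng.normal(0, 1, 243) > 0.5).astype(int)

for name, model in [("LR", LogisticRegression(max_iter=1000)),
                    ("LDA", LinearDiscriminantAnalysis())]:
    ce = 1 - cross_val_score(model, X, y, cv=5).mean()   # classification error
    print(name, "classification error:", round(ce, 3))
```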
Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech
Cao, Houwei; Verma, Ragini; Nenkova, Ani
2015-01-01
We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional approaches. We train ranking SVMs for individual emotions, treating the data from each speaker as a separate query, and combine the predictions from all rankers to perform multi-class prediction. The ranking method provides two natural benefits. It captures speaker specific information even in speaker-independent training/testing conditions. It also incorporates the intuition that each utterance can express a mix of possible emotion and that considering the degree to which each emotion is expressed can be productively exploited to identify the dominant emotion. We compare the performance of the rankers and their combination to standard SVM classification approaches on two publicly available datasets of acted emotional speech, Berlin and LDC, as well as on spontaneous emotional data from the FAU Aibo dataset. On acted data, ranking approaches exhibit significantly better performance compared to SVM classification both in distinguishing a specific emotion from all others and in multi-class prediction. On the spontaneous data, which contains mostly neutral utterances with a relatively small portion of less intense emotional utterances, ranking-based classifiers again achieve much higher precision in identifying emotional utterances than conventional SVM classifiers. In addition, we discuss the complementarity of conventional SVM and ranking-based classifiers. On all three datasets we find dramatically higher accuracy for the test items on whose prediction the two methods agree compared to the accuracy of individual methods. Furthermore on the spontaneous data the ranking and standard classification are complementary and we obtain marked improvement when we combine the two classifiers by late-stage fusion.
Transfer Kernel Common Spatial Patterns for Motor Imagery Brain-Computer Interface Classification.
Dai, Mengxi; Zheng, Dezhi; Liu, Shucong; Zhang, Pengju
2018-01-01
Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as a preprocessing step before classification. The CSP method is a supervised algorithm and therefore requires a large amount of time-consuming training data to build the model. To address this issue, one promising approach is transfer learning, which generalizes a learning model so that it can extract discriminative information from other subjects for the target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach to learn a domain-invariant kernel by directly matching the distributions of source subjects and target subjects. Dataset IVa of BCI Competition III is used to demonstrate the validity of the proposed method. In the experiment, we compare the classification performance of TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularizing CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that TKCSP achieves a superior mean classification performance of 81.14%, especially in the case of source subjects with a small number of training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods.
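For context, the plain CSP preprocessing step that TKCSP extends can be sketched as follows; trials, channel counts and the number of retained spatial filters are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import eigh

# Synthetic EEG trials: (n_trials, n_channels, n_samples) with binary class labels.
rng = np.random.default_rng(9)
X, y = rng.normal(size=(40, 8, 250)), rng.integers(0, 2, 40)

def class_cov(trials):
    # Average trace-normalized spatial covariance over the trials of one class.
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
    return np.mean(covs, axis=0)

C0, C1 = class_cov(X[y == 0]), class_cov(X[y == 1])
_, W = eigh(C0, C0 + C1)                           # generalized eigenproblem
filters = np.hstack([W[:, :2], W[:, -2:]])         # most discriminative spatial filters

def csp_features(trial):
    z = filters.T @ trial
    var = np.var(z, axis=1)
    return np.log(var / var.sum())                 # log-variance features

feats = np.array([csp_features(t) for t in X])
print(feats.shape)   # (40, 4) feature vectors ready for a classifier
```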
Xu, Kele; Feng, Dawei; Mi, Haibo
2017-11-23
The automatic detection of diabetic retinopathy is of vital importance, as it is the main cause of irreversible vision loss in the working-age population in the developed world. The early detection of diabetic retinopathy occurrence can be very helpful for clinical treatment; although several different feature extraction approaches have been proposed, the classification task for retinal images is still tedious even for trained clinicians. Recently, deep convolutional neural networks have manifested superior performance in image classification compared to previous handcrafted feature-based image classification methods. Thus, in this paper, we explored the use of deep convolutional neural network methodology for the automatic classification of diabetic retinopathy using color fundus images, and obtained an accuracy of 94.5% on our dataset, outperforming the results obtained by using classical approaches.
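A small convolutional network in the spirit of the approach described; the architecture, input size and binary output are illustrative assumptions rather than the network used in the paper.

```python
import tensorflow as tf

# Minimal CNN sketch for fundus-image classification; all sizes are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),          # color fundus image
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),       # retinopathy vs. no retinopathy
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, epochs=10)  # with a labelled fundus dataset
```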
Hyperspectral Image Classification With Markov Random Fields and a Convolutional Neural Network
NASA Astrophysics Data System (ADS)
Cao, Xiangyong; Zhou, Feng; Xu, Lin; Meng, Deyu; Xu, Zongben; Paisley, John
2018-05-01
This paper presents a new supervised classification algorithm for remotely sensed hyperspectral images (HSI) which integrates spectral and spatial information in a unified Bayesian framework. First, we formulate the HSI classification problem from a Bayesian perspective. Then, we adopt a convolutional neural network (CNN) to learn the posterior class distributions using a patch-wise training strategy to better use the spatial information. Next, spatial information is further considered by placing a spatial smoothness prior on the labels. Finally, we iteratively update the CNN parameters using stochastic gradient descent (SGD) and update the class labels of all pixel vectors using an alpha-expansion min-cut-based algorithm. Compared with other state-of-the-art methods, the proposed classification method achieves better performance on one synthetic dataset and two benchmark HSI datasets in a number of experimental settings.
Employing wavelet-based texture features in ammunition classification
NASA Astrophysics Data System (ADS)
Borzino, Ángelo M. C. R.; Maher, Robert C.; Apolinário, José A.; de Campos, Marcello L. R.
2017-05-01
Pattern recognition, a branch of machine learning, involves classification of information in images, sounds, and other digital representations. This paper uses pattern recognition to identify which kind of ammunition was used when a bullet was fired, based on a carefully constructed set of gunshot sound recordings. To do this, we show that texture features obtained from the wavelet transform of a component of the gunshot signal, treated as an image and quantized in gray levels, are good ammunition discriminators. We test the technique with eight different calibers and achieve a classification rate better than 95%. We also compare the performance of the proposed method with results obtained by standard temporal and spectrographic techniques.
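The wavelet-plus-texture feature idea might be sketched as below: a 2-D representation of the gunshot signal is wavelet-decomposed, one sub-band is quantized to a few gray levels, and co-occurrence statistics are extracted. The 2-D input here is a random placeholder, and the graycomatrix naming assumes scikit-image 0.19 or later.

```python
import numpy as np, pywt
from skimage.feature import graycomatrix, graycoprops

# Placeholder for the 2-D component of the gunshot signal used in the paper.
rng = np.random.default_rng(10)
signal_image = rng.random((128, 128))

_, (cH, _, _) = pywt.dwt2(signal_image, 'haar')       # one detail sub-band
levels = 16
q = np.uint8((cH - cH.min()) / (np.ptp(cH) + 1e-12) * (levels - 1))   # gray-level quantization

glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2], levels=levels,
                    symmetric=True, normed=True)
features = [graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")]
print(features)   # texture features that would feed a caliber classifier
```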
Classifying EEG for Brain-Computer Interface: Learning Optimal Filters for Dynamical System Features
Song, Le; Epps, Julien
2007-01-01
Classification of multichannel EEG recordings during motor imagination has been exploited successfully for brain-computer interfaces (BCI). In this paper, we consider EEG signals as the outputs of a networked dynamical system (the cortex), and exploit synchronization features from the dynamical system for classification. Herein, we also propose a new framework for learning optimal filters automatically from the data, by employing a Fisher ratio criterion. Experimental evaluations comparing the proposed dynamical system features with the CSP and the AR features reveal their competitive performance during classification. Results also show the benefits of employing the spatial and the temporal filters optimized using the proposed learning approach. PMID:18364986
NASA Astrophysics Data System (ADS)
Wutsqa, D. U.; Marwah, M.
2017-06-01
In this paper, we apply a spatial median filter to reduce the noise in cervical images obtained with a colposcopy tool. The backpropagation neural network (BPNN) model is applied to the colposcopy images to classify cervical cancer. The classification process requires image feature extraction using a gray level co-occurrence matrix (GLCM) method to obtain image features that are used as inputs of the BPNN model. The advantage of noise reduction is evaluated by comparing the performance of BPNN models with and without the spatial median filter. The experimental results show that the spatial median filter can improve the accuracy of the BPNN model for cervical cancer classification.
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images is one of the most difficult tasks in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied to dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary, which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.
Hartman, Esther A R; van Royen-Kerkhof, Annet; Jacobs, Johannes W G; Welsing, Paco M J; Fritsch-Stork, Ruth D E
2018-03-01
To evaluate the performance in classifying systemic lupus erythematosus of the 2012 Systemic Lupus International Collaborating Clinics criteria (SLICC'12) versus the revised American College of Rheumatology criteria from 1997 (ACR'97) in adult and juvenile SLE patients. A systematic literature search was conducted in PubMed and Embase for studies comparing SLICC'12 and ACR'97 with clinical diagnosis. A meta-analysis was performed to estimate the sensitivity and specificity of SLICC'12 and ACR'97. To assess classification earlier in the disease by either set, sensitivity and specificity were compared for patients with disease duration <5 years. Sensitivity and specificity of individual criteria items were also assessed. In adult SLE (nine studies: 5236 patients, 1313 controls), SLICC'12 has higher sensitivity (94.6% vs. 89.6%) and similar specificity (95.5% vs. 98.1%) compared to ACR'97. For juvenile SLE (four studies: 568 patients, 339 controls), SLICC'12 demonstrates higher sensitivity (99.9% vs. 84.3%) than ACR'97, but much lower specificity (82.0% vs. 94.1%). SLICC'12 classifies juvenile SLE patients earlier in the disease course. Individual items contributing to diagnostic accuracy are low complement, anti-dsDNA and acute cutaneous lupus in SLICC'12, and the immunologic and hematologic disorder criteria in ACR'97. Based on sensitivity and specificity, SLICC'12 is best for adult SLE. Following the view that higher specificity, i.e. avoidance of false positives, is preferable, ACR'97 is best for juvenile SLE even if associated with lower sensitivity. Our results on the contribution of the individual items of SLICC'12 and ACR'97 may be of value in future efforts to update classification criteria.
Multiple-instance ensemble learning for hyperspectral images
NASA Astrophysics Data System (ADS)
Ergul, Ugur; Bilgin, Gokhan
2017-10-01
An ensemble framework for multiple-instance (MI) learning (MIL) is introduced for use in hyperspectral images (HSIs), inspired by the bagging (bootstrap aggregation) method in ensemble learning. Ensemble-based bagging is performed with a small percentage of the training samples, and MI bags are formed by a local windowing process with variable window sizes on the selected instances. In addition to bootstrap aggregation, random subspace is another method used to diversify the base classifiers. The proposed method is implemented using four MIL classification algorithms. The classifier model learning phase is carried out with MI bags, and the estimation phase is performed over single test instances. In the experimental part of the study, two different HSIs that have ground-truth information are used, and comparative results are demonstrated with state-of-the-art classification methods. In general, the MI ensemble approach produces more compact results in terms of both diversity and error compared to equipollent non-MIL algorithms.
Deep Gaze Velocity Analysis During Mammographic Reading for Biometric Identification of Radiologists
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoon, Hong-Jun; Alamudun, Folami T.; Hudson, Kathy
Several studies have confirmed that the gaze velocity of the human eye can be utilized as a behavioral biometric or personalized biomarker. In this study, we leverage the local feature representation capacity of convolutional neural networks (CNNs) for eye gaze velocity analysis as the basis for biometric identification of radiologists performing breast cancer screening. Using gaze data collected from 10 radiologists reading 100 mammograms of various diagnoses, we compared the performance of a CNN-based classification algorithm with two deep learning classifiers, deep neural network and deep belief network, and a previously presented hidden Markov model classifier. The study showed that the CNN classifier is superior compared to alternative classification methods based on macro F1-scores derived from 10-fold cross-validation experiments. Our results further support the efficacy of eye gaze velocity as a biometric identifier of medical imaging experts.
A Modified Mean Gray Wolf Optimization Approach for Benchmark and Biomedical Problems.
Singh, Narinder; Singh, S B
2017-01-01
A modified variant of the gray wolf optimization algorithm, namely the mean gray wolf optimization algorithm, has been developed by modifying the position update (encircling behavior) equations of the gray wolf optimization algorithm. The proposed variant has been tested on 23 well-known standard benchmark test functions (unimodal, multimodal, and fixed-dimension multimodal), and its performance has been compared with particle swarm optimization and gray wolf optimization. The proposed algorithm has also been applied to the classification of 5 data sets to check the feasibility of the modified variant. The results obtained are compared with many other meta-heuristic approaches, i.e., gray wolf optimization, particle swarm optimization, population-based incremental learning, ant colony optimization, etc. The results show that the modified variant is able to find the best solutions in terms of a high level of accuracy in classification and improved local optima avoidance.
NASA Astrophysics Data System (ADS)
Alom, Md. Zahangir; Awwal, Abdul A. S.; Lowe-Webb, Roger; Taha, Tarek M.
2017-08-01
Deep-learning methods are gaining popularity because of their state-of-the-art performance in image classification tasks. In this paper, we explore classification of laser-beam images from the National Ignition Facility (NIF) using a novel deeplearning approach. NIF is the world's largest, most energetic laser. It has nearly 40,000 optics that precisely guide, reflect, amplify, and focus 192 laser beams onto a fusion target. NIF utilizes four petawatt lasers called the Advanced Radiographic Capability (ARC) to produce backlighting X-ray illumination to capture implosion dynamics of NIF experiments with picosecond temporal resolution. In the current operational configuration, four independent short-pulse ARC beams are created and combined in a split-beam configuration in each of two NIF apertures at the entry of the pre-amplifier. The subaperture beams then propagate through the NIF beampath up to the ARC compressor. Each ARC beamlet is separately compressed with a dedicated set of four gratings and recombined as sub-apertures for transport to the parabola vessel, where the beams are focused using parabolic mirrors and pointed to the target. Small angular errors in the compressor gratings can cause the sub-aperture beams to diverge from one another and prevent accurate alignment through the transport section between the compressor and parabolic mirrors. This is an off-normal condition that must be detected and corrected. The goal of the off-normal check is to determine whether the ARC beamlets are sufficiently overlapped into a merged single spot or diverged into two distinct spots. Thus, the objective of the current work is three-fold: developing a simple algorithm to perform off-normal classification, exploring the use of Convolutional Neural Network (CNN) for the same task, and understanding the inter-relationship of the two approaches. The CNN recognition results are compared with other machine-learning approaches, such as Deep Neural Network (DNN) and Support Vector Machine (SVM). The experimental results show around 96% classification accuracy using CNN; the CNN approach also provides comparable recognition results compared to the present feature-based off-normal detection. The feature-based solution was developed to capture the expertise of a human expert in classifying the images. The misclassified results are further studied to explain the differences and discover any discrepancies or inconsistencies in current classification.
Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos
2017-04-13
Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform on diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating the diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence pain perception. We selected 52 adult subjects for the study, divided into three groups: a control group (15), subjects with sporadic migraine (19), and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance imaging with diffusion tensor sequences to assess the white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gathered data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce the features of the first dataset, and classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to perform classification of the migraine groups. Moreover, we implemented a committee method based on the feature selection algorithms to improve classification accuracy. When classifying the migraine groups, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% with the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, and from 93 to 94% with boosting. The features determined to be most useful for classification are related to pain, analgesics and the left uncinate region of the brain (connected with pain and emotion). The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
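The committee idea can be sketched generically: several feature selectors vote, and only features kept by at least half of them are passed to the classifier. The selectors, vote threshold and data below are illustrative assumptions, not the exact committee used in the study.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif, SelectFromModel
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-ins for DTI-derived and questionnaire features of 52 subjects.
rng = np.random.default_rng(11)
X, y = rng.normal(size=(52, 60)), rng.integers(0, 3, 52)     # 3 migraine groups

votes = np.zeros(X.shape[1])
votes += SelectKBest(f_classif, k=20).fit(X, y).get_support()
votes += SelectKBest(mutual_info_classif, k=20).fit(X, y).get_support()
votes += SelectFromModel(RandomForestClassifier(random_state=0), max_features=20).fit(X, y).get_support()
votes += SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
                         max_features=20).fit(X, y).get_support()

keep = votes >= 2                          # kept by at least half of the four selectors
clf = GaussianNB().fit(X[:, keep], y)
print(keep.sum(), "features kept; training accuracy:", clf.score(X[:, keep], y))
```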
Classification of Microarray Data Using Kernel Fuzzy Inference System
Kumar Rath, Santanu
2014-01-01
The DNA microarray classification technique has gained popularity in both research and practice. In real data analysis, such as microarray data, the dataset contains a huge number of insignificant and irrelevant features that tend to lose useful information. The selected features are generally those with high relevance to the classes and high significance, and they determine the classification of samples into their respective classes. In this paper, the kernel fuzzy inference system (K-FIS) algorithm is applied to classify microarray data (leukemia) using the t-test as a feature selection method. Kernel functions are used to map original data points into a higher-dimensional (possibly infinite-dimensional) feature space defined by a (usually nonlinear) function ϕ through a mathematical process called the kernel trick. This paper also presents a comparative study of classification using K-FIS along with a support vector machine (SVM) for different sets of features (genes). Performance parameters available in the literature such as precision, recall, specificity, F-measure, ROC curve, and accuracy are considered to analyze the efficiency of the classification model. From the proposed approach, it is apparent that the K-FIS model obtains results similar to those of the SVM model. This is an indication that the proposed approach relies on the kernel function.
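The t-test feature selection step, followed by an SVM baseline of the kind compared against K-FIS, might look like this sketch; the expression matrix, labels and number of retained genes are synthetic placeholders.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.svm import SVC

# Synthetic placeholders for a two-class microarray problem (samples x genes).
rng = np.random.default_rng(12)
X, y = rng.normal(size=(72, 2000)), rng.integers(0, 2, 72)

_, pvals = ttest_ind(X[y == 0], X[y == 1], axis=0)   # per-gene t-test
top_genes = np.argsort(pvals)[:50]                   # 50 most discriminative genes

clf = SVC(kernel="rbf").fit(X[:, top_genes], y)      # SVM baseline on selected genes
print("training accuracy:", clf.score(X[:, top_genes], y))
```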
Visual brain activity patterns classification with simultaneous EEG-fMRI: A multimodal approach.
Ahmad, Rana Fayyaz; Malik, Aamir Saeed; Kamel, Nidal; Reza, Faruque; Amin, Hafeez Ullah; Hussain, Muhammad
2017-01-01
Classification of visual information from brain activity data is a challenging task. Many studies reported in the literature are based on brain activity patterns using either fMRI or EEG/MEG only. EEG and fMRI are considered two complementary neuroimaging modalities in terms of their temporal and spatial resolution for mapping brain activity. To obtain high spatial and temporal resolution of the brain at the same time, simultaneous EEG-fMRI seems fruitful. In this article, we propose a new method based on simultaneous EEG-fMRI data and a machine learning approach to classify visual brain activity patterns. We acquired EEG-fMRI data simultaneously from ten healthy human participants by showing them visual stimuli. A data fusion approach is used to merge the EEG and fMRI data, and a machine learning classifier is used for classification. Results showed that superior classification performance was achieved with simultaneous EEG-fMRI data compared to EEG and fMRI data alone. This shows that the multimodal approach improved the classification accuracy compared with other approaches reported in the literature. The proposed simultaneous EEG-fMRI approach for classifying brain activity patterns can be helpful to predict or fully decode brain activity patterns.
CP-CHARM: segmentation-free image classification made accessible.
Uhlmann, Virginie; Singh, Shantanu; Carpenter, Anne E
2016-01-27
Automated classification using machine learning often relies on features derived from segmenting individual objects, which can be difficult to automate. WND-CHARM is a previously developed classification algorithm in which features are computed on the whole image, thereby avoiding the need for segmentation. The algorithm obtained encouraging results but requires considerable computational expertise to execute. Furthermore, some benchmark sets have been shown to be subject to confounding artifacts that overestimate classification accuracy. We developed CP-CHARM, a user-friendly image-based classification algorithm inspired by WND-CHARM in (i) its ability to capture a wide variety of morphological aspects of the image, and (ii) the absence of requirement for segmentation. In order to make such an image-based classification method easily accessible to the biological research community, CP-CHARM relies on the widely-used open-source image analysis software CellProfiler for feature extraction. To validate our method, we reproduced WND-CHARM's results and ensured that CP-CHARM obtained comparable performance. We then successfully applied our approach on cell-based assay data and on tissue images. We designed these new training and test sets to reduce the effect of batch-related artifacts. The proposed method preserves the strengths of WND-CHARM - it extracts a wide variety of morphological features directly on whole images thereby avoiding the need for cell segmentation, but additionally, it makes the methods easily accessible for researchers without computational expertise by implementing them as a CellProfiler pipeline. It has been demonstrated to perform well on a wide range of bioimage classification problems, including on new datasets that have been carefully selected and annotated to minimize batch effects. This provides for the first time a realistic and reliable assessment of the whole image classification strategy.
Ruiz Hidalgo, Irene; Rodriguez, Pablo; Rozema, Jos J; Ní Dhubhghaill, Sorcha; Zakaria, Nadia; Tassignon, Marie-José; Koppen, Carina
2016-06-01
To evaluate the performance of a support vector machine algorithm that automatically and objectively identifies corneal patterns based on a combination of 22 parameters obtained from Pentacam measurements and to compare this method with other known keratoconus (KC) classification methods. Pentacam data from 860 eyes were included in the study and divided into 5 groups: 454 KC, 67 forme fruste (FF), 28 astigmatic, 117 after refractive surgery (PR), and 194 normal eyes (N). Twenty-two parameters were used for classification using a support vector machine algorithm developed in Weka, a machine-learning computer software. The cross-validation accuracy for 3 different classification tasks (KC vs. N, FF vs. N and all 5 groups) was calculated and compared with other known classification methods. The accuracy achieved in the KC versus N discrimination task was 98.9%, with 99.1% sensitivity and 98.5% specificity for KC detection. The accuracy in the FF versus N task was 93.1%, with 79.1% sensitivity and 97.9% specificity for the FF discrimination. Finally, for the 5-groups classification, the accuracy was 88.8%, with a weighted average sensitivity of 89.0% and specificity of 95.2%. Despite using the strictest definition for FF KC, the present study obtained comparable or better results than the single-parameter methods and indices reported in the literature. In some cases, direct comparisons with the literature were not possible because of differences in the compositions and definitions of the study groups, especially the FF KC.
Comparing Facial 3D Analysis With DNA Testing to Determine Zygosities of Twins.
Vuollo, Ville; Sidlauskas, Mantas; Sidlauskas, Antanas; Harila, Virpi; Salomskiene, Loreta; Zhurov, Alexei; Holmström, Lasse; Pirttiniemi, Pertti; Heikkinen, Tuomo
2015-06-01
The aim of this study was to compare facial 3D analysis to DNA testing in twin zygosity determinations. Facial 3D images of 106 pairs of young adult Lithuanian twins were taken with a stereophotogrammetric device (3dMD, Atlanta, Georgia) and zygosity was determined according to similarity of facial form. Statistical pattern recognition methodology was used for classification. The results showed that in 75% to 90% of the cases, zygosity determinations were similar to DNA-based results. There were 81 different classification scenarios, including 3 groups, 3 features, 3 different scaling methods, and 3 threshold levels. It appeared that coincidence with 0.5 mm tolerance is the most suitable feature for classification. Also, leaving out scaling improves results in most cases. Scaling was expected to equalize the magnitude of differences and therefore lead to better recognition performance. Still, better classification features and a more effective scaling method or classification in different facial areas could further improve the results. In most of the cases, male pair zygosity recognition was at a higher level compared with females. Erroneously classified twin pairs appear to be obvious outliers in the sample. In particular, faces of young dizygotic (DZ) twins may be so similar that it is very hard to define a feature that would help classify the pair as DZ. Correspondingly, monozygotic (MZ) twins may have faces with quite different shapes. Such anomalous twin pairs are interesting exceptions, but they form a considerable portion in both zygosity groups.
Unsupervised classification of operator workload from brain signals.
Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin
2016-06-01
In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects' error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.
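A minimal sketch of common spatial patterns (CSP), one of the spatial filtering methods compared above. CSP finds spatial filters that maximize the ratio of band power between two labelled workload states by solving a generalized eigenvalue problem on the class covariance matrices. The epochs below are random placeholders; real use would start from band-pass filtered EEG.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(epochs_a, epochs_b, n_filters=4):
    """epochs_*: arrays of shape (n_epochs, n_channels, n_samples)."""
    cov_a = np.mean([np.cov(e) for e in epochs_a], axis=0)
    cov_b = np.mean([np.cov(e) for e in epochs_b], axis=0)
    # Generalized eigenvalue problem: cov_a w = lambda (cov_a + cov_b) w
    eigvals, eigvecs = eigh(cov_a, cov_a + cov_b)
    order = np.argsort(eigvals)
    # Keep filters from both ends of the spectrum (most discriminative directions).
    picks = np.concatenate([order[:n_filters // 2], order[-n_filters // 2:]])
    return eigvecs[:, picks].T

rng = np.random.default_rng(0)
high_load = rng.standard_normal((40, 16, 250))   # 40 epochs, 16 channels, 250 samples
low_load = rng.standard_normal((40, 16, 250))
W = csp_filters(high_load, low_load)
features = np.log(np.var(W @ high_load[0], axis=1))  # log band-power features for one epoch
print(W.shape, features.shape)
```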
NASA Astrophysics Data System (ADS)
Curilem, Millaray; Huenupan, Fernando; Beltrán, Daniel; San Martin, Cesar; Fuentealba, Gustavo; Franco, Luis; Cardona, Carlos; Acuña, Gonzalo; Chacón, Max; Khan, M. Salman; Becerra Yoma, Nestor
2016-04-01
Automatic pattern recognition applied to seismic signals from volcanoes may assist seismic monitoring by reducing the workload of analysts, allowing them to focus on more challenging activities, such as producing reports, implementing models, and understanding volcanic behaviour. In a previous work, we proposed a structure for automatic classification of seismic events in Llaima volcano, one of the most active volcanoes in the Southern Andes, located in the Araucanía Region of Chile. A database of events taken from three monitoring stations on the volcano was used to create a classification structure that was independent of which station provided the signal. The database included three types of volcanic events (tremor, long period, and volcano-tectonic) plus a contrast group containing other types of seismic signals. In the present work, we maintain the same classification scheme, but we consider the station information separately in order to assess whether the complementary information provided by different stations improves the performance of the classifier in recognising seismic patterns. This paper proposes two strategies for combining the information from the stations: i) combining the features extracted from the signals of each station and ii) combining the classifiers of each station. In the first case, the features extracted from the signals of each station are combined to form the input for a single classification structure. In the second, a decision stage combines the results of the classifiers of each station to give a unique output. The results confirm that the station-dependent strategies that combine the features and the classifiers from several stations improve the classification performance, and that the combination of the features provides the best performance. The results show an average improvement of 9% in classification accuracy when compared with the station-independent method.
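A hedged illustration of the two station-combination strategies described above: (i) concatenating the feature vectors of all stations into one classifier, and (ii) training one classifier per station and fusing their decisions by majority vote. The stations, features, and classifiers are placeholders, not the authors' setup.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
n_events, n_feat = 300, 20
y = rng.integers(0, 4, n_events)                       # 4 event classes (TR, LP, VT, other)
stations = [rng.standard_normal((n_events, n_feat)) + y[:, None] * 0.3 for _ in range(3)]

idx_train, idx_test = train_test_split(np.arange(n_events), test_size=0.3, random_state=0)

# (i) Feature-level fusion: one classifier on concatenated station features.
X_all = np.hstack(stations)
clf_all = SVC().fit(X_all[idx_train], y[idx_train])
print("feature fusion:", accuracy_score(y[idx_test], clf_all.predict(X_all[idx_test])))

# (ii) Decision-level fusion: one classifier per station, majority vote on predictions.
preds = np.array([SVC().fit(s[idx_train], y[idx_train]).predict(s[idx_test]) for s in stations])
vote = np.apply_along_axis(lambda p: np.bincount(p, minlength=4).argmax(), 0, preds)
print("decision fusion:", accuracy_score(y[idx_test], vote))
```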
Impact of Strain Elastography on BI-RADS classification in small invasive lobular carcinoma.
Chiorean, Angelica Rita; Szep, Mădălina Brîndușa; Feier, Diana Sorina; Duma, Magdalena; Chiorean, Marco Andrei; Strilciuc, Ștefan
2018-05-02
The purpose of this study was to determine the impact of strain elastography (SE) on the Breast Imaging Reporting and Data System (BI-RADS) classification depending on invasive lobular carcinoma (ILC) lesion size. We performed a retrospective analysis on a sample of 152 female subjects examined between January 2010 and January 2017. SE was performed on all patients and ILC was subsequently diagnosed by surgical or ultrasound-guided biopsy. BI-RADS 1, 2, 6 and Tsukuba BGR cases were omitted. BI-RADS scores were recorded before and after the use of SE. The differences between scores were compared to the ILC tumor size using nonparametric tests and binary logistic regression. We controlled for age, focality, clinical assessment, heredo-collateral antecedents, and B-mode and Doppler ultrasound examination. An ROC curve was used to identify the optimal cut-off point for size in relationship to the BI-RADS classification difference using Youden's index. The histological subtypes of ILC lesions (n=180) included in the sample were luminal A (70%, n=126), luminal B (27.78%, n=50), triple negative (1.67%, n=3) and HER2+ (0.56%, n=1). The BI-RADS classification was higher when SE was performed (Z = -6.629, p<0.000). The ROC curve identified a cut-off point of 13 mm for size in relationship to the BI-RADS classification difference (J=0.670, p<0.000). Small ILC tumors were 17.92% more likely to influence BI-RADS classification (p<0.000). SE offers enhanced BI-RADS classification in small ILC tumors (<13 mm). Sonoelastography brings added value to B-mode breast ultrasound as an adjunct to mammography in breast cancer screening.
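A hedged sketch of the cut-off analysis mentioned above: given lesion sizes and a binary outcome (whether SE changed the BI-RADS category), the ROC curve is swept and Youden's index J = sensitivity + specificity - 1 selects the optimal size threshold. The data below are synthetic placeholders, not the study's measurements.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(3)
size_mm = np.concatenate([rng.normal(10, 3, 90), rng.normal(20, 5, 90)])   # lesion sizes
changed = np.concatenate([np.ones(90), np.zeros(90)])                      # 1 = BI-RADS changed

# Smaller lesions are the "positive" cases here, so use score = -size.
fpr, tpr, thresholds = roc_curve(changed, -size_mm)
j = tpr - fpr                                   # Youden's index at each candidate threshold
best = np.argmax(j)
print("optimal cut-off: %.1f mm (J = %.2f)" % (-thresholds[best], j[best]))
```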
NASA Astrophysics Data System (ADS)
Duarte, D.; Nex, F.; Kerle, N.; Vosselman, G.
2018-05-01
The localization and detailed assessment of damaged buildings after a disastrous event is of utmost importance to guide response operations, recovery tasks or for insurance purposes. Several remote sensing platforms and sensors are currently used for the manual detection of building damages. However, there is an overall interest in the use of automated methods to perform this task, regardless of the platform used. Owing to its synoptic coverage and predictable availability, satellite imagery is currently used as input for the identification of building damages by the International Charter, as well as the Copernicus Emergency Management Service for the production of damage grading and reference maps. Recently proposed methods to perform image classification of building damages rely on convolutional neural networks (CNN). These are usually trained with only satellite image samples in a binary classification problem; however, the number of samples derived from these images is often limited, affecting the quality of the classification results. The use of up/down-sampled image samples during the training of a CNN has been demonstrated to improve several image recognition tasks in remote sensing. However, it is currently unclear if this multi-resolution information can also be captured from images with different spatial resolutions, such as satellite and airborne imagery (from both manned and unmanned platforms). In this paper, a CNN framework using residual connections and dilated convolutions is used, considering both manned and unmanned aerial image samples, to perform the satellite image classification of building damages. Three network configurations, trained with multi-resolution image samples, are compared against two benchmark networks where only satellite image samples are used. Combining feature maps generated from airborne and satellite image samples, and refining these using only the satellite image samples, improved the overall satellite image classification of building damages by nearly 4%.
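A minimal PyTorch sketch of a classification block using residual connections and dilated convolutions, the two ingredients referenced above. This is an illustrative patch-level damage/no-damage classifier, not the authors' exact network.

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)        # residual connection

class DamageNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.stem = nn.Conv2d(3, 32, 3, padding=1)
        self.blocks = nn.Sequential(DilatedResidualBlock(32, 1),
                                    DilatedResidualBlock(32, 2),
                                    DilatedResidualBlock(32, 4))
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        x = x.mean(dim=(2, 3))           # global average pooling
        return self.head(x)

logits = DamageNet()(torch.randn(4, 3, 64, 64))   # 4 RGB patches of 64 x 64 pixels
print(logits.shape)                               # torch.Size([4, 2])
```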
Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie
2014-11-01
User-generated medical messages on the Internet contain extensive information related to adverse drug reactions (ADRs) and are known as valuable resources for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature sets and different classification techniques. First, messages from three communities (allergy, schizophrenia, and pain management) were collected and 3000 messages were annotated. Second, an N-gram-based feature set and a medical domain-specific feature set were generated. Third, three classification techniques, SVM, C4.5, and Naïve Bayes, were used to perform the classification tasks separately. Finally, we evaluated the performance of each combination of feature set and classification technique by comparing metrics including accuracy and F-measure. In terms of accuracy, the SVM classifier exceeded 0.8, whereas the C4.5 and Naïve Bayes classifiers remained below 0.8; meanwhile, the combined feature set (n-gram-based plus domain-specific features) consistently outperformed either single feature set. In terms of F-measure, the highest value of 0.895 was achieved using the combined feature set with an SVM classifier. Overall, the combined feature set with an SVM classifier gave the best classification performance and provides an effective method to identify ADR-related messages automatically from online user reviews.
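A hedged sketch of the best-performing setup reported above: word n-gram features combined with a handful of domain-specific features, fed to a linear SVM. The tiny corpus and the "domain lexicon" are invented placeholders for illustration, not the study's annotated data.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.svm import LinearSVC

ADR_TERMS = {"rash", "dizzy", "nausea", "headache"}   # assumed domain lexicon

class DomainFeatures(BaseEstimator, TransformerMixin):
    """Counts of domain-specific ADR terms per message."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return np.array([[sum(w in ADR_TERMS for w in doc.lower().split())] for doc in X])

messages = ["this drug gave me a terrible rash and headache",
            "switched pharmacies, no issues so far",
            "feeling dizzy and nausea after the new dose",
            "the support group meets on tuesdays"]
labels = [1, 0, 1, 0]                                 # 1 = ADR-related

features = FeatureUnion([("ngrams", TfidfVectorizer(ngram_range=(1, 2))),
                         ("domain", DomainFeatures())])
clf = make_pipeline(features, LinearSVC()).fit(messages, labels)
print(clf.predict(["bad rash since starting treatment"]))
```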
3D shape representation with spatial probabilistic distribution of intrinsic shape keypoints
NASA Astrophysics Data System (ADS)
Ghorpade, Vijaya K.; Checchin, Paul; Malaterre, Laurent; Trassoudaine, Laurent
2017-12-01
The accelerated advancement in modeling, digitizing, and visualizing techniques for 3D shapes has led to an increasing amount of 3D model creation and usage, thanks to 3D sensors which are readily available and easy to utilize. As a result, determining the similarity between 3D shapes has become consequential and is a fundamental task in shape-based recognition, retrieval, clustering, and classification. Several decades of research in Content-Based Information Retrieval (CBIR) has resulted in diverse techniques for 2D and 3D shape or object classification/retrieval and many benchmark data sets. In this article, a novel technique for 3D shape representation and object classification is proposed based on analyses of the spatial, geometric distributions of 3D keypoints. These distributions capture the intrinsic geometric structure of 3D objects. The result of the approach is a probability distribution function (PDF) produced from the spatial disposition of 3D keypoints, which are stable on the object surface and invariant to pose changes. Each class/instance of an object can be uniquely represented by a PDF. This shape representation is robust yet conceptually simple, easy to implement, and fast to compute. Both the Euclidean and the topological space on the object's surface are considered to build the PDFs. Topology-based geodesic distances between keypoints exploit the non-planar surface properties of the object. The performance of the novel shape signature is tested with object classification accuracy. The classification efficacy of the new shape analysis method is evaluated on a new dataset acquired with a Time-of-Flight camera, and a comparative evaluation against state-of-the-art methods on a standard benchmark dataset is also performed. Experimental results demonstrate superior classification performance of the new approach on RGB-D and depth data.
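A hedged sketch of the core idea above: represent a 3D shape by the probability distribution of pairwise distances between its keypoints. Euclidean distances are used here, whereas the paper also uses geodesic distances on the surface; the keypoint clouds are random placeholders standing in for detected, pose-invariant keypoints.

```python
import numpy as np
from scipy.spatial.distance import pdist

def distance_pdf(keypoints, bins=32, max_dist=2.0):
    """Normalized histogram (PDF) of pairwise keypoint distances."""
    d = pdist(keypoints)                                  # all pairwise Euclidean distances
    hist, _ = np.histogram(d, bins=bins, range=(0, max_dist), density=True)
    return hist

rng = np.random.default_rng(4)
shape_a = rng.uniform(-0.5, 0.5, (60, 3))                 # placeholder keypoint clouds
shape_b = rng.normal(0.0, 0.3, (60, 3))

pdf_a, pdf_b = distance_pdf(shape_a), distance_pdf(shape_b)
# Shapes can then be compared (or classified) via a distance between their PDFs,
# e.g. the L1 distance used here.
print("L1 distance between shape signatures: %.3f" % np.abs(pdf_a - pdf_b).sum())
```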
Unsupervised classification of operator workload from brain signals
NASA Astrophysics Data System (ADS)
Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin
2016-06-01
Objective. In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Approach. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects’ error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Main results. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Significance. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.
Schwaibold, M; Schöchlin, J; Bolz, A
2002-01-01
For classification tasks in biosignal processing, several strategies and algorithms can be used. Knowledge-based systems allow prior knowledge about the decision process to be integrated, both by the developer and by self-learning capabilities. For the classification stages in a sleep stage detection framework, three inference strategies were compared regarding their specific strengths: a classical signal processing approach, artificial neural networks and neuro-fuzzy systems. Methodological aspects were assessed to attain optimum performance and maximum transparency for the user. Due to their effective and robust learning behavior, artificial neural networks could be recommended for pattern recognition, while neuro-fuzzy systems performed best for the processing of contextual information.
Open Dataset for the Automatic Recognition of Sedentary Behaviors.
Possos, William; Cruz, Robinson; Cerón, Jesús D; López, Diego M; Sierra-Torres, Carlos H
2017-01-01
Sedentarism is associated with the development of noncommunicable diseases (NCD) such as cardiovascular diseases (CVD), type 2 diabetes, and cancer. Therefore, the identification of specific sedentary behaviors (TV viewing, sitting at work, driving, relaxing, etc.) is especially relevant for planning personalized prevention programs. To build and evaluate a public dataset for the automatic recognition (classification) of sedentary behaviors. The dataset included data from 30 subjects, who performed 23 sedentary behaviors while wearing a commercial wearable on the wrist, a smartphone on the hip, and another on the thigh. Bluetooth Low Energy (BLE) beacons were used in order to improve the automatic classification of the different sedentary behaviors. The study also compared six well-known data mining classification techniques in order to identify the most precise method for solving the classification problem of the 23 defined behaviors. The best classification accuracy was obtained using the Random Forest algorithm with data collected from the phone on the hip. Furthermore, the use of beacons as a reference for obtaining the symbolic location of the individual improved the precision of the classification.
Classification Model for Damage Localization in a Plate Structure
NASA Astrophysics Data System (ADS)
Janeliukstis, R.; Ruchevskis, S.; Chate, A.
2018-01-01
The present study is devoted to the problem of damage localization by means of data classification. The commercial ANSYS finite-element program was used to build a model of a cantilevered composite plate equipped with numerous strain sensors. The plate was divided into zones and, for data classification purposes, each of them housed several points at which a point mass of magnitude 5 and 10% of the plate mass was applied. At each of these points, a numerical modal analysis was performed, from which the first few natural frequencies and strain readings were extracted. The strain data for every point were the input for a classification procedure involving k-nearest neighbors and decision trees. The classification model was trained and optimized by fine-tuning the key parameters of both algorithms. Finally, two new query points were simulated and subjected to classification, i.e., assigned the label of one of the zones of the plate, thus localizing these points. The damage localization results were compared for both algorithms and were found to be in good agreement with the actual application positions of the point load.
Optimizing spectral CT parameters for material classification tasks
NASA Astrophysics Data System (ADS)
Rigie, D. S.; La Rivière, P. J.
2016-06-01
In this work, we propose a framework for optimizing spectral CT imaging parameters and hardware design with regard to material classification tasks. Compared with conventional CT, many more parameters must be considered when designing spectral CT systems and protocols. These choices will impact material classification performance in a non-obvious, task-dependent way with direct implications for radiation dose reduction. In light of this, we adapt Hotelling Observer formalisms typically applied to signal detection tasks to the spectral CT, material-classification problem. The result is a rapidly computable metric that makes it possible to sweep out many system configurations, generating parameter optimization curves (POC’s) that can be used to select optimal settings. The proposed model avoids restrictive assumptions about the basis-material decomposition (e.g. linearity) and incorporates signal uncertainty with a stochastic object model. This technique is demonstrated on dual-kVp and photon-counting systems for two different, clinically motivated material classification tasks (kidney stone classification and plaque removal). We show that the POC’s predicted with the proposed analytic model agree well with those derived from computationally intensive numerical simulation studies.
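A hedged sketch of a Hotelling-observer style separability metric of the kind adapted above for material classification: with class means mu_1, mu_2 and a common covariance Sigma of the noisy measured data, the separability is J = (mu_1 - mu_2)^T Sigma^{-1} (mu_1 - mu_2), which can be swept over candidate system settings. The values below are synthetic placeholders, not a CT simulation.

```python
import numpy as np

def hotelling_separability(samples_1, samples_2):
    """samples_*: (n_realizations, n_measurements) arrays for the two materials."""
    mu_diff = samples_1.mean(axis=0) - samples_2.mean(axis=0)
    sigma = 0.5 * (np.cov(samples_1, rowvar=False) + np.cov(samples_2, rowvar=False))
    return float(mu_diff @ np.linalg.solve(sigma, mu_diff))

rng = np.random.default_rng(5)
for noise in (0.5, 1.0, 2.0):                       # e.g. different dose / spectral settings
    a = rng.normal([1.0, 2.0], noise, (500, 2))     # material 1 measurements (2 energy bins)
    b = rng.normal([1.2, 1.7], noise, (500, 2))     # material 2 measurements
    print("noise %.1f -> J = %.2f" % (noise, hotelling_separability(a, b)))
```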
Optimizing Spectral CT Parameters for Material Classification Tasks
Rigie, D. S.; La Rivière, P. J.
2017-01-01
In this work, we propose a framework for optimizing spectral CT imaging parameters and hardware design with regard to material classification tasks. Compared with conventional CT, many more parameters must be considered when designing spectral CT systems and protocols. These choices will impact material classification performance in a non-obvious, task-dependent way with direct implications for radiation dose reduction. In light of this, we adapt Hotelling Observer formalisms typically applied to signal detection tasks to the spectral CT, material-classification problem. The result is a rapidly computable metric that makes it possible to sweep out many system configurations, generating parameter optimization curves (POC’s) that can be used to select optimal settings. The proposed model avoids restrictive assumptions about the basis-material decomposition (e.g. linearity) and incorporates signal uncertainty with a stochastic object model. This technique is demonstrated on dual-kVp and photon-counting systems for two different, clinically motivated material classification tasks (kidney stone classification and plaque removal). We show that the POC’s predicted with the proposed analytic model agree well with those derived from computationally intensive numerical simulation studies. PMID:27227430
McDermott, P A; Hale, R L
1982-07-01
Tested diagnostic classifications of child psychopathology produced by a computerized technique known as multidimensional actuarial classification (MAC) against the criterion of expert psychological opinion. The MAC program applies a series of statistical decision rules to assess the importance of, and relationships among, several dimensions of classification, i.e., intellectual functioning, academic achievement, adaptive behavior, and social and behavioral adjustment, in order to perform differential diagnosis of children's mental retardation, specific learning disabilities, behavioral and emotional disturbance, possible communication or perceptual-motor impairment, and academic under- and overachievement in reading and mathematics. Classifications rendered by MAC were compared to those offered by two expert child psychologists for the cases of 73 children referred for psychological services. The experts' agreement with MAC was significant for all classification areas, as was MAC's agreement with the experts held as a conjoint reference standard. Whereas the experts' agreement with MAC averaged 86.0% above chance, their agreement with one another averaged 76.5% above chance. Implications of the findings are explored and potential advantages of the systems-actuarial approach are discussed.
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Human Epithelial-2 (HEp-2) cell image staining pattern classification has been widely used to identify autoimmune diseases via the anti-nuclear antibody (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because the manual test is time-consuming, subjective, and labor-intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are being developed. However, recently proposed methods mostly rely on manual feature extraction and achieve low accuracy. Besides, the scale of the available benchmark datasets is small, which is not well suited to deep learning methods; this issue directly affects the accuracy of cell classification, even after data augmentation. To address these issues, this paper presents a high-accuracy automatic HEp-2 cell classification method for small datasets, utilizing very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction, and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results on two benchmark datasets demonstrate that the proposed method achieves superior performance in terms of accuracy compared with existing methods.
Bulk Magnetization Effects in EMI-Based Classification and Discrimination
2012-04-01
response adds to classification performance and (2) develop a comprehensive understanding of the engineering challenges of primary field cancellation that can support a...
Gönen, Mehmet
2014-01-01
Coupled training of dimensionality reduction and classification has been proposed previously to improve the prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find the intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of hamming loss, average AUC, macro F1, and micro F1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding intrinsic subspace dimensionality and semi-supervised learning tasks. PMID:24532862
Gönen, Mehmet
2014-03-01
Coupled training of dimensionality reduction and classification has been proposed previously to improve the prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find the intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of hamming loss, average AUC, macro F1, and micro F1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding intrinsic subspace dimensionality and semi-supervised learning tasks.
Implementation of several mathematical algorithms to breast tissue density classification
NASA Astrophysics Data System (ADS)
Quintana, C.; Redondo, M.; Tirao, G.
2014-02-01
The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, where dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and the performance of different mathematical algorithms designed to standardize the categorization of mammographic images according to the American College of Radiology classifications. These mathematical techniques are based on intrinsic property calculations and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross-correlation and index Q) as categorization parameters. The evaluation of the algorithms was performed on 100 cases of the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with the expert medical diagnoses, showing good performance. The implemented algorithms revealed a high potential for classifying breasts into tissue density categories.
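A hedged sketch of two of the comparison measures named above, joint entropy and mutual information between a mammographic image and an ideal homogeneous reference, computed from their joint gray-level histogram. The images, sizes, and binning below are assumed placeholders.

```python
import numpy as np

def joint_entropy_and_mi(img_a, img_b, bins=32):
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_joint = joint / joint.sum()
    p_a = p_joint.sum(axis=1)
    p_b = p_joint.sum(axis=0)
    nz = p_joint > 0
    h_joint = -np.sum(p_joint[nz] * np.log2(p_joint[nz]))
    h_a = -np.sum(p_a[p_a > 0] * np.log2(p_a[p_a > 0]))
    h_b = -np.sum(p_b[p_b > 0] * np.log2(p_b[p_b > 0]))
    return h_joint, h_a + h_b - h_joint       # (joint entropy, mutual information)

rng = np.random.default_rng(6)
mammogram = rng.normal(0.6, 0.15, (256, 256))            # placeholder breast image
reference = np.full_like(mammogram, 0.6) + rng.normal(0, 0.01, mammogram.shape)
print("joint entropy %.2f bits, mutual information %.2f bits"
      % joint_entropy_and_mi(mammogram, reference))
```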
Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data.
Shah, M; Marchand, M; Corbeil, J
2012-01-01
One of the objectives of designing feature selection learning algorithms is to obtain classifiers that depend on a small number of attributes and have verifiable future performance guarantees. There are few, if any, approaches that successfully address the two goals simultaneously. To the best of our knowledge, algorithms that give theoretical bounds on future performance have not been proposed so far in the context of the classification of gene expression data. In this work, we investigate the premise of learning a conjunction (or disjunction) of decision stumps in the Occam's Razor, Sample Compression, and PAC-Bayes learning settings for identifying a small subset of attributes that can be used to perform reliable classification tasks. We apply the proposed approaches to gene identification from DNA microarray data and compare our results to those of well-known successful approaches proposed for the task. We show that our algorithm not only finds hypotheses with a much smaller number of genes while giving competitive classification accuracy, but also provides tight risk guarantees on future performance, unlike other approaches. The proposed approaches are general and extensible in terms of both designing novel algorithms and application to other domains.
The impact of feature selection on one and two-class classification performance for plant microRNAs.
Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. This ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC, where they act as recognition keys to aid in the regulation of target mRNAs. Determining miRNAs experimentally is involved and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although two-class classification (TCC) is generally used in the field, one-class classification (OCC) has been tried for pre-miRNA detection because negative examples are hard to come by. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC, where the best feature selection method achieved an average accuracy of 95.6%, thereby being ∼29% better than the worst method, which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ∼13%, which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.
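A hedged sketch contrasting the one-class and two-class settings discussed above: a OneClassSVM trained only on positive (pre-miRNA-like) feature vectors versus an ordinary SVM trained on positives and negatives. The feature vectors are random placeholders standing in for selected pre-miRNA features.

```python
import numpy as np
from sklearn.svm import OneClassSVM, SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)
pos = rng.normal(1.0, 1.0, (300, 10))      # positive examples (real pre-miRNAs)
neg = rng.normal(-1.0, 1.0, (300, 10))     # negative examples (hard to obtain in practice)

pos_tr, pos_te, neg_tr, neg_te = pos[:200], pos[200:], neg[:200], neg[200:]
X_te = np.vstack([pos_te, neg_te])
y_te = np.array([1] * len(pos_te) + [-1] * len(neg_te))

occ = OneClassSVM(nu=0.1, gamma="scale").fit(pos_tr)          # trained on positives only
tcc = SVC().fit(np.vstack([pos_tr, neg_tr]), [1] * 200 + [-1] * 200)

print("OCC accuracy:", accuracy_score(y_te, occ.predict(X_te)))
print("TCC accuracy:", accuracy_score(y_te, tcc.predict(X_te)))
```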
Image-based fall detection and classification of a user with a walking support system
NASA Astrophysics Data System (ADS)
Taghvaei, Sajjad; Kosuge, Kazuhiro
2017-10-01
The classification of visual human action is important in the development of systems that interact with humans. This study investigates an image-based classification of the human state while using a walking support system, to improve the safety and dependability of these systems. We categorize the possible human behavior while utilizing a walker robot into eight states (i.e., sitting, standing, walking, and five falling types), and propose two different methods, namely, normal distribution and hidden Markov models (HMMs), to detect and recognize these states. The visual feature for the state classification is the centroid position of the upper body, which is extracted from the user's depth images. The first method shows that the centroid position follows a normal distribution while walking, which can be adopted to detect any non-walking state. The second method implements HMMs to detect and recognize these states. We then measure and compare the performance of both methods. The classification results are employed to control the motion of a passive-type walker (called "RT Walker") by activating its brakes in non-walking states. Thus, the system can be used for sit/stand support and fall prevention. The experiments are performed with four subjects, including an experienced physiotherapist. Results show that the algorithm can be adapted to a new user's motion pattern within 40 s, with a fall detection rate of 96.25% and a state classification rate of 81.0%. The proposed method can be applied to other abnormality detection/classification applications that employ depth image-sensing devices.
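A hedged sketch of the first method described above: the upper-body centroid position is modelled with a normal distribution during walking, and frames whose likelihood falls below a threshold are flagged as a non-walking (possible fall) state. The centroid trajectories here are synthetic placeholders, not depth-camera measurements.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(8)
walking = rng.normal([0.0, 1.2], [0.05, 0.03], (500, 2))     # (x, height) while walking
model = multivariate_normal(mean=walking.mean(axis=0), cov=np.cov(walking, rowvar=False))

threshold = np.quantile(model.logpdf(walking), 0.01)          # tolerate ~1% false alarms

def is_non_walking(centroid_xy):
    return model.logpdf(centroid_xy) < threshold

print(is_non_walking([0.0, 1.2]))    # typical walking posture -> False
print(is_non_walking([0.4, 0.6]))    # centroid dropped and shifted -> True (possible fall)
```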
Behavioral state classification in epileptic brain using intracranial electrophysiology
NASA Astrophysics Data System (ADS)
Kremen, Vaclav; Duque, Juliano J.; Brinkmann, Benjamin H.; Berry, Brent M.; Kucewicz, Michal T.; Khadjevand, Fatemeh; Van Gompel, Jamie; Stead, Matt; St. Louis, Erik K.; Worrell, Gregory A.
2017-04-01
Objective. Automated behavioral state classification can benefit next-generation implantable epilepsy devices. In this study we explored the feasibility of automated awake (AW) and slow wave sleep (SWS) classification using wide bandwidth intracranial EEG (iEEG) in patients undergoing evaluation for epilepsy surgery. Approach. Data from seven patients (age 34 ± 12, 4 women) who underwent intracranial depth electrode implantation for iEEG monitoring were included. Spectral power features (0.1-600 Hz) spanning several frequency bands from a single electrode were used to train and test a support vector machine classifier. Main results. Classification accuracy of 97.8 ± 0.3% (normal tissue) and 89.4 ± 0.8% (epileptic tissue) across seven subjects was achieved using multiple spectral power features from a single electrode. Spectral power features from electrodes placed in normal temporal neocortex were found to be more useful (accuracy 90.8 ± 0.8%) for sleep-wake state classification than electrodes located in normal hippocampus (87.1 ± 1.6%). Spectral power in high-frequency band features (Ripple (80-250 Hz), Fast Ripple (250-600 Hz)) showed performance for AW and SWS classification comparable to that of the best-performing Berger bands (Alpha, Beta, low Gamma), with accuracy ≥90% using a single electrode contact and a single spectral feature. Significance. Automated classification of wake and SWS should prove useful for future implantable epilepsy devices with limited computational power, memory, and number of electrodes. Applications include quantifying patient sleep patterns and behavioral-state-dependent detection, prediction, and electrical stimulation therapies.
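A hedged sketch of a feature pipeline of the kind described above: spectral band-power features are computed from a single channel with Welch's method and used to train a support vector machine for awake vs. slow-wave-sleep classification. The signals, sampling rate, and band edges are placeholders, not the patients' recordings.

```python
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

FS = 1000                                              # sampling rate (Hz), assumed
BANDS = [(1, 4), (4, 8), (8, 13), (13, 30), (30, 55), (80, 250), (250, 450)]

def band_powers(segment):
    freqs, psd = welch(segment, fs=FS, nperseg=FS)
    return [psd[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in BANDS]

rng = np.random.default_rng(9)
# Placeholder 30 s segments: "sleep" segments get extra slow-wave amplitude.
t = np.arange(30 * FS) / FS
awake = [rng.standard_normal(len(t)) for _ in range(40)]
sleep = [rng.standard_normal(len(t)) + 3 * np.sin(2 * np.pi * 1.5 * t) for _ in range(40)]

X = np.log(np.array([band_powers(s) for s in awake + sleep]))
y = np.array([0] * 40 + [1] * 40)
print("CV accuracy:", cross_val_score(SVC(), X, y, cv=5).mean())
```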
Yang, Ze-Hui; Zheng, Rui; Gao, Yuan; Zhang, Qiang
2016-09-01
With the widespread application of high-throughput technology, numerous meta-analysis methods have been proposed for differential expression profiling across multiple studies. We identified suitable differentially expressed (DE) genes that contributed to lung adenocarcinoma (ADC) clustering based on seven popular meta-analysis methods. Seven microarray expression profiles of ADC and normal controls were extracted from the ArrayExpress database. Bioconductor was used for preliminary data preprocessing. Then, DE genes across multiple studies were identified. Hierarchical clustering was applied to compare the classification performance for the microarray data samples. The classification efficiency was compared based on accuracy, sensitivity, and specificity. Across the seven datasets, 573 ADC cases and 222 normal controls were collected. After filtering out unexpressed and noninformative genes, 3688 genes remained for further analysis. The classification efficiency analysis showed that DE genes identified by the sum-of-ranks method separated ADC from normal controls with the best accuracy, sensitivity, and specificity of 0.953, 0.969, and 0.932, respectively. The gene set with the highest classification accuracy mainly participated in the regulation of response to external stimulus (P = 7.97E-04), cyclic nucleotide-mediated signaling (P = 0.01), regulation of cell morphogenesis (P = 0.01), and regulation of cell proliferation (P = 0.01). Evaluating the classification efficiency of DE genes identified by different meta-analysis methods provides a new perspective on choosing a suitable method for a given application. Different meta-analysis methods present different strengths, so they should be considered jointly when selecting a method for a particular study. © 2015 John Wiley & Sons Ltd.
Petersen, Japke F; Stuiver, Martijn M; Timmermans, Adriana J; Chen, Amy; Zhang, Hongzhen; O'Neill, James P; Deady, Sandra; Vander Poorten, Vincent; Meulemans, Jeroen; Wennerberg, Johan; Skroder, Carl; Day, Andrew T; Koch, Wayne; van den Brekel, Michiel W M
2018-05-01
TNM classification inadequately estimates patient-specific overall survival (OS). We aimed to improve this by developing a risk-prediction model for patients with advanced larynx cancer. Cohort study. We developed a risk prediction model to estimate the 5-year OS rate based on a cohort of 3,442 patients with T3T4N0N+M0 larynx cancer. The model was internally validated using bootstrapping samples and externally validated on patient data from five external centers (n = 770). The main outcome was the performance of the model as tested by discrimination, calibration, and the ability to distinguish risk groups based on tertiles from the derivation dataset. The model performance was compared to a model based on T and N classification only. We included age, gender, T and N classification, and subsite as prognostic variables in the standard model. After external validation, the standard model had a significantly better fit than a model based on T and N classification alone (C statistic, 0.59 vs. 0.55, P < .001). The model was able to distinguish well among three risk groups based on tertiles of the risk score. Adding treatment modality to the model did not decrease the predictive power. As a post hoc analysis, we tested the added value of comorbidity as scored by the American Society of Anesthesiologists score in a subsample, which increased the C statistic to 0.68. A risk prediction model for patients with advanced larynx cancer, consisting of readily available clinical variables, gives more accurate estimates of the 5-year survival rate than a model based on T and N classification alone. 2c. Laryngoscope, 128:1140-1145, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Does bimodal stimulus presentation increase ERP components usable in BCIs?
NASA Astrophysics Data System (ADS)
Thurlings, Marieke E.; Brouwer, Anne-Marie; Van Erp, Jan B. F.; Blankertz, Benjamin; Werkhoven, Peter J.
2012-08-01
Event-related potential (ERP)-based brain-computer interfaces (BCIs) employ differences in brain responses to attended and ignored stimuli. Typically, visual stimuli are used. Tactile stimuli have recently been suggested as a gaze-independent alternative. Bimodal stimuli could evoke additional brain activity due to multisensory integration which may be of use in BCIs. We investigated the effect of visual-tactile stimulus presentation on the chain of ERP components, BCI performance (classification accuracies and bitrates) and participants’ task performance (counting of targets). Ten participants were instructed to navigate a visual display by attending (spatially) to targets in sequences of either visual, tactile or visual-tactile stimuli. We observe that attending to visual-tactile (compared to either visual or tactile) stimuli results in an enhanced early ERP component (N1). This bimodal N1 may enhance BCI performance, as suggested by a nonsignificant positive trend in offline classification accuracies. A late ERP component (P300) is reduced when attending to visual-tactile compared to visual stimuli, which is consistent with the nonsignificant negative trend of participants’ task performance. We discuss these findings in the light of affected spatial attention at high-level compared to low-level stimulus processing. Furthermore, we evaluate bimodal BCIs from a practical perspective and for future applications.
An automatic graph-based approach for artery/vein classification in retinal images.
Dashtbozorg, Behdad; Mendonça, Ana Maria; Campilho, Aurélio
2014-03-01
The classification of retinal vessels into artery/vein (A/V) is an important phase for automating the detection of vascular changes, and for the calculation of characteristic signs associated with several systemic diseases such as diabetes, hypertension, and other cardiovascular conditions. This paper presents an automatic approach for A/V classification based on the analysis of a graph extracted from the retinal vasculature. The proposed method classifies the entire vascular tree deciding on the type of each intersection point (graph nodes) and assigning one of two labels to each vessel segment (graph links). Final classification of a vessel segment as A/V is performed through the combination of the graph-based labeling results with a set of intensity features. The results of this proposed method are compared with manual labeling for three public databases. Accuracy values of 88.3%, 87.4%, and 89.8% are obtained for the images of the INSPIRE-AVR, DRIVE, and VICAVR databases, respectively. These results demonstrate that our method outperforms recent approaches for A/V classification.
A contour-based shape descriptor for biomedical image classification and retrieval
NASA Astrophysics Data System (ADS)
You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
Data Clustering and Evolving Fuzzy Decision Tree for Data Base Classification Problems
NASA Astrophysics Data System (ADS)
Chang, Pei-Chann; Fan, Chin-Yuan; Wang, Yen-Wen
Database classification suffers from two well-known difficulties, i.e., high dimensionality and non-stationary variations within large historical data. This paper presents a hybrid classification model that integrates a case-based reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic Algorithms (GA) to construct a decision-making system for data classification in various database applications. The model is based mainly on the idea that the historical database can be transformed into a smaller case base together with a group of fuzzy decision rules. As a result, the model can respond more accurately to the data being classified, using the inductions provided by these smaller case-based fuzzy decision trees. Hit rate is applied as a performance measure, and the effectiveness of the proposed model is demonstrated by experimental comparison with other approaches on different database classification applications. The average hit rate of the proposed model is the highest among the compared approaches.
Li, Cheng; Liu, Junjie; Wang, Sida; Chen, Yuanyuan; Yuan, Zhigang; Zeng, Jian; Li, Zhixian
2015-01-01
To retrospectively analyze and compare the ultrasonographic characteristics and BI-RADS-US classification between patients with BRCA1 mutation-associated breast cancer and those without BRCA1 gene mutation in Guangxi, China. The study was performed on 36 lesions from 34 patients with BRCA1 mutation-associated breast cancer. A total of 422 lesions from 422 breast cancer patients without BRCA1 mutations served as the control group. The ultrasonographic features and BI-RADS-US classifications of the two groups were reviewed and compared. A more complex inner echo was observed in BRCA1 mutation-associated breast cancer patients (χ² = 4.741, P = 0.029). The BI-RADS classification of BRCA1 mutation-associated breast cancer was lower (U = 6094.0, P = 0.022). BRCA1 mutation-associated breast cancer frequently displays a microlobulated margin and complex echo. It also shows more benign morphological characteristics, and its BI-RADS classification is prone to be underestimated.
Biometric Authentication for Gender Classification Techniques: A Review
NASA Astrophysics Data System (ADS)
Mathivanan, P.; Poornima, K.
2017-12-01
One of the challenging biometric authentication applications is gender identification and age classification, which captures gait from a distance and analyzes physical information about the subject, such as gender, race, and emotional state. Most gender identification techniques have focused only on the frontal pose of the subject, the image size, and the type of database used in the process. The study also classifies different feature extraction processes, such as Principal Component Analysis (PCA) and Local Directional Pattern (LDP), that are used to extract the authentication features of a person. This paper aims to analyze different gender classification techniques and thereby evaluate the strengths and weaknesses of existing gender identification algorithms, which in turn helps in developing a novel gender classification algorithm with lower computation cost and higher accuracy. In this paper, an overview and classification of different gender identification techniques are first presented, and they are then compared with other existing human identification systems in terms of their performance.
Development of neural network techniques for finger-vein pattern classification
NASA Astrophysics Data System (ADS)
Wu, Jian-Da; Liu, Chiung-Tsiung; Tsai, Yi-Jang; Liu, Jun-Ching; Chang, Ya-Wen
2010-02-01
A personal identification system using finger-vein patterns and neural network techniques is proposed in the present study. In the proposed system, the finger-vein patterns are captured by a device that transmits near-infrared light through the finger and records the patterns for signal analysis and classification. The biometric verification system consists of a combination of feature extraction using principal component analysis and pattern classification using both a back-propagation network and an adaptive neuro-fuzzy inference system. Finger-vein features are first extracted by the principal component analysis method to reduce the computational burden and remove noise residing in the discarded dimensions. The features are then used in pattern classification and identification. To verify the effectiveness of the proposed adaptive neuro-fuzzy inference system in pattern classification, the back-propagation network is compared with the proposed system. The experimental results indicated that the proposed system using the adaptive neuro-fuzzy inference system demonstrated better performance than the back-propagation network for personal identification using finger-vein patterns.
Zhang, Y N
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as a walking test, a handwriting test, and MRI diagnostics. In this paper, we propose a machine learning based PD telediagnosis method for smartphones. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still below doctor level. Here we demonstrate automatic classification of PD using time-frequency features, stacked autoencoders (SAE), and a K-nearest neighbor (KNN) classifier. The KNN classifier can produce promising classification results from the useful representations learned by the SAE. Empirical results show that the proposed method achieves better performance in all tested cases across classification tasks, demonstrating that machine learning is capable of classifying PD with a level of competence comparable to that of a doctor. We conclude that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on a browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed.
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as a walking test, a handwriting test, and MRI diagnostics. In this paper, we propose a machine learning based PD telediagnosis method for smartphones. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still below doctor level. Here we demonstrate automatic classification of PD using time-frequency features, stacked autoencoders (SAE), and a K-nearest neighbor (KNN) classifier. The KNN classifier can produce promising classification results from the useful representations learned by the SAE. Empirical results show that the proposed method achieves better performance in all tested cases across classification tasks, demonstrating that machine learning is capable of classifying PD with a level of competence comparable to that of a doctor. We conclude that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on a browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed. PMID:29075547
Computational approaches for the classification of seed storage proteins.
Radhika, V; Rao, V Sree Hari
2015-07-01
Seed storage proteins comprise a major part of the protein content of the seed and have an important role in seed quality. These storage proteins are important because they determine the total protein content and affect the nutritional quality and functional properties for food processing. Transgenic plants are being used to develop improved lines for incorporation into plant breeding programs, and the nutrient composition of seeds is a major target of molecular breeding programs. Hence, classification of these proteins is crucial for the development of superior varieties with improved nutritional quality. In this study, we have applied machine learning algorithms for the classification of seed storage proteins. We present an algorithm based on a nearest neighbor approach for the classification of seed storage proteins and compare its performance with the decision tree J48, a multilayer perceptron (MLP) neural network, and the support vector machine (SVM) implementation libSVM. The model based on our algorithm achieved higher classification accuracy than the other methods.
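A hedged sketch of the kind of comparison described above: a nearest-neighbour classifier against a decision tree (J48-like), an MLP network, and an SVM on the same feature matrix, scored by cross-validated accuracy. The protein feature vectors are random placeholders, not real sequence-derived descriptors.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(10)
X = rng.standard_normal((200, 40)) + np.repeat([0, 1, 2, 3], 50)[:, None] * 0.5
y = np.repeat([0, 1, 2, 3], 50)               # four storage-protein classes (placeholder)

models = {"nearest neighbour": KNeighborsClassifier(n_neighbors=5),
          "decision tree": DecisionTreeClassifier(random_state=0),
          "MLP": MLPClassifier(max_iter=2000, random_state=0),
          "SVM": SVC()}
for name, model in models.items():
    print(name, "%.3f" % cross_val_score(model, X, y, cv=5).mean())
```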
Tu, Li-ping; Chen, Jing-bo; Hu, Xiao-juan; Zhang, Zhi-feng
2016-01-01
Background and Goal. The application of digital image processing techniques and machine learning methods in tongue image classification in Traditional Chinese Medicine (TCM) has been widely studied nowadays. However, it is difficult for the outcomes to generalize because of lack of color reproducibility and image standardization. Our study aims at the exploration of tongue colors classification with a standardized tongue image acquisition process and color correction. Methods. Three traditional Chinese medical experts are chosen to identify the selected tongue pictures taken by the TDA-1 tongue imaging device in TIFF format through ICC profile correction. Then we compare the mean value of L * a * b * of different tongue colors and evaluate the effect of the tongue color classification by machine learning methods. Results. The L * a * b * values of the five tongue colors are statistically different. Random forest method has a better performance than SVM in classification. SMOTE algorithm can increase classification accuracy by solving the imbalance of the varied color samples. Conclusions. At the premise of standardized tongue acquisition and color reproduction, preliminary objectification of tongue color classification in Traditional Chinese Medicine (TCM) is feasible. PMID:28050555
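A hedged sketch of the classification step described above: mean L*a*b* colour features per tongue image, class imbalance handled with SMOTE (from the imbalanced-learn package, assumed to be installed), and a random forest classifier. The colour values and class proportions below are synthetic placeholders.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)
# Imbalanced placeholder samples of mean (L*, a*, b*) values for five tongue colours.
counts = [120, 60, 30, 20, 10]
X = np.vstack([rng.normal([60 + 3 * k, 15 + 2 * k, 10 + k], 2.0, (n, 3))
               for k, n in enumerate(counts)])
y = np.concatenate([np.full(n, k) for k, n in enumerate(counts)])

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)      # oversample minority colours
rf = RandomForestClassifier(n_estimators=200, random_state=0)
print("balanced-data CV accuracy:", cross_val_score(rf, X_res, y_res, cv=5).mean())
```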
Hierarchy-associated semantic-rule inference framework for classifying indoor scenes
NASA Astrophysics Data System (ADS)
Yu, Dan; Liu, Peng; Ye, Zhipeng; Tang, Xianglong; Zhao, Wei
2016-03-01
Typically, the initial task of classifying indoor scenes is challenging, because the spatial layout and decoration of a scene can vary considerably. Recent efforts at classifying object relationships commonly depend on the results of scene annotation and predefined rules, making classification inflexible. Furthermore, annotation results are easily affected by external factors. Inspired by human cognition, a scene-classification framework was proposed using the empirically based annotation (EBA) and a match-over rule-based (MRB) inference system. The semantic hierarchy of images is exploited by EBA to construct rules empirically for MRB classification. The problem of scene classification is divided into low-level annotation and high-level inference from a macro perspective. Low-level annotation involves detecting the semantic hierarchy and annotating the scene with a deformable-parts model and a bag-of-visual-words model. In high-level inference, hierarchical rules are extracted to train the decision tree for classification. The categories of testing samples are generated from the parts to the whole. Compared with traditional classification strategies, the proposed semantic hierarchy and corresponding rules reduce the effect of a variable background and improve the classification performance. The proposed framework was evaluated on a popular indoor scene dataset, and the experimental results demonstrate its effectiveness.
Gadermayr, M.; Liedlgruber, M.; Uhl, A.; Vécsei, A.
2013-01-01
Due to the optics used in endoscopes, a typical degradation observed in endoscopic images are barrel-type distortions. In this work we investigate the impact of methods used to correct such distortions in images on the classification accuracy in the context of automated celiac disease classification. For this purpose we compare various different distortion correction methods and apply them to endoscopic images, which are subsequently classified. Since the interpolation used in such methods is also assumed to have an influence on the resulting classification accuracies, we also investigate different interpolation methods and their impact on the classification performance. In order to be able to make solid statements about the benefit of distortion correction we use various different feature extraction methods used to obtain features for the classification. Our experiments show that it is not possible to make a clear statement about the usefulness of distortion correction methods in the context of an automated diagnosis of celiac disease. This is mainly due to the fact that an eventual benefit of distortion correction highly depends on the feature extraction method used for the classification. PMID:23981585
Adaptive sleep-wake discrimination for wearable devices.
Karlen, Walter; Floreano, Dario
2011-04-01
Sleep/wake classification systems that rely on physiological signals suffer from intersubject differences that make accurate classification with a single, subject-independent model difficult. To overcome the limitations of intersubject variability, we suggest a novel online adaptation technique that updates the sleep/wake classifier in real time. The objective of the present study was to evaluate the performance of a newly developed adaptive classification algorithm that was embedded on a wearable sleep/wake classification system called SleePic. The algorithm processed ECG and respiratory effort signals for the classification task and applied behavioral measurements (obtained from accelerometer and press-button data) for the automatic adaptation task. When trained as a subject-independent classifier algorithm, the SleePic device was only able to correctly classify 74.94 ± 6.76% of the human-rated sleep/wake data. By using the suggested automatic adaptation method, the mean classification accuracy could be significantly improved to 92.98 ± 3.19%. A subject-independent classifier based on activity data only showed a comparable accuracy of 90.44 ± 3.57%. We demonstrated that subject-independent models used for online sleep-wake classification can successfully be adapted to previously unseen subjects without the intervention of human experts or off-line calibration.
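To make the adaptation idea concrete, the following sketch (not the SleePic algorithm itself) shows how a subject-independent sleep/wake classifier can be updated online with scikit-learn's partial_fit interface; the feature vectors standing in for ECG and respiratory-effort features, the batch size, and all numbers are invented for illustration.

```python
# Minimal sketch: adapting a subject-independent sleep/wake classifier online.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
n_features = 10

# "Subject-independent" training data pooled from other subjects (synthetic).
X_pool = rng.normal(size=(2000, n_features))
y_pool = (X_pool[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)  # 1 = wake, 0 = sleep

clf = SGDClassifier(random_state=0)
clf.partial_fit(X_pool, y_pool, classes=np.array([0, 1]))

# New subject: stream 30-s epochs and adapt whenever a behavioural label
# (e.g. accelerometer activity or a press-button response) is available.
X_new = rng.normal(loc=0.3, size=(500, n_features))
y_new = (X_new[:, 0] > 0.3).astype(int)
for i in range(0, len(X_new), 50):          # data arrive in small batches
    batch_X, batch_y = X_new[i:i + 50], y_new[i:i + 50]
    pred = clf.predict(batch_X)              # classify first ...
    clf.partial_fit(batch_X, batch_y)        # ... then adapt with the new labels

print("accuracy on last batch:", (pred == batch_y).mean())
```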
Multi-Temporal Land Cover Classification with Long Short-Term Memory Neural Networks
NASA Astrophysics Data System (ADS)
Rußwurm, M.; Körner, M.
2017-05-01
Land cover classification (LCC) is a central and broad field of research in earth observation and has already put forth a variety of classification techniques. Many approaches are based on classification techniques that consider observations at single points in time. However, some land cover classes, such as crops, change their spectral characteristics due to environmental influences and thus cannot be monitored effectively with classical mono-temporal approaches. Nevertheless, these temporal observations should be utilized to benefit the classification process. Building on extensive research into modeling temporal dynamics with spectro-temporal profiles based on vegetation indices, we propose a deep learning approach that exploits these temporal characteristics for classification tasks. In this work, we show how long short-term memory (LSTM) neural networks can be employed for crop identification with SENTINEL 2A observations from large study areas and label information provided by local authorities. We compare these temporal neural network models, i.e., LSTM and recurrent neural network (RNN), with a classical non-temporal convolutional neural network (CNN) model and an additional support vector machine (SVM) baseline. With our rather straightforward LSTM variant, we exceeded state-of-the-art classification performance, opening promising potential for further research.
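As a rough illustration of the modeling idea (not the authors' network or data), the sketch below feeds per-pixel time series of spectral bands to a small LSTM classifier in tf.keras; the numbers of time steps, bands, and crop classes are assumptions.

```python
# Illustrative sketch only: an LSTM classifying per-pixel time series of bands.
import numpy as np
import tensorflow as tf

T, B, n_classes = 26, 10, 8          # assumed time steps, bands, crop classes
X = np.random.rand(1000, T, B).astype("float32")   # synthetic reflectance series
y = np.random.randint(0, n_classes, size=1000)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(T, B)),
    tf.keras.layers.LSTM(64),                     # temporal feature extraction
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=64, verbose=0)
```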
Qi, Zhen; Tu, Li-Ping; Chen, Jing-Bo; Hu, Xiao-Juan; Xu, Jia-Tuo; Zhang, Zhi-Feng
2016-01-01
Background and Goal. The application of digital image processing techniques and machine learning methods in tongue image classification in Traditional Chinese Medicine (TCM) has been widely studied nowadays. However, it is difficult for the outcomes to generalize because of lack of color reproducibility and image standardization. Our study aims at the exploration of tongue colors classification with a standardized tongue image acquisition process and color correction. Methods. Three traditional Chinese medical experts are chosen to identify the selected tongue pictures taken by the TDA-1 tongue imaging device in TIFF format through ICC profile correction. Then we compare the mean value of L*a*b* of different tongue colors and evaluate the effect of the tongue color classification by machine learning methods. Results. The L*a*b* values of the five tongue colors are statistically different. Random forest method has a better performance than SVM in classification. SMOTE algorithm can increase classification accuracy by solving the imbalance of the varied color samples. Conclusions. At the premise of standardized tongue acquisition and color reproduction, preliminary objectification of tongue color classification in Traditional Chinese Medicine (TCM) is feasible.
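A minimal sketch of the classification step described above, assuming each tongue sample is reduced to mean L*a*b* values: imbalanced-learn's SMOTE rebalances five (here synthetic) color classes before a random forest is trained.

```python
# Minimal sketch on synthetic data: SMOTE oversampling of imbalanced L*a*b*
# colour samples followed by a random forest classifier.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(1)
# Five hypothetical tongue-colour classes with very different sample counts.
counts = {0: 300, 1: 120, 2: 60, 3: 30, 4: 15}
X = np.vstack([rng.normal(loc=c * 5.0, scale=3.0, size=(n, 3)) for c, n in counts.items()])
y = np.concatenate([np.full(n, c) for c, n in counts.items()])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)   # balance the classes

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_res, y_res)
print(classification_report(y_te, clf.predict(X_te)))
```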
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of χ² (chi-squared) feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128×192-pixel image in 0.15 seconds, 60 times faster than classification with SIFT. The χ² mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors offer accuracy improvements only on colorful datasets.
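The χ² feature mapping mentioned above can be approximated with scikit-learn's additive chi-squared sampler applied to bag-of-visual-words histograms; the sketch below only illustrates that mapping on synthetic histograms and does not reproduce the CSL learner or the embedded setting.

```python
# Sketch of chi-squared feature mapping on bag-of-visual-words histograms.
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_words = 256                                   # visual vocabulary size (assumed)
X = rng.dirichlet(np.ones(n_words), size=800)   # normalised BoVW histograms
y = X[:, :4].argmax(axis=1)                     # four synthetic object classes

pipe = make_pipeline(AdditiveChi2Sampler(sample_steps=2), LinearSVC())
print("CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```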
Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan
2014-01-01
In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has enabled the concurrent monitoring of thousands of gene expressions on a single chip, which stimulates progress in cancer classification. In this paper, we propose a hybrid approach for microarray data classification based on k-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM) classifiers. Feature selection prior to classification plays a vital role, and a feature selection technique which combines the discrete wavelet transform (DWT) and a moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers support vector machine, k-nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than the conventional classifiers. The approach can serve as an automated system for the classification of cancer that doctors can apply to real cases, which would be a boon to the medical community. It further reduces the misclassification of cancers, which is unacceptable in cancer detection.
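A hedged sketch of the overall pipeline shape, on synthetic expression profiles: DWT approximation coefficients serve as reduced features (the moving-window step is omitted) and a soft-voting ensemble combines KNN, naive Bayes, and SVM.

```python
# Sketch: DWT-based feature reduction followed by a KNN/NB/SVM voting ensemble.
import numpy as np
import pywt
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(120, 1024))          # e.g. 120 samples x 1024 "genes"
y = rng.integers(0, 2, size=120)
X_raw[y == 1, :64] += 1.0                     # inject a weak class signal

def dwt_features(x, wavelet="db4", level=4):
    """Keep only the level-4 approximation coefficients as compact features."""
    return pywt.wavedec(x, wavelet, level=level)[0]

X = np.array([dwt_features(row) for row in X_raw])

ensemble = VotingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("nb", GaussianNB()),
                ("svm", SVC(probability=True))],
    voting="soft")
print("CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```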
Contribution of non-negative matrix factorization to the classification of remote sensing images
NASA Astrophysics Data System (ADS)
Karoui, M. S.; Deville, Y.; Hosseini, S.; Ouamri, A.; Ducrot, D.
2008-10-01
Remote sensing has become an unavoidable tool for better managing our environment, generally by producing maps of land cover using classification techniques. The classification process requires some pre-processing, especially for data size reduction. The most usual technique is Principal Component Analysis. Another approach consists in regarding each pixel of the multispectral image as a mixture of pure elements contained in the observed area. Using Blind Source Separation (BSS) methods, one can hope to unmix each pixel and to perform the recognition of the classes constituting the observed scene. Our contribution consists in using Non-negative Matrix Factorization (NMF) combined with sparse coding as a solution to BSS, in order to generate new images (which are at least partly separated) from HRV SPOT images of the Oran area, Algeria. These images are then used as inputs of a supervised classifier integrating textural information. The results of classifications of these "separated" images show a clear improvement (correct pixel classification rate improved by more than 20%) compared to classification of the initial (i.e. non-separated) images. These results show the contribution of NMF as an attractive pre-processing step for the classification of multispectral remote sensing imagery.
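The following simplified sketch, on synthetic mixed spectra rather than HRV SPOT data, shows NMF used as a pre-processing step before a supervised classifier; the sparse-coding component of the authors' method is not reproduced.

```python
# Simplified sketch: NMF pre-processing before supervised pixel classification.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_pixels, n_bands, n_sources = 5000, 4, 3
abundances = rng.dirichlet(np.ones(n_sources), size=n_pixels)   # mixing proportions
endmembers = rng.uniform(0.1, 1.0, size=(n_sources, n_bands))   # pure spectra
X = abundances @ endmembers + 0.01 * rng.random((n_pixels, n_bands))
y = abundances.argmax(axis=1)                                   # dominant class per pixel

# Factorise X ~ W H; the rows of W act as "separated" per-pixel abundances.
W = NMF(n_components=n_sources, init="nndsvda", max_iter=500,
        random_state=0).fit_transform(X)

clf = RandomForestClassifier(random_state=0)
print("raw bands :", cross_val_score(clf, X, y, cv=3).mean())
print("NMF scores:", cross_val_score(clf, W, y, cv=3).mean())
```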
Classification of interstitial lung disease patterns with topological texture features
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel
2010-03-01
Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative of the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images was acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue was identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
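As an illustration of the standard GLCM part of the comparison (the Minkowski functionals are not reproduced), the sketch below computes co-occurrence texture features with scikit-image on synthetic patches and classifies them with k-NN.

```python
# Sketch of GLCM texture features on synthetic patches (not HRCT data).
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def glcm_features(patch):
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=64, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# Two synthetic texture classes: low-variance vs. high-variance noisy patches.
patches, labels = [], []
for label, scale in [(0, 4), (1, 20)]:
    for _ in range(60):
        patch = rng.normal(32, scale, size=(32, 32)).clip(0, 63).astype(np.uint8)
        patches.append(glcm_features(patch))
        labels.append(label)

X, y = np.array(patches), np.array(labels)
print("k-NN CV accuracy:", cross_val_score(KNeighborsClassifier(), X, y, cv=5).mean())
```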
NASA Astrophysics Data System (ADS)
Diesing, Markus; Green, Sophie L.; Stephens, David; Lark, R. Murray; Stewart, Heather A.; Dove, Dayton
2014-08-01
Marine spatial planning and conservation need to be underpinned by sufficiently detailed and accurate seabed substrate and habitat maps. Although multibeam echosounders enable us to map the seabed with high resolution and spatial accuracy, there is still a lack of fit-for-purpose seabed maps. This is due to the high costs involved in carrying out systematic seabed mapping programmes and the fact that the development of validated, repeatable, quantitative and objective methods of swath acoustic data interpretation is still in its infancy. We compared a wide spectrum of approaches including manual interpretation, geostatistics, object-based image analysis and machine learning to gain further insights into the accuracy and comparability of acoustic data interpretation approaches based on multibeam echosounder data (bathymetry, backscatter and derivatives) and seabed samples, with the aim of deriving seabed substrate maps. Sample data were split into a training and a validation data set to allow us to carry out an accuracy assessment. Overall thematic classification accuracy ranged from 67% to 76% and Cohen's kappa varied between 0.34 and 0.52. However, these differences were not statistically significant at the 5% level. Misclassifications were mainly associated with uncommon classes, which were rarely sampled. Map outputs were between 68% and 87% identical. To improve classification accuracy in seabed mapping, we suggest that more studies on the factors affecting classification performance, as well as comparative studies testing the performance of different approaches, need to be carried out with a view to developing guidelines for selecting an appropriate method for a given dataset. In the meantime, classification accuracy might be improved by combining different techniques into hybrid approaches and multi-method ensembles.
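The accuracy metrics quoted above (overall thematic accuracy and Cohen's kappa) can be computed as in this small sketch; the labels and error rate are synthetic stand-ins for a validation set of seabed samples.

```python
# Sketch of the map-accuracy metrics: overall accuracy and Cohen's kappa.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

rng = np.random.default_rng(0)
# Four hypothetical substrate classes encoded as 0..3.
y_true = rng.integers(0, 4, size=300)
y_pred = np.where(rng.random(300) < 0.75, y_true, rng.integers(0, 4, size=300))

print("overall accuracy:", accuracy_score(y_true, y_pred))
print("Cohen's kappa   :", cohen_kappa_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
```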
Le, Hoa V; Poole, Charles; Brookhart, M Alan; Schoenbach, Victor J; Beach, Kathleen J; Layton, J Bradley; Stürmer, Til
2013-11-19
The High-Dimensional Propensity Score (hd-PS) algorithm can select and adjust for baseline confounders of treatment-outcome associations in pharmacoepidemiologic studies that use healthcare claims data. How hd-PS performance is affected by aggregating medications or medical diagnoses has not been assessed. We evaluated the effects of aggregating medications or diagnoses on hd-PS performance in an empirical example using resampled cohorts with small sample size, rare outcome incidence, or low exposure prevalence. In a cohort study comparing the risk of upper gastrointestinal complications in celecoxib or traditional NSAIDs (diclofenac, ibuprofen) initiators with rheumatoid arthritis and osteoarthritis, we (1) aggregated medications and International Classification of Diseases-9 (ICD-9) diagnoses into hierarchies of the Anatomical Therapeutic Chemical classification (ATC) and the Clinical Classification Software (CCS), respectively, and (2) sampled the full cohort using techniques validated by simulations to create 9,600 samples to compare 16 aggregation scenarios across 50% and 20% samples with varying outcome incidence and exposure prevalence. We applied hd-PS to estimate relative risks (RR) using 5 dimensions, predefined confounders, ≤ 500 hd-PS covariates, and propensity score deciles. For each scenario, we calculated: (1) the geometric mean RR; (2) the difference between the scenario mean ln(RR) and the ln(RR) from published randomized controlled trials (RCT); and (3) the proportional difference in the degree of estimated confounding between that scenario and the base scenario (no aggregation). Compared with the base scenario, aggregations of medications into ATC level 4 alone or in combination with aggregation of diagnoses into CCS level 1 improved the hd-PS confounding adjustment in most scenarios, reducing residual confounding compared with the RCT findings by up to 19%. Aggregation of codes using hierarchical coding systems may improve the performance of the hd-PS to control for confounders. The balance of advantages and disadvantages of aggregation is likely to vary across research settings.
Chambon, Stanislas; Galtier, Mathieu N; Arnal, Pierrick J; Wainrib, Gilles; Gramfort, Alexandre
2018-04-01
Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns a sleep stage to each 30 s of signal, based on the visual inspection of signals such as electroencephalograms (EEGs), electrooculograms (EOGs), electrocardiograms, and electromyograms (EMGs). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or extracting handcrafted features, that exploits all multivariate and multimodal polysomnography (PSG) signals (EEG, EMG, and EOG), and that can exploit the temporal context of each 30-s window of data. For each modality, the first layer learns linear spatial filters that exploit the array of sensors to increase the signal-to-noise ratio, and the last layer feeds the learnt representation to a softmax classifier. Our model is compared to alternative automatic approaches based on convolutional networks or decision trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields state-of-the-art performance. Our study reveals a number of insights on the spatiotemporal distribution of the signal of interest: a good tradeoff for optimal classification performance measured with balanced accuracy is to use 6 EEG with 2 EOG (left and right) and 3 EMG chin channels. Also, exploiting 1 min of data before and after each data segment offers the strongest improvement when a limited number of channels is available. Like sleep experts, our system exploits the multivariate and multimodal nature of PSG signals in order to deliver state-of-the-art classification performance with a small computational cost.
Comparing Pixel- and Object-Based Approaches in Effectively Classifying Wetland-Dominated Landscapes
Wetland ecosystems straddle both terrestrial and aquatic habitats, performing many ecological functions directly and indirectly benefitting humans. However, global wetland losses are substantial. Satellite remote sensing and classification informs wise wetland management and moni...
Classification of large-sized hyperspectral imagery using fast machine learning algorithms
NASA Astrophysics Data System (ADS)
Xia, Junshi; Yokoya, Naoto; Iwasaki, Akira
2017-07-01
We present a framework of fast machine learning algorithms for the classification of large-sized hyperspectral images, from a theoretical to a practical viewpoint. In particular, we assess the performance of random forest (RF), rotation forest (RoF), and extreme learning machine (ELM), as well as ensembles of RF and ELM. These classifiers are applied to two large-sized hyperspectral images and compared to support vector machines. To provide a quantitative analysis, we focus on comparing these methods when working with high input dimensions and a limited/sufficient training set. Moreover, other important issues such as the computational cost and robustness against noise are also discussed.
ERIC Educational Resources Information Center
Clanchy, Kelly M.; Tweedy, Sean M.; Boyd, Roslyn
2011-01-01
Aim: This systematic review compares the validity, reliability, and clinical use of habitual physical activity (HPA) performance measures in adolescents with cerebral palsy (CP). Method: Measures of HPA across Gross Motor Function Classification System (GMFCS) levels I-V for adolescents (10-18y) with CP were included if at least 60% of items…
Thanh Noi, Phan; Kappas, Martin
2017-01-01
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909
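A sketch of this experimental design on synthetic features: RF, kNN and SVM are trained with increasing numbers of pixels per class and their overall accuracy on a fixed test set is recorded; all sizes and data are illustrative, not the Sentinel-2 study.

```python
# Sketch: compare RF, kNN and SVM across training sample sizes per class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=12000, n_features=10, n_informative=6,
                           n_classes=6, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

classifiers = {"RF": RandomForestClassifier(random_state=0),
               "kNN": KNeighborsClassifier(),
               "SVM": SVC(kernel="rbf", C=10, gamma="scale")}

for per_class in (50, 250, 750):
    # Take the first `per_class` pooled pixels of each class as training data.
    idx = np.hstack([np.flatnonzero(y_pool == c)[:per_class] for c in range(6)])
    for name, clf in classifiers.items():
        clf.fit(X_pool[idx], y_pool[idx])
        oa = accuracy_score(y_test, clf.predict(X_test))
        print(f"{per_class:>4} px/class  {name:<4} OA = {oa:.3f}")
```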
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features such as multi-resolution decompositions into detailed and approximate coefficients, as well as relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high-density (128-channel) EEG dataset was used to validate the proposed method on two classification problems: (1) EEG signals recorded during complex cognitive tasks using Raven's Advanced Progressive Matrices (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as k-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via the SVM classifier for the approximation coefficients (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for the detailed coefficients (D5), derived from the 3.90–7.81 Hz sub-band, were 98.57% and 98.39% for SVM and KNN, respectively. Accuracy rates for the MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for the A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied to a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performance using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
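The relative wavelet energy feature can be sketched as below for a single synthetic 1-D signal (not the 128-channel EEG pipeline): the energy of each decomposition sub-band is divided by the total energy over all sub-bands.

```python
# Sketch of relative wavelet energy on a synthetic 1-D "EEG" signal.
import numpy as np
import pywt

rng = np.random.default_rng(0)
fs = 256                                    # assumed sampling rate (Hz)
t = np.arange(0, 4, 1 / fs)                 # 4 s of signal
signal = np.sin(2 * np.pi * 2 * t) + 0.5 * rng.normal(size=t.size)

coeffs = pywt.wavedec(signal, "db4", level=5)        # [A5, D5, D4, D3, D2, D1]
energies = np.array([np.sum(c ** 2) for c in coeffs])
relative_energy = energies / energies.sum()          # one feature per sub-band

for name, e in zip(["A5", "D5", "D4", "D3", "D2", "D1"], relative_energy):
    print(f"{name}: {e:.3f}")
```

These per-band relative energies would then be normalized and passed to a classifier such as SVM or KNN, as described above.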
Classifications of Acute Scaphoid Fractures: A Systematic Literature Review.
Ten Berg, Paul W; Drijkoningen, Tessa; Strackee, Simon D; Buijze, Geert A
2016-05-01
Background In the absence of consensus, surgeon preference determines how acute scaphoid fractures are classified. There is a great variety of classification systems with considerable controversies. Purposes The purpose of this study was to provide an overview of the different classification systems, clarifying their subgroups and analyzing their popularity by comparing citation indexes. The intention was to improve data comparison between studies using heterogeneous fracture descriptions. Methods We performed a systematic review of the literature based on a search of medical literature from 1950 to 2015, and a manual search using the reference lists in relevant book chapters. Only original descriptions of classifications of acute scaphoid fractures in adults were included. Popularity was based on the citation index as reported in the databases of Web of Science (WoS) and Google Scholar. Articles that were cited <10 times in WoS were excluded. Results Our literature search resulted in 308 potentially eligible descriptive reports, of which 12 met the inclusion criteria. We distinguished 13 different (sub)classification systems based on (1) fracture location, (2) fracture plane orientation, and (3) fracture stability/displacement. Based on citation numbers, the Herbert classification was most popular, followed by the Russe and Mayo classifications. All classification systems were based on plain radiography. Conclusions Most classification systems were based on fracture location, displacement, or stability. Based on the controversy and limited reliability of current classification systems, suggested research areas for an updated classification include three-dimensional fracture pattern etiology and fracture fragment mobility assessed by dynamic imaging.
Shubham, Divya; Kawthalkar, Anjali S
2018-05-01
To assess the feasibility of the PALM-COEIN system for the classification of abnormal uterine bleeding (AUB) in low-resource settings and to suggest modifications. A prospective study was conducted among women with AUB who were admitted to the gynecology ward of a tertiary care hospital and research center in central India between November 2014 and October 2016. All patients were managed as per department protocols. The causes of AUB were classified before treatment using the PALM-COEIN system (classification I) and on the basis of the histopathology reports of the hysterectomy specimens (classification II); the results were compared using classification II as the gold standard. The study included 200 women with AUB; hysterectomy was performed in 174 women. Preoperative classification of AUB per the PALM-COEIN system was correct in 130 (65.0%) women. Adenomyosis (evaluated by transvaginal ultrasonography) and endometrial hyperplasia (evaluated by endometrial curettage) were underdiagnosed. The PALM-COEIN classification system helps in deciding the best treatment modality for women with AUB on a case-by-case basis. The incorporation of suggested modifications will further strengthen its utility as a pretreatment classification system in low-resource settings.
Deep learning for tumor classification in imaging mass spectrometry.
Behrmann, Jens; Etmann, Christian; Boskamp, Tobias; Casadonte, Rita; Kriegsmann, Jörg; Maaß, Peter
2018-04-01
Tumor classification using imaging mass spectrometry (IMS) data has a high potential for future applications in pathology. Due to the complexity and size of the data, automated feature extraction and classification steps are required to fully process the data. Since mass spectra exhibit certain structural similarities to image data, deep learning may offer a promising strategy for classification of IMS data as it has been successfully applied to image classification. Methodologically, we propose an adapted architecture based on deep convolutional networks to handle the characteristics of mass spectrometry data, as well as a strategy to interpret the learned model in the spectral domain based on a sensitivity analysis. The proposed methods are evaluated on two algorithmically challenging tumor classification tasks and compared to a baseline approach. Competitiveness of the proposed methods is shown on both tasks by studying the performance via cross-validation. Moreover, the learned models are analyzed by the proposed sensitivity analysis revealing biologically plausible effects as well as confounding factors of the considered tasks. Thus, this study may serve as a starting point for further development of deep learning approaches in IMS classification tasks. Availability and implementation: https://gitlab.informatik.uni-bremen.de/digipath/Deep_Learning_for_Tumor_Classification_in_IMS. Contact: jbehrmann@uni-bremen.de or christianetmann@uni-bremen.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Weight-elimination neural networks applied to coronary surgery mortality prediction.
Ennett, Colleen M; Frize, Monique
2003-06-01
The objective was to assess the effectiveness of the weight-elimination cost function in improving the classification performance of artificial neural networks (ANNs) and to observe how changing the a priori distribution of the training set affects network performance. Backpropagation feedforward ANNs with and without weight-elimination estimated mortality for coronary artery surgery patients. The ANNs were trained and tested on cases with 32 input variables describing the patient's medical history; the output variable was in-hospital mortality (mortality rates: training 3.7%, test 3.8%). Artificial training sets with mortality rates of 20%, 50%, and 80% were created to observe the impact of training with a higher-than-normal prevalence. When the results were averaged, weight-elimination networks achieved higher sensitivity rates than those without weight-elimination. Networks trained on higher-than-normal prevalence achieved higher sensitivity rates at the cost of lower specificity and correct classification. The weight-elimination cost function can improve classification performance when the network is trained with a higher-than-normal prevalence. A network trained with a moderately high artificial mortality rate (20%) can improve the sensitivity of the model without significantly affecting other aspects of its performance. The ANN mortality model achieved performance comparable to the additive and statistical models for coronary surgery mortality estimation in the literature.
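The weight-elimination penalty referred to above is commonly written as lambda * sum_i (w_i^2 / w0^2) / (1 + w_i^2 / w0^2); the sketch below implements it as a custom Keras regularizer in a toy two-layer network with 32 inputs, with all hyperparameter values chosen purely for illustration rather than taken from the study.

```python
# Sketch of a weight-elimination penalty as a custom Keras regularizer.
import tensorflow as tf

class WeightElimination(tf.keras.regularizers.Regularizer):
    def __init__(self, lam=1e-3, w0=1.0):
        self.lam, self.w0 = lam, w0

    def __call__(self, weights):
        # lam * sum (w^2/w0^2) / (1 + w^2/w0^2): small weights are pushed to zero,
        # large weights incur a roughly constant penalty.
        scaled = tf.square(weights / self.w0)
        return self.lam * tf.reduce_sum(scaled / (1.0 + scaled))

    def get_config(self):
        return {"lam": self.lam, "w0": self.w0}

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),                       # 32 clinical inputs
    tf.keras.layers.Dense(16, activation="sigmoid",
                          kernel_regularizer=WeightElimination(1e-3, 1.0)),
    tf.keras.layers.Dense(1, activation="sigmoid"),            # in-hospital mortality
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
# model.fit(...) would follow with the clinical training data.
```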
3D multi-view convolutional neural networks for lung nodule classification
Kang, Guixia; Hou, Beibei; Zhang, Ningbo
2017-01-01
The 3D convolutional neural network (CNN) is able to make full use of the spatial 3D context information of lung nodules, and the multi-view strategy has been shown to be useful for improving the performance of 2D CNN in classifying lung nodules. In this paper, we explore the classification of lung nodules using the 3D multi-view convolutional neural networks (MV-CNN) with both chain architecture and directed acyclic graph architecture, including 3D Inception and 3D Inception-ResNet. All networks employ the multi-view-one-network strategy. We conduct a binary classification (benign and malignant) and a ternary classification (benign, primary malignant and metastatic malignant) on Computed Tomography (CT) images from Lung Image Database Consortium and Image Database Resource Initiative database (LIDC-IDRI). All results are obtained via 10-fold cross validation. As regards the MV-CNN with chain architecture, results show that the performance of 3D MV-CNN surpasses that of 2D MV-CNN by a significant margin. Finally, a 3D Inception network achieved an error rate of 4.59% for the binary classification and 7.70% for the ternary classification, both of which represent superior results for the corresponding task. We compare the multi-view-one-network strategy with the one-view-one-network strategy. The results reveal that the multi-view-one-network strategy can achieve a lower error rate than the one-view-one-network strategy. PMID:29145492
Improving Generalization Based on l1-Norm Regularization for EEG-Based Motor Imagery Classification
Zhao, Yuwei; Han, Jiuqi; Chen, Yushu; Sun, Hongji; Chen, Jiayun; Ke, Ang; Han, Yao; Zhang, Peng; Zhang, Yi; Zhou, Jin; Wang, Changyong
2018-01-01
Multichannel electroencephalography (EEG) is widely used in typical brain-computer interface (BCI) systems. In general, a large number of parameters are needed in an EEG classification algorithm because of the redundant features involved in EEG signals. However, the generalization of an EEG method is often adversely affected by its model complexity, which is largely determined by the number of free parameters and can lead to heavy overfitting. To decrease the complexity and improve the generalization of the EEG method, we present a novel l1-norm-based approach that combines the decision values obtained from each EEG channel directly. By extracting the information from different channels on independent frequency bands (FB) with l1-norm regularization, the proposed method fits the training data with far fewer parameters than common spatial pattern (CSP) methods, in order to reduce overfitting. Moreover, an effective and efficient solution to minimize the optimization objective is proposed. The experimental results on dataset IVa of BCI competition III and dataset I of BCI competition IV show that the proposed method achieves high classification accuracy and increases generalization performance for the classification of MI EEG. As the training set ratio decreases from 80 to 20%, the average classification accuracy on the two datasets changes from 85.86 and 86.13% to 84.81 and 76.59%, respectively. The classification performance and generalization of the proposed method contribute to the practical application of MI-based BCI systems. PMID:29867307
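A simplified illustration of the combination idea (not the authors' exact algorithm): per-channel decision values from single-channel classifiers are combined by an l1-penalised logistic regression so that uninformative channels receive zero weight; the band-power features are synthetic.

```python
# Simplified sketch: l1-penalised combination of per-channel decision values.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, n_channels = 200, 22
X = rng.normal(size=(n_trials, n_channels))        # one band-power value per channel
y = rng.integers(0, 2, size=n_trials)
X[y == 1, :4] += 0.8                               # only 4 channels carry MI information

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def channel_scores(X_fit, y_fit, X_eval):
    """Decision values from one LDA classifier per channel, fitted on X_fit."""
    scores = []
    for ch in range(n_channels):
        lda = LinearDiscriminantAnalysis().fit(X_fit[:, [ch]], y_fit)
        scores.append(lda.decision_function(X_eval[:, [ch]]))
    return np.column_stack(scores)

combiner = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
combiner.fit(channel_scores(X_tr, y_tr, X_tr), y_tr)
print("test accuracy :", combiner.score(channel_scores(X_tr, y_tr, X_te), y_te))
print("channels kept :", int(np.sum(combiner.coef_ != 0)))
```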
Comparative analysis of image classification methods for automatic diagnosis of ophthalmic images
NASA Astrophysics Data System (ADS)
Wang, Liming; Zhang, Kai; Liu, Xiyang; Long, Erping; Jiang, Jiewei; An, Yingying; Zhang, Jia; Liu, Zhenzhen; Lin, Zhuoling; Li, Xiaoyan; Chen, Jingjing; Cao, Qianzhong; Li, Jing; Wu, Xiaohang; Wang, Dongni; Li, Wangting; Lin, Haotian
2017-01-01
There are many image classification methods, but it remains unclear which are most helpful for analyzing and intelligently identifying ophthalmic images. We select representative slit-lamp images, which show the complexity of ocular images, as research material to compare image classification algorithms for diagnosing ophthalmic diseases. To facilitate this study, several feature extraction algorithms and classifiers are combined to automatically diagnose pediatric cataract on the same dataset, and their performances are then compared using multiple criteria. This comparative study reveals the general characteristics of the existing methods for automatic identification of ophthalmic images and provides new insights into the strengths and shortcomings of these methods. The most relevant methods (local binary pattern + SVM, wavelet transformation + SVM) achieve an average accuracy of 87% and can be adopted in specific situations to aid doctors in preliminary disease screening. Furthermore, some methods requiring fewer computational resources and less time could be applied in remote places or on mobile devices to assist individuals in understanding the condition of their body. In addition, this work should help to accelerate the development of innovative approaches and the application of these methods to assist doctors in diagnosing ophthalmic disease.
Kirchner, Elsa A.; Kim, Su Kyoung
2018-01-01
Event-related potentials (ERPs) are often used in brain-computer interfaces (BCIs) for communication or system control for enhancing or regaining control for motor-disabled persons. Especially results from single-trial EEG classification approaches for BCIs support correlations between single-trial ERP detection performance and ERP expression. Hence, BCIs can be considered as a paradigm shift contributing to new methods with strong influence on both neuroscience and clinical applications. Here, we investigate the relevance of the choice of training data and classifier transfer for the interpretability of results from single-trial ERP detection. In our experiments, subjects performed a visual-motor oddball task with motor-task relevant infrequent (targets), motor-task irrelevant infrequent (deviants), and motor-task irrelevant frequent (standards) stimuli. Under dual-task condition, a secondary senso-motor task was performed, compared to the simple-task condition. For evaluation, average ERP analysis and single-trial detection analysis with different numbers of electrodes were performed. Further, classifier transfer was investigated between simple and dual task. Parietal positive ERPs evoked by target stimuli (but not by deviants) were expressed stronger under dual-task condition, which is discussed as an increase of task emphasis and brain processes involved in task coordination and change of task set. Highest classification performance was found for targets irrespective whether all 62, 6 or 2 parietal electrodes were used. Further, higher detection performance of targets compared to standards was achieved under dual-task compared to simple-task condition in case of training on data from 2 parietal electrodes corresponding to results of ERP average analysis. Classifier transfer between tasks improves classification performance in case that training took place on more varying examples (from dual task). In summary, we showed that P300 and overlaying parietal positive ERPs can successfully be detected while subjects are performing additional ongoing motor activity. This supports single-trial detection of ERPs evoked by target events to, e.g., infer a patient's attentional state during therapeutic intervention. PMID:29636660
Surkis, Alisa; Hogle, Janice A; DiazGranados, Deborah; Hunt, Joe D; Mazmanian, Paul E; Connors, Emily; Westaby, Kate; Whipple, Elizabeth C; Adamus, Trisha; Mueller, Meridith; Aphinyanaphongs, Yindalon
2016-08-05
Translational research is a key area of focus of the National Institutes of Health (NIH), as demonstrated by the substantial investment in the Clinical and Translational Science Award (CTSA) program. The goal of the CTSA program is to accelerate the translation of discoveries from the bench to the bedside and into communities. Different classification systems have been used to capture the spectrum of basic to clinical to population health research, with substantial differences in the number of categories and their definitions. Evaluation of the effectiveness of the CTSA program and of translational research in general is hampered by the lack of rigor in these definitions and their application. This study adds rigor to the classification process by creating a checklist to evaluate publications across the translational spectrum and operationalizes these classifications by building machine learning-based text classifiers to categorize these publications. Based on collaboratively developed definitions, we created a detailed checklist for categories along the translational spectrum from T0 to T4. We applied the checklist to CTSA-linked publications to construct a set of coded publications for use in training machine learning-based text classifiers to classify publications within these categories. The training sets combined T1/T2 and T3/T4 categories due to low frequency of these publication types compared to the frequency of T0 publications. We then compared classifier performance across different algorithms and feature sets and applied the classifiers to all publications in PubMed indexed to CTSA grants. To validate the algorithm, we manually classified the articles with the top 100 scores from each classifier. The definitions and checklist facilitated classification and resulted in good inter-rater reliability for coding publications for the training set. Very good performance was achieved for the classifiers as represented by the area under the receiver operating curves (AUC), with an AUC of 0.94 for the T0 classifier, 0.84 for T1/T2, and 0.92 for T3/T4. The combination of definitions agreed upon by five CTSA hubs, a checklist that facilitates more uniform definition interpretation, and algorithms that perform well in classifying publications along the translational spectrum provide a basis for establishing and applying uniform definitions of translational research categories. The classification algorithms allow publication analyses that would not be feasible with manual classification, such as assessing the distribution and trends of publications across the CTSA network and comparing the categories of publications and their citations to assess knowledge transfer across the translational research spectrum.
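A minimal sketch of this kind of publication classifier, with an invented toy corpus in place of the CTSA-linked abstracts: TF-IDF features feed a logistic regression and performance is summarized by AUC.

```python
# Minimal sketch: TF-IDF + logistic regression text classifier evaluated by AUC.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

docs = ["protein binding assay in mouse model",
        "randomized clinical trial of drug dosing",
        "community health outreach program evaluation",
        "enzyme kinetics measured in vitro",
        "phase II trial safety and efficacy results",
        "population-level screening policy impact"]
labels = [0, 1, 1, 0, 1, 1]          # 0 = basic (T0), 1 = clinical/population (T1+)

pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
pipe.fit(docs, labels)
probs = pipe.predict_proba(docs)[:, 1]
print("training AUC:", roc_auc_score(labels, probs))
```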
An Improvement To The k-Nearest Neighbor Classifier For ECG Database
NASA Astrophysics Data System (ADS)
Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul
2018-03-01
The k-nearest neighbor (kNN) is a non-parametric classifier and has been widely used for pattern classification. However, in practice, the performance of kNN often tends to fail due to the lack of information on how the samples are distributed among them. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN concerns the weighting issues in assigning the class label before classification. Thus, to address these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of the sample distribution. Then, a surrounding rule is employed to obtain the nearest centroid neighbor based on the distribution of the training samples and their distance to the query point. Consequently, a fuzzy membership function is employed to assign the query point to the class label most frequently represented by the nearest centroid neighbors. Experimental studies on electrocardiogram (ECG) signals are carried out. The classification performances are evaluated in two experimental settings, i.e., different values of k and different sizes of the feature dimension. Subsequently, a comparative study of the kNN, kNCN, FkNN and MFkNCN classifiers is conducted to evaluate the performance of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds that of kNN, kNCN and FkNN, with a best classification rate of 96.5%.
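The Mahalanobis-distance ingredient of the proposed classifier can be illustrated as follows with scikit-learn's k-NN using the inverse covariance of the training data; the centroid-neighbor surrounding rule and the fuzzy membership step are not reproduced, and the two-dimensional features are synthetic stand-ins for ECG descriptors.

```python
# Sketch: k-NN with a Mahalanobis distance metric (one ingredient of MFkNCN).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[4.0, 1.5], [1.5, 1.0]], size=400)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

VI = np.linalg.inv(np.cov(X_tr, rowvar=False))        # inverse covariance matrix
knn = KNeighborsClassifier(n_neighbors=5, algorithm="brute",
                           metric="mahalanobis", metric_params={"VI": VI})
knn.fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))
```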
Bullich, Santiago; Seibyl, John; Catafau, Ana M; Jovalekic, Aleksandar; Koglin, Norman; Barthel, Henryk; Sabri, Osama; De Santi, Susan
2017-01-01
Standardized uptake value ratios (SUVRs) calculated from cerebral cortical areas can be used to categorize 18F-florbetaben (FBB) PET scans by applying appropriate cutoffs. The objectives of this work were, first, to generate FBB SUVR cutoffs using visual assessment (VA) as the standard of truth (SoT) for a number of reference regions (RRs): cerebellar gray matter (GCER), whole cerebellum (WCER), pons (PONS), and subcortical white matter (SWM); second, to validate the FBB PET scan categorization performed with SUVR cutoffs against the categorization made by post-mortem histopathological confirmation of Aβ presence; and finally, to evaluate the added value of SUVR cutoff categorization over VA. SUVR cutoffs were generated for each RR using FBB scans from 143 subjects who were visually assessed by 3 readers. SUVR cutoffs were validated in 78 end-of-life subjects using VA from 8 independent blinded readers (3 expert readers and 5 non-expert readers) and histopathological confirmation of the presence of neuritic beta-amyloid plaques as SoT. Finally, the numbers of correctly and incorrectly classified scans according to the pathology results using VA and SUVR cutoffs were compared. The composite SUVR cutoffs generated were 1.43 (GCER), 0.96 (WCER), 0.78 (PONS) and 0.71 (SWM). Accuracy values were high and consistent across RRs (range 83-94% for histopathology, and 85-94% for VA). SUVR cutoff categorization performed similarly to VA but did not improve the VA classification of FBB scans read either by expert readers or by the majority read, although it provided higher accuracy than some non-expert readers. The accurate scan classification obtained in this study supports the use of VA as SoT to generate site-specific SUVR cutoffs. For an elderly end-of-life population, VA and SUVR cutoff categorization perform similarly in classifying FBB scans as Aβ-positive or Aβ-negative. These results emphasize the additional contribution that SUVR cutoff classification may have compared with VA performed by non-expert readers.
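One common way to derive such a cutoff from a reference standard is to maximize the Youden index along the ROC curve; the sketch below does this on synthetic SUVR values and is not the study's cutoff-generation procedure.

```python
# Sketch: SUVR cutoff selection by maximizing the Youden index on an ROC curve.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
suvr_neg = rng.normal(1.2, 0.10, size=60)     # hypothetical amyloid-negative scans
suvr_pos = rng.normal(1.6, 0.15, size=40)     # hypothetical amyloid-positive scans
suvr = np.concatenate([suvr_neg, suvr_pos])
status = np.concatenate([np.zeros(60), np.ones(40)])   # reference standard

fpr, tpr, thresholds = roc_curve(status, suvr)
youden = tpr - fpr                             # sensitivity + specificity - 1
best = np.argmax(youden)
print(f"SUVR cutoff = {thresholds[best]:.2f}, "
      f"sens = {tpr[best]:.2f}, spec = {1 - fpr[best]:.2f}")
```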
Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
Pairwise Classifier Ensemble with Adaptive Sub-Classifiers for fMRI Pattern Analysis.
Kim, Eunwoo; Park, HyunWook
2017-02-01
The multi-voxel pattern analysis technique is applied to fMRI data for classification of high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification to a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class-pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
Impervious surface mapping with Quickbird imagery
Lu, Dengsheng; Hetrick, Scott; Moran, Emilio
2010-01-01
This research selects two study areas with different urban developments, sizes, and spatial patterns to explore the suitable methods for mapping impervious surface distribution using Quickbird imagery. The selected methods include per-pixel based supervised classification, segmentation-based classification, and a hybrid method. A comparative analysis of the results indicates that per-pixel based supervised classification produces a large number of “salt-and-pepper” pixels, and segmentation based methods can significantly reduce this problem. However, neither method can effectively solve the spectral confusion of impervious surfaces with water/wetland and bare soils and the impacts of shadows. In order to accurately map impervious surface distribution from Quickbird images, manual editing is necessary and may be the only way to extract impervious surfaces from the confused land covers and the shadow problem. This research indicates that the hybrid method consisting of thresholding techniques, unsupervised classification and limited manual editing provides the best performance. PMID:21643434
Modified Mahalanobis Taguchi System for Imbalance Data Classification
2017-01-01
The Mahalanobis Taguchi System (MTS) is considered one of the most promising binary classification algorithms for handling imbalanced data. Unfortunately, MTS lacks a method for determining an efficient threshold for the binary classification. In this paper, a nonlinear optimization model, named the Modified Mahalanobis Taguchi System (MMTS), is formulated based on minimizing the distance between the MTS Receiver Operating Characteristic (ROC) curve and the theoretical optimal point. To validate the MMTS classification efficacy, it has been benchmarked against Support Vector Machines (SVMs), Naive Bayes (NB), Probabilistic Mahalanobis Taguchi Systems (PTM), the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Conformal Transformation (ACT), Kernel Boundary Alignment (KBA), Hidden Naive Bayes (HNB), and other improved Naive Bayes algorithms. MMTS outperforms the benchmarked algorithms especially when the imbalance ratio is greater than 400. A real-life case study from the manufacturing sector is used to demonstrate the applicability of the proposed model and to compare its performance with the Mahalanobis Genetic Algorithm (MGA). PMID:28811820
NASA Technical Reports Server (NTRS)
Hoffer, R. M. (Principal Investigator); Knowlton, D. J.; Dean, M. E.
1981-01-01
A set of training statistics for the 30 meter resolution simulated thematic mapper MSS data was generated based on land use/land cover classes. In addition to this supervised data set, a nonsupervised multicluster block of training statistics is being defined in order to compare the classification results and evaluate the effect of the different training selection methods on classification performance. Two test data sets were used to evaluate the classifications of the TMS data: one defined using a stratified sampling procedure incorporating a grid system with dimensions of 50 lines by 50 columns, and another based on an analyst-supervised set of test fields. Training statistics were generated from the supervised training data set, and a per-point Gaussian maximum likelihood classification of the 1979 TMS data was obtained. The August 1980 MSS data was radiometrically adjusted. The SAR data was redigitized and the SAR imagery was qualitatively analyzed.
Optical tomographic detection of rheumatoid arthritis with computer-aided classification schemes
NASA Astrophysics Data System (ADS)
Klose, Christian D.; Klose, Alexander D.; Netz, Uwe; Beuthan, Jürgen; Hielscher, Andreas H.
2009-02-01
A recent research study has shown that combining multiple parameters, drawn from optical tomographic images, leads to better classification results in identifying human finger joints that are affected or not affected by rheumatoid arthritis (RA). Building on the findings of that study, this article presents an advanced computer-aided classification approach for interpreting optical image data to detect RA in finger joints. Additional data are used, including, for example, maximum and minimum values of the absorption coefficient as well as their ratios and image variances. Classification performances obtained by the proposed method were evaluated in terms of sensitivity, specificity, Youden index and area under the curve (AUC). Results were compared to different benchmarks ("gold standards"): magnetic resonance imaging, ultrasound and clinical evaluation. Maximum accuracies (AUC=0.88) were reached when combining minimum/maximum-ratios and image variances and using ultrasound as the gold standard.
NASA Astrophysics Data System (ADS)
Florindo, João Batista
2018-04-01
This work proposes the use of Singular Spectrum Analysis (SSA) for the classification of texture images, more specifically, to enhance the performance of the Bouligand-Minkowski fractal descriptors in this task. Fractal descriptors are known to be a powerful approach to model and, in particular, identify complex patterns in natural images. Nevertheless, the multiscale analysis involved in those descriptors makes them highly correlated. Although other attempts to address this point have been proposed in the literature, none of them investigated the relation between the fractal correlation and the well-established analyses employed for time series, of which SSA is one of the most powerful. The proposed method was employed for the classification of benchmark texture images and the results were compared with other state-of-the-art classifiers, confirming the potential of this analysis in image classification.
Brain tumor segmentation based on local independent projection-based classification.
Huang, Meiyan; Yang, Wei; Wu, Yao; Jiang, Jun; Chen, Wufan; Feng, Qianjin
2014-10-01
Brain tumor segmentation is an important procedure for early tumor diagnosis and radiotherapy planning. Although numerous brain tumor segmentation methods have been presented, enhancing tumor segmentation methods is still challenging because brain tumor MRI images exhibit complex characteristics, such as high diversity in tumor appearance and ambiguous tumor boundaries. To address this problem, we propose a novel automatic tumor segmentation method for MRI images. This method treats tumor segmentation as a classification problem. Additionally, the local independent projection-based classification (LIPC) method is used to classify each voxel into different classes. A novel classification framework is derived by introducing the local independent projection into the classical classification model. Locality is important in the calculation of local independent projections for LIPC. Locality is also considered in determining whether local anchor embedding is more applicable in solving linear projection weights compared with other coding methods. Moreover, LIPC considers the data distribution of different classes by learning a softmax regression model, which can further improve classification performance. In this study, 80 brain tumor MRI images with ground truth data are used as training data and 40 images without ground truth data are used as testing data. The segmentation results of testing data are evaluated by an online evaluation tool. The average dice similarities of the proposed method for segmenting complete tumor, tumor core, and contrast-enhancing tumor on real patient data are 0.84, 0.685, and 0.585, respectively. These results are comparable to other state-of-the-art methods.
Yang, Fan; Xu, Ying-Ying; Shen, Hong-Bin
2014-01-01
Human protein subcellular location prediction can provide critical knowledge for understanding a protein's function. Since significant progress has been made in digital microscopy, automated image-based protein subcellular location classification is urgently needed. In this paper, we aim to investigate more representative image features that can be effectively used for dealing with multilabel subcellular image samples. We prepared a large multilabel immunohistochemistry (IHC) image benchmark from the Human Protein Atlas database and tested the performance of different local texture features, including the completed local binary pattern, the local tetra pattern, and the standard local binary pattern feature. According to our experimental results from binary relevance multilabel machine learning models, the completed local binary pattern and local tetra pattern are more discriminative for describing IHC images when compared to the traditional local binary pattern descriptor. The combination of these two novel local pattern features and the conventional global texture features is also studied. The enhanced performance of the final binary relevance classification model trained on the combined feature space demonstrates that different features are complementary to each other and thus capable of improving the accuracy of classification.
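For illustration only, a standard-LBP baseline of the kind compared in the study might look like the following, assuming greyscale IHC image arrays and a binary label-indicator matrix; the completed local binary pattern and local tetra pattern descriptors themselves are not reproduced here.

```python
# Hedged sketch: uniform LBP histograms (scikit-image) plus a binary relevance
# (one-vs-rest) multilabel classifier. Images are assumed to be 2-D greyscale
# arrays and Y a binary indicator matrix, one column per subcellular location.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def lbp_histogram(image, P=8, R=1):
    codes = local_binary_pattern(image, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def train_binary_relevance(images, Y):
    X = np.array([lbp_histogram(img) for img in images])
    return OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
```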
Kelly, Jane M.; Osamba, Benta; Garg, Renu M.; Hamel, Mary J.; Lewis, Jennifer J.; Rowe, Samantha Y.; Rowe, Alexander K.; Deming, Michael S.
2001-01-01
Objectives. To characterize community health worker (CHW) performance using an algorithm for managing common childhood illnesses in Siaya District, Kenya, we conducted CHW evaluations in 1998, 1999, and 2001. Methods. Randomly selected CHWs were observed managing sick outpatient and inpatient children at a hospital, and their management was compared with that of an expert clinician who used the algorithm. Results. One hundred, 108, and 114 CHWs participated in the evaluations in 1998, 1999, and 2001, respectively. The proportions of children treated “adequately” (with an antibiotic, antimalarial, oral rehydration solution, or referral, depending on the child's disease classifications) were 57.8%, 35.5%, and 38.9%, respectively, for children with a severe classification and 27.7%, 77.3%, and 74.3%, respectively, for children with a moderate (but not severe) classification. CHWs adequately treated 90.5% of malaria cases (the most commonly encountered classification). CHWs often made mistakes assessing symptoms, classifying illnesses, and prescribing correct doses of medications. Conclusions. Deficiencies were found in the management of sick children by CHWs, although care was not consistently poor. Key reasons for the deficiencies appear to be guideline complexity and inadequate clinical supervision; other possible causes are discussed. PMID:11574324
Tiny videos: a large data set for nonparametric video retrieval and frame classification.
Karpenko, Alexandre; Aarabi, Parham
2011-03-01
In this paper, we present a large database of over 50,000 user-labeled videos collected from YouTube. We develop a compact representation called "tiny videos" that achieves high video compression rates while retaining the overall visual appearance of the video as it varies over time. We show that frame sampling using affinity propagation (an exemplar-based clustering algorithm) achieves the best trade-off between compression and video recall. We use this large collection of user-labeled videos in conjunction with simple data mining techniques to perform related video retrieval, as well as classification of images and video frames. The classification results achieved by tiny videos are compared with the tiny images framework [24] for a variety of recognition tasks. The tiny images data set consists of 80 million images collected from the Internet. These are the largest labeled research data sets of videos and images available to date. We show that tiny videos are better suited for classifying scenery and sports activities, while tiny images perform better at recognizing objects. Furthermore, we demonstrate that combining the tiny images and tiny videos data sets improves classification precision in a wider range of categories.
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, a technology called microarray has taken an important role in the diagnosis of cancer. By using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by small sample sizes but huge dimensionality. This poses a challenge for researchers to provide solutions for microarray data classification with high performance in terms of both accuracy and running time. This research proposes the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross-validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme can obtain 100% accuracy for the Ovarian and Lung Cancer data when linear and cubic kernel functions are used. In terms of running time, PCA greatly reduced the running time for every data set.
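A rough sketch of the evaluated scheme, under the assumption that X holds gene-expression values (samples by genes) and y the cancer labels: PCA for dimension reduction feeding an SVM with linear, cubic (polynomial, degree 3) and RBF kernels, scored by 5-fold cross-validation.

```python
# Hedged sketch of a PCA + SVM microarray classification scheme; the number of
# principal components is an assumption and must not exceed the training fold size.
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_pca_svm(X, y, n_components=20):
    results = {}
    for kernel in ("linear", "poly", "rbf"):   # "poly" with degree=3 ~ cubic kernel
        pipe = make_pipeline(StandardScaler(),
                             PCA(n_components=n_components),
                             SVC(kernel=kernel, degree=3))
        results[kernel] = cross_val_score(pipe, X, y, cv=5).mean()
    return results
```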
An online sleep apnea detection method based on recurrence quantification analysis.
Nguyen, Hoa Dinh; Wilkins, Brek A; Cheng, Qi; Benjamin, Bruce Allen
2014-07-01
This paper introduces an online sleep apnea detection method based on heart rate complexity as measured by recurrence quantification analysis (RQA) statistics of heart rate variability (HRV) data. RQA statistics can capture the nonlinear dynamics of a complex cardiorespiratory system during obstructive sleep apnea. In order to obtain a more robust measurement of the nonstationarity of the cardiorespiratory system, we use several fixed-amount-of-neighbours thresholds for the recurrence plot calculation. We integrate a feature selection algorithm based on conditional mutual information to select the most informative RQA features for classification, and hence speed up the real-time classification process without degrading the performance of the system. Two types of binary classifiers, i.e., support vector machine and neural network, are used to differentiate apnea from normal sleep. A soft decision fusion rule is developed to combine the results of these classifiers in order to improve the classification performance of the whole system. Experimental results show that our proposed method achieves better classification results compared with the previous recurrence analysis-based approach. We also show that our method is flexible and a strong candidate for a real, efficient sleep apnea detection system.
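The soft decision fusion step can be illustrated with a simplified sketch (not the paper's exact rule): average the apnea probabilities produced by an SVM and a neural network trained on the same RQA feature vectors, then threshold the fused score.

```python
# Simplified soft fusion of two probabilistic classifiers trained on RQA
# features; the equal-weight average and the 0.5 threshold are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def train_fused_detector(X_rqa, y):
    svm = SVC(probability=True).fit(X_rqa, y)
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X_rqa, y)
    return svm, net

def predict_fused(svm, net, X_rqa, threshold=0.5):
    p = 0.5 * (svm.predict_proba(X_rqa)[:, 1] + net.predict_proba(X_rqa)[:, 1])
    return (p >= threshold).astype(int)        # 1 = apnea, 0 = normal sleep
```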
Wen, Zaidao; Hou, Zaidao; Jiao, Licheng
2017-11-01
The discriminative dictionary learning (DDL) framework has been widely used in image classification; it aims to learn class-specific feature vectors as well as a representative dictionary from a set of labeled training samples. However, interclass similarities and intraclass variances among input samples and learned features will generally weaken the representability of the dictionary and the discrimination of the feature vectors, and thus degrade the classification performance. Therefore, how to explicitly represent them becomes an important issue. In this paper, we present a novel DDL framework with a two-level low rank and group sparse decomposition model. In the first level, we learn a class-shared and several class-specific dictionaries, where a low rank and a group sparse regularization are, respectively, imposed on the corresponding feature matrices. In the second level, the class-specific feature matrix is further decomposed into a low rank and a sparse matrix so that intraclass variances can be separated out to concentrate the corresponding feature vectors. Extensive experimental results demonstrate the effectiveness of our model. Compared with other state-of-the-art methods on several popular image databases, our model achieves competitive or better performance in terms of classification accuracy.
Track classification within wireless sensor network
NASA Astrophysics Data System (ADS)
Doumerc, Robin; Pannetier, Benjamin; Moras, Julien; Dezert, Jean; Canevet, Loic
2017-05-01
In this paper, we present our study on track classification by taking into account environmental information and target estimated states. The tracker uses several motion models adapted to different target dynamics (pedestrian, ground vehicle and SUAV, i.e. small unmanned aerial vehicle) and works in a centralized architecture. The main idea is to explore both the classification given by heterogeneous sensors and the classification obtained with our fusion module. The fusion module, presented in this paper, assigns a class to each track according to track location, velocity and the associated uncertainty. To model the likelihood of each class, a fuzzy approach is used, considering constraints on the target's capability to move in the environment. Then an evidential reasoning approach based on Dempster-Shafer Theory (DST) is used to perform a time integration of this classifier output. The fusion rules are tested and compared on real data obtained with our wireless sensor network. In order to handle realistic ground target tracking scenarios, we use an autonomous smart computer deployed in the surveillance area. After the calibration step of the heterogeneous sensor network, our system is able to handle real data from a wireless ground sensor network. The performance of this system is evaluated in a real exercise for an intelligence operation (a "hunter hunt" scenario).
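The Dempster-Shafer combination step can be illustrated with a textbook sketch of Dempster's rule for two mass functions over a frame of discernment such as {pedestrian, vehicle, SUAV}; this is a generic DST example, not the paper's fusion module.

```python
# Dempster's rule of combination for two mass functions over the same frame of
# discernment; focal elements are frozensets of class labels.
from itertools import product

def dempster_combine(m1, m2):
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb                 # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("Total conflict: masses cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Example: fusing a sensor-based and a motion-constraint-based belief on a track
m_sensor = {frozenset({"vehicle"}): 0.6, frozenset({"vehicle", "SUAV"}): 0.4}
m_motion = {frozenset({"vehicle"}): 0.5,
            frozenset({"pedestrian", "vehicle", "SUAV"}): 0.5}
print(dempster_combine(m_sensor, m_motion))
```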
Comparing Running Specific and Traditional Prostheses During Running: Assessing Performance and Risk
2016-09-01
…with lower extremity amputation (ILEA) running is limited with respect to biomechanical performance and injury risks. ILEA are able to run with both running-specific and traditional prostheses. Keywords: kinetics, biomechanics, amputation, prosthesis, transtibial.
Classifications for Cesarean Section: A Systematic Review
Torloni, Maria Regina; Betran, Ana Pilar; Souza, Joao Paulo; Widmer, Mariana; Allen, Tomas; Gulmezoglu, Metin; Merialdi, Mario
2011-01-01
Background Rising cesarean section (CS) rates are a major public health concern and cause worldwide debates. To propose and implement effective measures to reduce or increase CS rates where necessary requires an appropriate classification. Despite several existing CS classifications, there has not yet been a systematic review of these. This study aimed to 1) identify the main CS classifications used worldwide, 2) analyze advantages and deficiencies of each system. Methods and Findings Three electronic databases were searched for classifications published 1968–2008. Two reviewers independently assessed classifications using a form created based on items rated as important by international experts. Seven domains (ease, clarity, mutually exclusive categories, totally inclusive classification, prospective identification of categories, reproducibility, implementability) were assessed and graded. Classifications were tested in 12 hypothetical clinical case-scenarios. From a total of 2948 citations, 60 were selected for full-text evaluation and 27 classifications identified. Indications classifications present important limitations and their overall score ranged from 2–9 (maximum grade = 14). Degree of urgency classifications also had several drawbacks (overall scores 6–9). Woman-based classifications performed best (scores 5–14). Other types of classifications require data not routinely collected and may not be relevant in all settings (scores 3–8). Conclusions This review and critical appraisal of CS classifications is a methodologically sound contribution to establish the basis for the appropriate monitoring and rational use of CS. Results suggest that women-based classifications in general, and Robson's classification, in particular, would be in the best position to fulfill current international and local needs and that efforts to develop an internationally applicable CS classification would be most appropriately placed in building upon this classification. The use of a single CS classification will facilitate auditing, analyzing and comparing CS rates across different settings and help to create and implement effective strategies specifically targeted to optimize CS rates where necessary. PMID:21283801
Haenssle, H A; Fink, C; Schneiderbauer, R; Toberer, F; Buhl, T; Blum, A; Kalloo, A; Hassen, A Ben Hadj; Thomas, L; Enk, A; Uhlmann, L
2018-05-28
Deep learning convolutional neural networks (CNN) may facilitate melanoma detection, but data comparing a CNN's diagnostic performance to larger groups of dermatologists are lacking. Google's Inception v4 CNN architecture was trained and validated using dermoscopic images and corresponding diagnoses. In a comparative cross-sectional reader study a 100-image test-set was used (level-I: dermoscopy only; level-II: dermoscopy plus clinical information and images). Main outcome measures were sensitivity, specificity and area under the curve (AUC) of receiver operating characteristics (ROC) for diagnostic classification (dichotomous) of lesions by the CNN versus an international group of 58 dermatologists during level-I or -II of the reader study. Secondary end points included the dermatologists' diagnostic performance in their management decisions and differences in the diagnostic performance of dermatologists during level-I and -II of the reader study. Additionally, the CNN's performance was compared with the top-five algorithms of the 2016 International Symposium on Biomedical Imaging (ISBI) challenge. In level-I dermatologists achieved a mean (±standard deviation) sensitivity and specificity for lesion classification of 86.6% (±9.3%) and 71.3% (±11.2%), respectively. More clinical information (level-II) improved the sensitivity to 88.9% (±9.6%, P = 0.19) and specificity to 75.7% (±11.7%, P < 0.05). The CNN ROC curve revealed a higher specificity of 82.5% when compared with dermatologists in level-I (71.3%, P < 0.01) and level-II (75.7%, P < 0.01) at their sensitivities of 86.6% and 88.9%, respectively. The CNN ROC AUC was greater than the mean ROC area of dermatologists (0.86 versus 0.79, P < 0.01). The CNN scored results close to the top three algorithms of the ISBI 2016 challenge. For the first time we compared a CNN's diagnostic performance with a large international group of 58 dermatologists, including 30 experts. Most dermatologists were outperformed by the CNN. Irrespective of any physicians' experience, they may benefit from assistance by a CNN's image classification. This study was registered at the German Clinical Trial Register (DRKS-Study-ID: DRKS00013570; https://www.drks.de/drks_web/).
Prakash, Bhaskaran David; Esuvaranathan, Kesavan; Ho, Paul C; Pasikanti, Kishore Kumar; Chan, Eric Chun Yong; Yap, Chun Wei
2013-05-21
A fully automated and computationally efficient Pearson's correlation change classification (APC3) approach is proposed and shown to have overall comparable performance, with both an average accuracy and an average AUC of 0.89 ± 0.08, while being 3.9 to 7 times faster, easier to use and less susceptible to outliers than other dimension reduction and classification combinations, using only the total ion chromatogram (TIC) intensities of GC/MS data. The use of only the TIC permits the possible application of APC3 to other metabonomic data such as LC/MS TICs or NMR spectra. A RapidMiner implementation is available for download at http://padel.nus.edu.sg/software/padelapc3.
Arc-Welding Spectroscopic Monitoring based on Feature Selection and Neural Networks.
Garcia-Allende, P Beatriz; Mirapeix, Jesus; Conde, Olga M; Cobo, Adolfo; Lopez-Higuera, Jose M
2008-10-21
A new spectral processing technique designed for application in the on-line detection and classification of arc-welding defects is presented in this paper. A noninvasive fiber sensor embedded within a TIG torch collects the plasma radiation originated during the welding process. The spectral information is then processed in two consecutive stages. A compression algorithm is first applied to the data, allowing real-time analysis. The selected spectral bands are then used to feed a classification algorithm, which is demonstrated to provide efficient weld defect detection and classification. The results obtained with the proposed technique are compared to a similar processing scheme presented in previous works, showing an improvement in the performance of the monitoring system.
Jung, Jun-Young; Heo, Wonho; Yang, Hyundae; Park, Hyunsub
2015-01-01
An exact classification of different gait phases is essential to enable the control of exoskeleton robots and detect the intentions of users. We propose a gait phase classification method based on neural networks using sensor signals from lower limb exoskeleton robots. In such robots, foot sensors with force sensing registers are commonly used to classify gait phases. We describe classifiers that use the orientation of each lower limb segment and the angular velocities of the joints to output the current gait phase. Experiments to obtain the input signals and desired outputs for the learning and validation process are conducted, and two neural network methods (a multilayer perceptron and nonlinear autoregressive with external inputs (NARX)) are used to develop an optimal classifier. Offline and online evaluations using four criteria are used to compare the performance of the classifiers. The proposed NARX-based method exhibits sufficiently good performance to replace foot sensors as a means of classifying gait phases. PMID:26528986
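A hedged sketch of the simpler of the two classifiers described above: a multilayer perceptron mapping segment orientations and joint angular velocities to a gait-phase label, with a NARX-like flavour approximated by appending lagged copies of the inputs. The lag count and network size are assumptions, not the authors' design.

```python
# Illustrative gait-phase classifier on lagged sensor features; `signals` is a
# time-ordered (n_samples, n_channels) matrix of orientations and angular
# velocities, `phases` the per-sample gait-phase labels.
import numpy as np
from sklearn.neural_network import MLPClassifier

def build_lagged_features(signals, n_lags=2):
    cols = [np.roll(signals, lag, axis=0) for lag in range(n_lags + 1)]
    return np.hstack(cols)[n_lags:]            # drop rows containing wrapped values

def train_gait_phase_classifier(signals, phases, n_lags=2):
    X = build_lagged_features(signals, n_lags)
    y = np.asarray(phases)[n_lags:]
    return MLPClassifier(hidden_layer_sizes=(20,), max_iter=3000).fit(X, y)
```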
Classification of adaptive memetic algorithms: a comparative study.
Ong, Yew-Soon; Lim, Meng-Hiot; Zhu, Ning; Wong, Kok-Wai
2006-02-01
Adaptation of parameters and operators represents one of the most important and promising recent areas of research in evolutionary computation; it is a form of designing self-configuring algorithms that acclimatize to suit the problem at hand. Here, our interest is in a recent breed of hybrid evolutionary algorithms typically known as adaptive memetic algorithms (MAs). One unique feature of adaptive MAs is the choice of local search methods or memes, and recent studies have shown that this choice significantly affects the performance of problem searches. In this paper, we present a classification of meme adaptation in adaptive MAs on the basis of the mechanism used and the level of historical knowledge of the memes employed. Then the asymptotic convergence properties of the adaptive MAs considered are analyzed according to this classification. Subsequently, empirical studies on representatives of adaptive MAs for different type-level meme adaptations, using continuous benchmark problems, indicate that global-level adaptive MAs exhibit better search performance. Finally, we conclude with some promising research directions in the area.
Extended census transform histogram for land-use scene classification
NASA Astrophysics Data System (ADS)
Yuan, Baohua; Li, Shijin
2017-04-01
With the popular use of high-resolution satellite images, more and more research efforts have been focused on land-use scene classification. In scene classification, effective visual features can significantly boost the final performance. As a typical texture descriptor, the census transform histogram (CENTRIST) has emerged as a very powerful tool due to its effective representation ability. However, the most prominent limitation of CENTRIST is its small spatial support area, which may not necessarily be adept at capturing the key texture characteristics. We propose an extended CENTRIST (eCENTRIST), which is made up of three subschemes computed over a larger neighborhood scale. The proposed eCENTRIST not only inherits the advantages of CENTRIST but also encodes more useful information about local structures. Meanwhile, a multichannel eCENTRIST, which can capture the interactions among multichannel images, is developed to obtain higher categorization accuracy rates. Experimental results demonstrate that the proposed method can achieve competitive performance when compared to state-of-the-art methods.
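For reference, the basic census transform histogram (CENTRIST) that eCENTRIST extends can be sketched as below: each pixel receives an 8-bit code from comparisons with its eight 3x3 neighbours, and the codes are histogrammed. The larger-neighbourhood subschemes and the multichannel variant are not reproduced.

```python
# Basic census transform histogram for a 2-D greyscale image array.
import numpy as np

def census_transform_histogram(image):
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    codes = np.zeros_like(img, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[1 + dy:1 + dy + img.shape[0],
                           1 + dx:1 + dx + img.shape[1]]
        codes |= ((neighbour <= img).astype(np.uint8) << bit)   # one bit per neighbour
    hist, _ = np.histogram(codes, bins=256, range=(0, 256), density=True)
    return hist
```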
Large deformation image classification using generalized locality-constrained linear coding.
Zhang, Pei; Wee, Chong-Yaw; Niethammer, Marc; Shen, Dinggang; Yap, Pew-Thian
2013-01-01
Magnetic resonance (MR) imaging has been demonstrated to be very useful for clinical diagnosis of Alzheimer's disease (AD). A common approach to using MR images for AD detection is to spatially normalize the images by non-rigid image registration, and then perform statistical analysis on the resulting deformation fields. Due to the high nonlinearity of the deformation field, recent studies suggest using the initial momentum instead, as it lies in a linear space and fully encodes the deformation field. In this paper we explore the use of the initial momentum for image classification by focusing on the problem of AD detection. Experiments on the public ADNI dataset show that the initial momentum, together with a simple sparse coding technique, locality-constrained linear coding (LLC), can achieve a classification accuracy that is comparable to or even better than the state of the art. We also show that the performance of LLC can be greatly improved by introducing proper weights to the codebook.
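The approximated, k-nearest-neighbour form of LLC can be sketched as follows, assuming a precomputed codebook B (atoms by dimensions) and a feature vector x; the regularisation constant and neighbourhood size are illustrative choices rather than the paper's settings.

```python
# Approximate locality-constrained linear coding: encode x using only its k
# nearest codebook atoms, with a sum-to-one constraint on the weights.
import numpy as np

def llc_code(x, B, k=5, beta=1e-4):
    d = np.linalg.norm(B - x, axis=1)          # distances to all codebook atoms
    idx = np.argsort(d)[:k]                    # keep the k nearest atoms
    z = B[idx] - x                             # shift the local atoms to the sample
    C = z @ z.T + beta * np.eye(k)             # regularised local covariance
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                               # enforce the sum-to-one constraint
    code = np.zeros(B.shape[0])
    code[idx] = w                              # sparse code over the full codebook
    return code
```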
Sepulveda, Esteban; Franco, José G; Trzepacz, Paula T; Gaviria, Ana M; Viñuelas, Eva; Palma, José; Ferré, Gisela; Grau, Imma; Vilella, Elisabet
2015-01-01
Delirium diagnosis in the elderly is often complicated by underlying dementia. We evaluated the performance of the Delirium Rating Scale-Revised-98 (DRS-R98) in patients with a high prevalence of dementia and also assessed concordance among past and current diagnostic criteria for delirium. Cross-sectional analysis of newly admitted patients to a skilled nursing facility over 6 months, who were rated within 24-48 hours after admission. Interviews for Diagnostic and Statistical Manual of Mental Disorders, 3rd edition, revised (DSM-III-R), DSM-IV, DSM-5, and International Classification of Diseases, 10th edition, delirium ratings, administration of the DRS-R98, and assessment of dementia using the Informant Questionnaire on Cognitive Decline in the Elderly were independently performed by 3 researchers. Discriminant analyses (receiver operating characteristic curves) were used to study DRS-R98 accuracy against the different diagnostic criteria. The Hanley and McNeil test was used to compare the areas under the curve for the DRS-R98's discriminant performance across all diagnostic criteria. Dementia was present in 85/125 (68.0%) subjects, and 36/125 (28.8%) met criteria for delirium by at least 1 classification system, whereas only 19/36 (52.8%) did by all. DSM-III-R classified the most patients as delirious (27.2%), followed by DSM-5 (24.8%), DSM-IV-TR (22.4%), and International Classification of Diseases 10th edition (16%). The DRS-R98 had the highest AUC when discriminating DSM-III-R delirium (92.9%), followed by DSM-IV (92.4%), DSM-5 (91%), and International Classification of Diseases 10th edition (90.5%), without statistical differences among them. The best DRS-R98 cutoff score was ≥14.5 for all diagnostic systems except International Classification of Diseases 10th edition (≥15.5). There is low concordance across diagnostic systems for the identification of delirium. The DRS-R98 performs well despite differences across classification systems, perhaps because it broadly assesses phenomenology, even in this population with a high prevalence of dementia.
Comparing K-mer based methods for improved classification of 16S sequences.
Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars
2015-07-01
The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other in both data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The differences in classification error among the methods seemed to be small, but they were stable across both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data are needed to improve classification from here. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, a better, more universal and more robust training data set is crucial.
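As a toy illustration of the multinomial naive Bayes K-mer approach (not the RDP or PLSNN implementations), the sketch below counts overlapping K-mers in each 16S sequence and fits a multinomial naive Bayes classifier to genus labels.

```python
# Toy K-mer counting + multinomial naive Bayes for 16S sequences; K-mers
# containing ambiguous bases are simply skipped.
from itertools import product
from sklearn.naive_bayes import MultinomialNB

def kmer_counts(sequence, k=4):
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    for i in range(len(sequence) - k + 1):
        j = index.get(sequence[i:i + k].upper())
        if j is not None:
            counts[j] += 1
    return counts

def train_kmer_classifier(sequences, genera, k=4):
    X = [kmer_counts(s, k) for s in sequences]
    return MultinomialNB().fit(X, genera)
```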
Jordan, Suzana; Maurer, Britta; Toniolo, Martin; Michel, Beat; Distler, Oliver
2015-08-01
The preliminary classification criteria for SSc lack sensitivity for mild/early SSc patients; therefore, the new ACR/EULAR classification criteria for SSc were developed. The objective of this study was to evaluate the performance of the new classification criteria for SSc in clinical practice in a cohort of mild/early patients. Consecutive patients with a clinical diagnosis of SSc, based on expert opinion, were prospectively recruited and assessed according to the EULAR Scleroderma Trials and Research group (EUSTAR) and very early diagnosis of SSc (VEDOSS) recommendations. In some patients, missing values were retrieved retrospectively from the patients' records. Patients were grouped into established SSc (fulfilling the old ACR criteria) and mild/early SSc (not fulfilling the old ACR criteria). The new ACR/EULAR criteria were applied to all patients. Of the 304 patients available for the final analysis, 162/304 (53.3%) had established SSc and 142/304 (46.7%) had mild/early SSc. All 162 established SSc patients fulfilled the new ACR/EULAR classification criteria. The remaining 142 patients had mild/early SSc. Eighty of these 142 patients (56.3%) fulfilled the new ACR/EULAR classification criteria. Patients with mild/early SSc not fulfilling the new classification criteria most often suffered from RP, had SSc-characteristic autoantibodies and had an SSc pattern on nailfold capillaroscopy. Taken together, the sensitivity of the new ACR/EULAR classification criteria for the overall cohort was 242/304 (79.6%) compared with 162/304 (53.3%) for the ACR criteria. In this cohort with a focus on mild/early SSc, the new ACR/EULAR classification criteria showed higher sensitivity and classified more patients as definite SSc patients than the ACR criteria.
A Classification Scheme for Smart Manufacturing Systems’ Performance Metrics
Lee, Y. Tina; Kumaraguru, Senthilkumaran; Jain, Sanjay; Robinson, Stefanie; Helu, Moneer; Hatim, Qais Y.; Rachuri, Sudarsan; Dornfeld, David; Saldana, Christopher J.; Kumara, Soundar
2017-01-01
This paper proposes a classification scheme for performance metrics for smart manufacturing systems. The discussion focuses on three such metrics: agility, asset utilization, and sustainability. For each of these metrics, we discuss classification themes, which we then use to develop a generalized classification scheme. In addition to the themes, we discuss a conceptual model that may form the basis for the information necessary for performance evaluations. Finally, we present future challenges in developing robust, performance-measurement systems for real-time, data-intensive enterprises. PMID:28785744
Kalpathy-Cramer, Jayashree; de Herrera, Alba García Seco; Demner-Fushman, Dina; Antani, Sameer; Bedrick, Steven; Müller, Henning
2014-01-01
Medical image retrieval and classification have been extremely active research topics over the past 15 years. With the ImageCLEF benchmark in medical image retrieval and classification, a standard test bed was created that allows researchers to compare their approaches and ideas on increasingly large and varied data sets including generated ground truth. This article describes the lessons learned in ten evaluation campaigns. A detailed analysis of the data also highlights the value of the resources created. PMID:24746250
NASA Technical Reports Server (NTRS)
Coggeshall, M. E.; Hoffer, R. M.
1973-01-01
Remote sensing equipment and automatic data processing techniques were employed as aids in the institution of improved forest resource management methods. On the basis of automatically calculated statistics derived from manually selected training samples, the feature selection processor of LARSYS selected, upon consideration of various groups of the four available spectral regions, a series of channel combinations whose automatic classification performances (for six cover types, including both deciduous and coniferous forest) were tested, analyzed, and further compared with automatic classification results obtained from digitized color infrared photography.
The brain MRI classification problem from wavelets perspective
NASA Astrophysics Data System (ADS)
Bendib, Mohamed M.; Merouani, Hayet F.; Diaba, Fatma
2015-02-01
Haar and Daubechies 4 (DB4) are the most used wavelets for brain MRI (Magnetic Resonance Imaging) classification. The former is simple and fast to compute while the latter is more complex and offers a better resolution. This paper explores the potential of both of them in performing Normal versus Pathological discrimination on the one hand, and Multiclassification on the other hand. The Whole Brain Atlas is used as a validation database, and the Random Forest (RF) algorithm is employed as a learning approach. The achieved results are discussed and statistically compared.
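A hedged sketch of such a comparison, using PyWavelets: each slice is decomposed with either 'haar' or 'db4', the approximation sub-band is flattened into a feature vector, and a random forest is scored by cross-validation. The decomposition level and feature choice are assumptions, not the paper's exact protocol.

```python
# Compare Haar and DB4 wavelet features for MRI slice classification; `images`
# is a list of equally sized 2-D arrays and `labels` the per-slice classes.
import numpy as np
import pywt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def wavelet_features(image, wavelet="haar", level=3):
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    return coeffs[0].ravel()                   # approximation sub-band only

def compare_wavelets(images, labels):
    scores = {}
    for wavelet in ("haar", "db4"):
        X = np.array([wavelet_features(img, wavelet) for img in images])
        rf = RandomForestClassifier(n_estimators=300, random_state=0)
        scores[wavelet] = cross_val_score(rf, X, labels, cv=5).mean()
    return scores
```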
Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K
2012-04-01
The objective of this paper is to provide an improved technique which can assist oncopathologists in the correct screening of oral precancerous conditions, especially oral submucous fibrosis (OSF), with significant accuracy on the basis of collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibre segmentation, textural feature extraction and selection, screening performance enhancement under Gaussian transformation, and finally classification. In this study, collagen fibres are segmented on the R, G, B color channels using a back-propagation neural network from 60 normal and 59 OSF histological images, followed by histogram specification for reducing the stain intensity variation. Thereafter, textural features of the collagen area are extracted using fractal approaches, viz. differential box counting and the Brownian motion curve. Feature selection is done using the Kullback-Leibler (KL) divergence criterion, and the screening performance is evaluated based on various statistical tests to confirm Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using a hybrid distribution. Moreover, the routine screening is designed based on two statistical classifiers, viz. Bayesian classification and support vector machines (SVM), to classify normal and OSF. It is observed that SVM with a linear kernel function provides better classification accuracy (91.64%) as compared to the Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves the Bayesian classifier's performance from 80.69% to 90.75%. The results are studied and discussed.
Classification of wet aged related macular degeneration using optical coherence tomographic images
NASA Astrophysics Data System (ADS)
Haq, Anam; Mir, Fouwad Jamil; Yasin, Ubaid Ullah; Khan, Shoab A.
2013-12-01
Wet age-related macular degeneration (AMD) is a type of age-related macular degeneration. In order to detect wet AMD we look for pigment epithelium detachment (PED) and the fluid-filled region caused by choroidal neovascularization (CNV). This form of AMD can cause vision loss if not treated in time. In this article we propose an automated system for detection of wet AMD in optical coherence tomography (OCT) images. The proposed system extracts PED and CNV from OCT images using segmentation and morphological operations, and then a detailed feature set is extracted. These features are then passed on to the classifier for classification. Finally, performance measures such as accuracy, sensitivity and specificity are calculated, and the classifier delivering the maximum performance is selected. Our system gives higher performance using SVM as compared to other methods.
Geng, Yanjuan; Wei, Yue
2017-01-01
Previous studies have shown that arm position variations significantly degrade the classification performance of myoelectric pattern-recognition-based prosthetic control, and the cascade classifier (CC) and multiposition classifier (MPC) have been proposed to minimize such degradation in offline scenarios. However, it remains unknown whether these proposed approaches would also perform well in the clinical use of a multifunctional prosthesis control. In this study, the online effect of arm position variation on motion identification was evaluated by using a motion-test environment (MTE) developed to mimic the real-time control of myoelectric prostheses. The performance of different classifier configurations in reducing the impact of arm position variation was investigated using four real-time metrics based on a dataset obtained from transradial amputees. The results of this study showed that, compared to the commonly used motion classification method, the CC and MPC configurations improved the real-time performance across seven classes of movements in five different arm positions (8.7% and 12.7% increments in motion completion rate, respectively). The results also indicated that high offline classification accuracy might not ensure good real-time performance under variable arm positions, which necessitates the investigation of real-time control performance to gain proper insight into the clinical implementation of EMG-pattern-recognition-based controllers for limb amputees. PMID:28523276
Javidnia, Katayoun; Parish, Maryam; Karimi, Sadegh; Hemmateenejad, Bahram
2013-03-01
By using FT-IR spectroscopy, many researchers from different disciplines enrich the experimental scope of their research to obtain more precise information. Moreover, chemometric techniques have boosted the use of IR instruments. In the present study we aimed to emphasize the power of FT-IR spectroscopy for discrimination between different oil samples (especially fat from vegetable oils). Our data were also used to compare the performance of different classification methods. FT-IR transmittance spectra of oil samples (Corn, Colona, Sunflower, Soya, Olive, and Butter) were measured in the wave-number interval of 450-4000 cm(-1). Classification analysis was performed utilizing PLS-DA, interval PLS-DA, extended canonical variate analysis (ECVA) and interval ECVA methods. The effect of data preprocessing by extended multiplicative signal correction was investigated. Whilst all of the employed methods could distinguish butter from vegetable oils, iECVA gave the best performance for the calibration and external test sets, with 100% sensitivity and specificity.
Deep-learning-based classification of FDG-PET data for Alzheimer's disease categories
NASA Astrophysics Data System (ADS)
Singh, Shibani; Srivastava, Anant; Mi, Liang; Caselli, Richard J.; Chen, Kewei; Goradia, Dhruman; Reiman, Eric M.; Wang, Yalin
2017-11-01
Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even in presymptomatic Alzheimer's disease (AD) patients. PET scans provide functional information that is unique and unavailable using other types of imaging. However, the computational efficacy of FDG-PET data alone for the classification of the various Alzheimer's diagnostic categories has not been well studied. This motivates us to discriminate the various AD diagnostic categories using FDG-PET data. Deep learning has improved state-of-the-art classification accuracies in the areas of speech, signal, image, video and text mining and recognition. We propose novel methods that involve probabilistic principal component analysis on max-pooled data and mean-pooled data for dimensionality reduction, and a multilayer feed-forward neural network that performs binary classification. Our experimental dataset consists of baseline data of subjects including 186 cognitively unimpaired (CU) subjects, 336 mild cognitive impairment (MCI) subjects with 158 Late MCI and 178 Early MCI, and 146 AD patients from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We measured F1-measure, precision, recall, and negative and positive predictive values with a 10-fold cross validation scheme. Our results indicate that the designed classifiers achieve competitive results, with max-pooled features achieving better classification performance than mean-pooled features. Our deep-model-based research may advance FDG-PET analysis by demonstrating the modality's potential as an effective imaging biomarker of AD.
Classification of a large microarray data set: Algorithm comparison and analysis of drug signatures
Natsoulis, Georges; El Ghaoui, Laurent; Lanckriet, Gert R.G.; Tolley, Alexander M.; Leroy, Fabrice; Dunlea, Shane; Eynon, Barrett P.; Pearson, Cecelia I.; Tugendreich, Stuart; Jarnagin, Kurt
2005-01-01
A large gene expression database has been produced that characterizes the gene expression and physiological effects of hundreds of approved and withdrawn drugs, toxicants, and biochemical standards in various organs of live rats. In order to derive useful biological knowledge from this large database, a variety of supervised classification algorithms were compared using a 597-microarray subset of the data. Our studies show that several types of linear classifiers based on Support Vector Machines (SVMs) and Logistic Regression can be used to derive readily interpretable drug signatures with high classification performance. Both methods can be tuned to produce classifiers of drug treatments in the form of short, weighted gene lists which upon analysis reveal that some of the signature genes have a positive contribution (act as “rewards” for the class-of-interest) while others have a negative contribution (act as “penalties”) to the classification decision. The combination of reward and penalty genes enhances performance by keeping the number of false positive treatments low. The results of these algorithms are combined with feature selection techniques that further reduce the length of the drug signatures, an important step towards the development of useful diagnostic biomarkers and low-cost assays. Multiple signatures with no genes in common can be generated for the same classification end-point. Comparison of these gene lists identifies biological processes characteristic of a given class. PMID:15867433
NASA Astrophysics Data System (ADS)
Fu, Haiyan; Yin, Qiaobo; Xu, Lu; Wang, Weizheng; Chen, Feng; Yang, Tianming
2017-07-01
Geographical origin and authenticity (freedom from fraud) are two essential aspects of food quality. In this work, a comprehensive quality evaluation method based on FT-NIR spectroscopy and chemometrics is suggested to address the geographical origins and authentication of Chinese Ganoderma lucidum (GL). Classification for 25 groups of GL samples (7 common species from 15 producing areas) was performed using near-infrared spectroscopy and interval-combination One-Versus-One least squares support vector machine (IC-OVO-LS-SVM). Untargeted analysis of 4 adulterants of cheaper mushrooms was performed by one-class partial least squares (OCPLS) modeling for each of the 7 GL species. After outlier diagnosis and comparison of the influences of different preprocessing methods and spectral intervals on classification, IC-OVO-LS-SVM with standard normal variate (SNV) spectra obtained a total classification accuracy of 0.9317, an average sensitivity of 0.9306 and an average specificity of 0.9971. With SNV or second-order derivative (D2) spectra, OCPLS could detect adulterant doping levels of 2% or more for 5 of the 7 GL species and of 5% or more for the other 2 GL species. This study demonstrates the feasibility of using new chemometrics and NIR spectroscopy for fine classification of GL geographical origins and species as well as for untargeted analysis of multiple adulterants.
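The standard normal variate (SNV) pretreatment used by the best-performing models is simple to state: each spectrum is centred and scaled by its own mean and standard deviation. A minimal sketch follows; the IC-OVO-LS-SVM and OCPLS models themselves are not reproduced.

```python
# Standard normal variate (SNV) preprocessing of NIR spectra; `spectra` is an
# (n_samples, n_wavenumbers) array of absorbance values.
import numpy as np

def snv(spectra):
    X = np.asarray(spectra, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
```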
NASA Astrophysics Data System (ADS)
Lee, Haeil; Lee, Hansang; Park, Minseok; Kim, Junmo
2017-03-01
Lung cancer is the most common cause of cancer-related death. To diagnose lung cancers in early stages, numerous studies and approaches have been developed for cancer screening with computed tomography (CT) imaging. In recent years, convolutional neural networks (CNN) have become one of the most common and reliable techniques in computer aided detection (CADe) and diagnosis (CADx) by achieving state-of-the-art-level performances for various tasks. In this study, we propose a CNN classification system for false positive reduction of initially detected lung nodule candidates. First, image patches of lung nodule candidates are extracted from CT scans to train a CNN classifier. To reflect the volumetric contextual information of lung nodules to 2D image patch, we propose a weighted average image patch (WAIP) generation by averaging multiple slice images of lung nodule candidates. Moreover, to emphasize central slices of lung nodules, slice images are locally weighted according to Gaussian distribution and averaged to generate the 2D WAIP. With these extracted patches, 2D CNN is trained to achieve the classification of WAIPs of lung nodule candidates into positive and negative labels. We used LUNA 2016 public challenge database to validate the performance of our approach for false positive reduction in lung CT nodule classification. Experiments show our approach improves the classification accuracy of lung nodules compared to the baseline 2D CNN with patches from single slice image.
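The weighted average image patch (WAIP) generation can be sketched directly from the description above: the 2-D patches from neighbouring CT slices are averaged with Gaussian weights centred on the nodule's central slice; the width of the Gaussian is an assumption.

```python
# Gaussian-weighted average of per-slice patches around a nodule candidate;
# `slice_patches` is an (n_slices, H, W) stack centred on the nodule slice.
import numpy as np

def make_waip(slice_patches, sigma=1.0):
    patches = np.asarray(slice_patches, dtype=float)
    centre = (patches.shape[0] - 1) / 2.0
    offsets = np.arange(patches.shape[0]) - centre
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    weights /= weights.sum()
    return np.tensordot(weights, patches, axes=1)   # (H, W) weighted average
```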
NASA Technical Reports Server (NTRS)
Biehl, L. L.; Silva, L. F.
1975-01-01
Skylab multispectral scanner data, digitized Skylab color infrared (IR) photography, digitized Skylab black and white multiband photography, and Earth Resources Technology Satellite (ERTS) multispectral scanner data collected within a 24-hr time period over an area in south-central Indiana near Bloomington on June 9 and 10, 1973, were compared in a machine-aided land use analysis of the area. The overall classification performance results, obtained with nine land use classes, were 87% correct classification using the 'best' 4 channels of the Skylab multispectral scanner, 80% for the channels on the Skylab multispectral scanner which are spectrally comparable to the ERTS multispectral scanner, 88% for the ERTS multispectral scanner, 83% for the digitized color IR photography, and 76% for the digitized black and white multiband photography. The results indicate that the Skylab multispectral scanner may yield even higher classification accuracies when a noise-filtered multispectral scanner data set becomes available in the near future.
NASA Astrophysics Data System (ADS)
Shyu, Mei-Ling; Sainani, Varsha
The increasing number of network-security-related incidents has made it necessary for organizations to actively protect their sensitive data with network intrusion detection systems (IDSs). IDSs are expected to analyze a large volume of data while not placing a significant added load on the monitoring systems and networks. This requires good data mining strategies that take less time and give accurate results. In this study, a novel data mining assisted multiagent-based intrusion detection system (DMAS-IDS) is proposed, particularly with the support of multiclass supervised classification. These agents can detect malicious activities and take predefined actions against them, and data mining techniques can help detect them. Our proposed DMAS-IDS shows superior performance compared to central sniffing IDS techniques, and saves network resources compared to other distributed IDSs with mobile agents that activate too many sniffers, causing bottlenecks in the network. This is one of the major motivations for using a distributed model based on a multiagent platform along with a supervised classification technique.
Evaluating the Visualization of What a Deep Neural Network Has Learned.
Samek, Wojciech; Binder, Alexander; Montavon, Gregoire; Lapuschkin, Sebastian; Muller, Klaus-Robert
Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
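A simplified region-perturbation evaluation in the spirit of the method described (an AOPC-style curve) might look like the following, assuming a greyscale image, a same-sized heatmap, and a predict_score callable returning the classifier's score for the originally predicted class; the patch size and number of steps are illustrative choices.

```python
# Perturb image regions in decreasing order of heatmap relevance and record the
# classifier's score after each step; a steeper drop suggests a more faithful heatmap.
import numpy as np

def region_perturbation_curve(image, heatmap, predict_score,
                              patch=8, n_steps=20, rng=None):
    rng = np.random.default_rng(rng)
    img = image.copy()
    H, W = heatmap.shape
    # relevance of each non-overlapping patch = sum of heatmap values inside it
    scores = [((r, c), heatmap[r:r + patch, c:c + patch].sum())
              for r in range(0, H, patch) for c in range(0, W, patch)]
    order = [rc for rc, _ in sorted(scores, key=lambda t: -t[1])]
    curve = [predict_score(img)]
    for r, c in order[:n_steps]:
        region = img[r:r + patch, c:c + patch]
        img[r:r + patch, c:c + patch] = rng.uniform(img.min(), img.max(),
                                                    region.shape)
        curve.append(predict_score(img))
    return np.array(curve)
```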
Stephens, David; Diesing, Markus
2014-01-01
Detailed seabed substrate maps are increasingly in demand for effective planning and management of marine ecosystems and resources. It has become common to use remotely sensed multibeam echosounder data in the form of bathymetry and acoustic backscatter in conjunction with ground-truth sampling data to inform the mapping of seabed substrates. Whilst, until recently, such data sets have typically been classified by expert interpretation, it is now obvious that more objective, faster and repeatable methods of seabed classification are required. This study compares the performances of a range of supervised classification techniques for predicting substrate type from multibeam echosounder data. The study area is located in the North Sea, off the north-east coast of England. A total of 258 ground-truth samples were classified into four substrate classes. Multibeam bathymetry and backscatter data, and a range of secondary features derived from these datasets were used in this study. Six supervised classification techniques were tested: Classification Trees, Support Vector Machines, k-Nearest Neighbour, Neural Networks, Random Forest and Naive Bayes. Each classifier was trained multiple times using different input features, including i) the two primary features of bathymetry and backscatter, ii) a subset of the features chosen by a feature selection process and iii) all of the input features. The predictive performances of the models were validated using a separate test set of ground-truth samples. The statistical significance of model performances relative to a simple baseline model (Nearest Neighbour predictions on bathymetry and backscatter) was tested to assess the benefits of using more sophisticated approaches. The best performing models were tree based methods and Naive Bayes, which achieved accuracies of around 0.8 and kappa coefficients of up to 0.5 on the test set. The models that used all input features did not generally perform well, highlighting the need for some means of feature selection.
Temporally consistent probabilistic detection of new multiple sclerosis lesions in brain MRI.
Elliott, Colm; Arnold, Douglas L; Collins, D Louis; Arbel, Tal
2013-08-01
Detection of new Multiple Sclerosis (MS) lesions on magnetic resonance imaging (MRI) is important as a marker of disease activity and as a potential surrogate for relapses. We propose an approach where sequential scans are jointly segmented, to provide a temporally consistent tissue segmentation while remaining sensitive to newly appearing lesions. The method uses a two-stage classification process: 1) a Bayesian classifier provides a probabilistic brain tissue classification at each voxel of reference and follow-up scans, and 2) a random-forest based lesion-level classification provides a final identification of new lesions. Generative models are learned based on 364 scans from 95 subjects from a multi-center clinical trial. The method is evaluated on sequential brain MRI of 160 subjects from a separate multi-center clinical trial, and is compared to 1) semi-automatically generated ground truth segmentations and 2) fully manual identification of new lesions generated independently by nine expert raters on a subset of 60 subjects. For new lesions greater than 0.15 cc in size, the classifier has near perfect performance (99% sensitivity, 2% false detection rate), as compared to ground truth. The proposed method was also shown to exceed the performance of any one of the nine expert manual identifications.
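The two-stage structure can be illustrated with a toy pipeline: a probabilistic voxel classifier (here a Gaussian naive Bayes stand-in for the Bayesian tissue classifier) produces a lesion-probability map, candidate regions are extracted by connected-component labelling, and a random forest then accepts or rejects each candidate. All data, thresholds, and lesion-level features below are synthetic and illustrative:

```python
# Illustrative two-stage pipeline in the spirit of the method above.
import numpy as np
from scipy import ndimage
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Stage 1: voxel-wise probabilistic classification (toy features = intensities at two timepoints).
X_vox = rng.normal(size=(4000, 2))
y_vox = (X_vox.sum(axis=1) > 1.5).astype(int)               # toy "new lesion" voxels
stage1 = GaussianNB().fit(X_vox, y_vox)
prob = stage1.predict_proba(X_vox)[:, 1].reshape(50, 80)    # toy probability map

# Stage 2: group candidate voxels into lesion candidates and classify each candidate.
mask = prob > 0.5
labels, n = ndimage.label(mask)
feats, keep = [], []
for i in range(1, n + 1):
    voxels = prob[labels == i]
    feats.append([voxels.size, voxels.mean(), voxels.max()])   # simple lesion-level features
    keep.append(int(voxels.size >= 3))                         # toy target: size-based rule
if n > 1:
    stage2 = RandomForestClassifier(random_state=0).fit(feats, keep)
    print("accepted candidate lesions:", int(stage2.predict(feats).sum()))
```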
LBP and SIFT based facial expression recognition
NASA Astrophysics Data System (ADS)
Sumer, Omer; Gunes, Ece O.
2015-02-01
This study compares the performance of local binary patterns (LBP) and the scale-invariant feature transform (SIFT) with support vector machines (SVM) in the automatic classification of discrete facial expressions. Facial expression recognition is a multiclass classification problem in which seven classes (happiness, anger, sadness, disgust, surprise, fear and contempt) are classified. Using SIFT feature vectors and a linear SVM, 93.1% mean accuracy is achieved on the CK+ database. On the other hand, the performance of the LBP-based classifier with a linear SVM is reported on SFEW using the strictly person independent (SPI) protocol. Seven-class mean accuracy on SFEW is 59.76%. Experiments on both databases showed that LBP features can be used in a fairly descriptive way if a good localization of facial points and a suitable partitioning strategy are followed.
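A minimal sketch of the LBP-plus-linear-SVM branch, assuming random stand-ins for aligned face crops; the uniform-LBP setting and the 4x4 block partitioning are one reasonable choice rather than the authors' exact configuration:

```python
# Sketch: block-wise uniform-LBP histograms fed to a linear SVM.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import LinearSVC

def lbp_features(img, grid=4, P=8, R=1):
    """Concatenated per-block histograms of uniform LBP codes."""
    lbp = local_binary_pattern(img, P, R, method="uniform")
    h, w = lbp.shape
    bh, bw = h // grid, w // grid
    hists = []
    for i in range(grid):
        for j in range(grid):
            block = lbp[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            hist, _ = np.histogram(block, bins=P + 2, range=(0, P + 2), density=True)
            hists.append(hist)
    return np.concatenate(hists)

rng = np.random.default_rng(0)
faces = rng.uniform(0, 1, size=(70, 64, 64))     # stand-in for aligned face crops
labels = rng.integers(0, 7, size=70)             # 7 discrete expressions (toy labels)
X = np.array([lbp_features(f) for f in faces])
clf = LinearSVC().fit(X, labels)
print(clf.score(X, labels))
```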
Annotation and Classification of CRISPR-Cas Systems
Makarova, Kira S.; Koonin, Eugene V.
2018-01-01
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods. PMID:25981466
Annotation and Classification of CRISPR-Cas Systems.
Makarova, Kira S; Koonin, Eugene V
2015-01-01
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods.
Bottai, Matteo; Tjärnlund, Anna; Santoni, Giola; Werth, Victoria P; Pilkington, Clarissa; de Visser, Marianne; Alfredsson, Lars; Amato, Anthony A; Barohn, Richard J; Liang, Matthew H; Aggarwal, Rohit; Arnardottir, Snjolaug; Chinoy, Hector; Cooper, Robert G; Danko, Katalin; Dimachkie, Mazen M; Feldman, Brian M; García-De La Torre, Ignacio; Gordon, Patrick; Hayashi, Taichi; Katz, James D; Kohsaka, Hitoshi; Lachenbruch, Peter A; Lang, Bianca A; Li, Yuhui; Oddis, Chester V; Olesinka, Marzena; Reed, Ann M; Rutkowska-Sak, Lidia; Sanner, Helga; Selva-O’Callaghan, Albert; Wook Song, Yeong; Ytterberg, Steven R; Miller, Frederick W; Rider, Lisa G; Lundberg, Ingrid E; Amoruso, Maria
2017-01-01
Objective To describe the methodology used to develop new classification criteria for adult and juvenile idiopathic inflammatory myopathies (IIMs) and their major subgroups. Methods An international, multidisciplinary group of myositis experts produced a set of 93 potentially relevant variables to be tested for inclusion in the criteria. Rheumatology, dermatology, neurology and paediatric clinics worldwide collected data on 976 IIM cases (74% adults, 26% children) and 624 non-IIM comparator cases with mimicking conditions (82% adults, 18% children). The participating clinicians classified each case as IIM or non-IIM. Generally, the classification of any given patient was based on a few variables, leaving the remaining variables unmeasured. We investigated the strength of the association between all variables and between these and the disease status as determined by the physician. We considered three approaches: (1) a probability-score approach, (2) a sum-of-items approach and (3) a classification-tree approach. Results The approaches yielded several candidate models that were scrutinised with respect to statistical performance and clinical relevance. The probability-score approach showed superior statistical performance and clinical practicability and was therefore preferred over the others. We developed a classification tree for subclassification of patients with IIM. A calculator for electronic devices, such as computers and smartphones, facilitates the use of the European League Against Rheumatism/American College of Rheumatology (EULAR/ACR) classification criteria. Conclusions The new EULAR/ACR classification criteria provide a patient's probability of having IIM for use in clinical and research settings. The probability is based on a score obtained by summing the weights associated with a set of criteria items. PMID:29177080
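The probability-score idea can be illustrated as follows: each criterion item present in a patient contributes a weight, the weights are summed into a score, and a logistic transform maps the score to a probability of IIM. The items, weights, and offset below are hypothetical placeholders and are not the published EULAR/ACR values:

```python
# Hedged illustration of a probability-score classification scheme.
# All item names, weights and the offset are HYPOTHETICAL placeholders.
import math

hypothetical_weights = {
    "proximal_muscle_weakness": 0.9,
    "heliotrope_rash": 3.1,
    "gottron_papules": 2.1,
    "elevated_ck": 1.3,
}

def iim_probability(findings, offset=-5.3):
    """Logistic transform of the summed item weights (offset is illustrative)."""
    score = offset + sum(w for item, w in hypothetical_weights.items() if findings.get(item))
    return 1.0 / (1.0 + math.exp(-score))

print(iim_probability({"heliotrope_rash": True, "gottron_papules": True, "elevated_ck": True}))
```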
Lin, Yuan-Pin; Yang, Yi-Hsuan; Jung, Tzyy-Ping
2014-01-01
Electroencephalography (EEG)-based emotion classification during music listening has gained increasing attention due to its promise for potential applications such as musical affective brain-computer interfaces (ABCI), neuromarketing, music therapy, and implicit multimedia tagging and triggering. However, music is an ecologically valid and complex stimulus that conveys certain emotions to listeners through compositions of musical elements, and using EEG signals alone to distinguish emotions remains challenging. This study aimed to assess the applicability of a multimodal approach by leveraging the EEG dynamics and acoustic characteristics of musical contents for the classification of emotional valence and arousal. To this end, this study adopted machine-learning methods to systematically elucidate the roles of the EEG and music modalities in emotion modeling. The empirical results suggested that when whole-head EEG signals were available, the inclusion of musical contents did not improve the classification performance: the performance of 74~76% obtained using the EEG modality alone was statistically comparable to that of the multimodal approach. However, if EEG dynamics were only available from a small set of electrodes (likely the case in real-life applications), the music modality would play a complementary role and augment the EEG results from around 61-67% in valence classification and from around 58-67% in arousal classification. Musical timbre appeared to replace less-discriminative EEG features and led to improvements in both valence and arousal classification, whereas musical loudness contributed specifically to arousal classification. The present study not only provided principles for constructing an EEG-based multimodal approach, but also revealed fundamental insights into the interplay of brain activity and musical contents in emotion modeling.
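A compact sketch of the EEG-only versus EEG-plus-music comparison, using random feature matrices in place of real band-power and acoustic descriptors; the dimensions, labels, and SVM settings are illustrative:

```python
# Sketch: classify valence from EEG features alone and from EEG + acoustic features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials = 120
eeg = rng.normal(size=(n_trials, 30))     # e.g. band-power features from a reduced electrode set (toy)
music = rng.normal(size=(n_trials, 12))   # e.g. loudness/timbre descriptors per excerpt (toy)
valence = rng.integers(0, 2, size=n_trials)

for name, X in [("EEG only", eeg), ("EEG + music", np.hstack([eeg, music]))]:
    acc = cross_val_score(SVC(kernel="rbf"), X, valence, cv=10).mean()
    print(name, round(acc, 3))
```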
Lin, Yuan-Pin; Yang, Yi-Hsuan; Jung, Tzyy-Ping
2014-01-01
Electroencephalography (EEG)-based emotion classification during music listening has gained increasing attention due to its promise for potential applications such as musical affective brain-computer interfaces (ABCI), neuromarketing, music therapy, and implicit multimedia tagging and triggering. However, music is an ecologically valid and complex stimulus that conveys certain emotions to listeners through compositions of musical elements, and using EEG signals alone to distinguish emotions remains challenging. This study aimed to assess the applicability of a multimodal approach by leveraging the EEG dynamics and acoustic characteristics of musical contents for the classification of emotional valence and arousal. To this end, this study adopted machine-learning methods to systematically elucidate the roles of the EEG and music modalities in emotion modeling. The empirical results suggested that when whole-head EEG signals were available, the inclusion of musical contents did not improve the classification performance: the performance of 74~76% obtained using the EEG modality alone was statistically comparable to that of the multimodal approach. However, if EEG dynamics were only available from a small set of electrodes (likely the case in real-life applications), the music modality would play a complementary role and augment the EEG results from around 61–67% in valence classification and from around 58–67% in arousal classification. Musical timbre appeared to replace less-discriminative EEG features and led to improvements in both valence and arousal classification, whereas musical loudness contributed specifically to arousal classification. The present study not only provided principles for constructing an EEG-based multimodal approach, but also revealed fundamental insights into the interplay of brain activity and musical contents in emotion modeling. PMID:24822035
Zu, Chen; Jie, Biao; Liu, Mingxia; Chen, Songcan
2015-01-01
Multimodal classification methods using different modalities of imaging and non-imaging data have recently shown great advantages over traditional single-modality-based ones for diagnosis and prognosis of Alzheimer’s disease (AD), as well as its prodromal stage, i.e., mild cognitive impairment (MCI). However, to the best of our knowledge, most existing methods focus on mining the relationship across multiple modalities of the same subjects, while ignoring the potentially useful relationship across different subjects. Accordingly, in this paper, we propose a novel learning method for multimodal classification of AD/MCI, by fully exploring the relationships across both modalities and subjects. Specifically, our proposed method includes two subsequent components, i.e., label-aligned multi-task feature selection and multimodal classification. In the first step, feature selection from the multiple modalities is treated as a set of learning tasks and a group sparsity regularizer is imposed to jointly select a subset of relevant features. Furthermore, to utilize the discriminative information among labeled subjects, a new label-aligned regularization term is added to the objective function of standard multi-task feature selection, where label alignment means that all multi-modality subjects with the same class labels should be closer in the new feature-reduced space. In the second step, a multi-kernel support vector machine (SVM) is adopted to fuse the selected features from the multi-modality data for final classification. To validate our method, we perform experiments on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using baseline MRI and FDG-PET imaging data. The experimental results demonstrate that our proposed method achieves better classification performance compared with several state-of-the-art methods for multimodal classification of AD/MCI. PMID:26572145
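A deliberately simplified sketch of the two-step pipeline: per-modality sparse feature selection followed by multi-kernel SVM fusion with a precomputed kernel. The group-sparsity and label-aligned regularization terms of the original objective are not reproduced here; plain Lasso selection and a fixed kernel weight serve as stand-ins, and all data are synthetic:

```python
# Simplified stand-in for sparse feature selection + multi-kernel SVM fusion.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 100
mri = rng.normal(size=(n, 90))    # e.g. regional gray-matter volumes (toy)
pet = rng.normal(size=(n, 90))    # e.g. regional FDG-PET uptake (toy)
y = rng.integers(0, 2, size=n)    # patient vs. control (toy labels)

def select(X, y, alpha=0.05):
    """Keep features with nonzero Lasso coefficients (stand-in for group-sparse selection)."""
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    keep = np.flatnonzero(coef)
    return X[:, keep] if keep.size else X

mri_s, pet_s = select(mri, y), select(pet, y)
# Multi-kernel fusion: weighted sum of per-modality RBF kernels fed to a precomputed-kernel SVM.
beta = 0.5
K = beta * rbf_kernel(mri_s) + (1 - beta) * rbf_kernel(pet_s)
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.score(K, y))
```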
An Integrating Framework for Interdisciplinary Military Analyses
2017-04-01
Keywords: Effectiveness, System Performance, Task Prosecution, War Gaming. Called plays can be compared to collective tasks, with each player responsible for executing one or more.
Comparative validity of MMPI-2 and MCMI-II personality disorder classifications.
Wise, E A
1996-06-01
Minnesota Multiphasic Personality Inventory-2 (MMPI-2) overlapping and nonoverlapping scales were demonstrated to perform comparably to their original MMPI forms. They were then evaluated for convergent and discriminant validity against the Millon Clinical Multiaxial Inventory-II (MCMI-II) personality disorder scales. The MMPI-2 and MCMI-II personality disorder scales demonstrated convergent and discriminant coefficients similar to their original forms. However, the MMPI-2 personality scales classified significantly more of the sample as Dramatic, whereas the MCMI-II diagnosed more of the sample as Anxious. Furthermore, single-scale and 2-point code type classification rates were quite low, indicating that at the level of the individual, the personality disorder scales are not measuring comparable constructs. Hence, each instrument provides both similar and unique information, justifying their continued use together for the purpose of diagnosing personality disorders.
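The convergent/discriminant analysis can be illustrated numerically: correlations between corresponding MMPI-2 and MCMI-II scales serve as convergent coefficients, and correlations with non-corresponding scales as discriminant coefficients. The scores below are random placeholders, not clinical data:

```python
# Sketch of convergent/discriminant coefficient computation on two sets of scale scores.
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_scales = 150, 6
mmpi2 = rng.normal(size=(n_subjects, n_scales))
mcmi2 = 0.5 * mmpi2 + rng.normal(size=(n_subjects, n_scales))   # induce some agreement

corr = np.corrcoef(mmpi2.T, mcmi2.T)[:n_scales, n_scales:]      # MMPI-2 rows x MCMI-II columns
convergent = np.diag(corr)                                      # same-construct scale pairs
discriminant = corr[~np.eye(n_scales, dtype=bool)]              # different-construct pairs
print(convergent.mean(), np.abs(discriminant).mean())
```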
Sobol-Shikler, Tal; Robinson, Peter
2010-07-01
We present a classification algorithm for inferring affective states (emotions, mental states, attitudes, and the like) from their nonverbal expressions in speech. It is based on the observations that affective states can occur simultaneously and different sets of vocal features, such as intonation and speech rate, distinguish between nonverbal expressions of different affective states. The input to the inference system was a large set of vocal features and metrics that were extracted from each utterance. The classification algorithm conducted independent pairwise comparisons between nine affective-state groups. The classifier used various subsets of metrics of the vocal features and various classification algorithms for different pairs of affective-state groups. Average classification accuracy of the 36 pairwise machines was 75 percent, using 10-fold cross validation. The comparison results were consolidated into a single ranked list of the nine affective-state groups. This list was the output of the system and represented the inferred combination of co-occurring affective states for the analyzed utterance. The inference accuracy of the combined machine was 83 percent. The system automatically characterized over 500 affective state concepts from the Mind Reading database. The inference of co-occurring affective states was validated by comparing the inferred combinations to the lexical definitions of the labels of the analyzed sentences. The distinguishing capabilities of the system were comparable to human performance.
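A rough sketch of the pairwise scheme: one binary classifier per pair of the nine affective-state groups (36 pairs), with per-pair predictions consolidated into a ranked list for each utterance. The vocal-feature matrix, labels, and SVM settings are random, illustrative stand-ins for the paper's feature extraction and per-pair model selection:

```python
# Sketch: pairwise classifiers over nine groups, consolidated into a ranked list by vote counting.
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_utts, n_feats, n_groups = 270, 20, 9
X = rng.normal(size=(n_utts, n_feats))      # stand-in for vocal-feature metrics per utterance
y = rng.integers(0, n_groups, size=n_utts)  # toy affective-state group labels

# Train one binary classifier per pair of groups.
pair_models = {}
for a, b in combinations(range(n_groups), 2):
    idx = np.isin(y, [a, b])
    pair_models[(a, b)] = SVC().fit(X[idx], y[idx])

def ranked_groups(x):
    """Count pairwise wins for a single utterance and return groups ranked by votes."""
    votes = np.zeros(n_groups)
    for pair, model in pair_models.items():
        votes[int(model.predict(x.reshape(1, -1))[0])] += 1
    return np.argsort(votes)[::-1]

print(ranked_groups(X[0])[:3])   # top-3 inferred co-occurring affective-state groups
```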
NASA Astrophysics Data System (ADS)
Dementev, A. O.; Dmitriev, E. V.; Kozoderov, V. V.; Egorov, V. D.
2017-10-01
Hyperspectral imaging is a state-of-the-art, promising technology widely applied for accurate thematic mapping. The presence of a large number of narrow survey channels makes it possible to exploit subtle differences in the spectral characteristics of objects and to produce a more detailed classification than is possible with standard multispectral data. The difficulties encountered in the processing of hyperspectral images are usually associated with the redundancy of spectral information, which leads to the curse of dimensionality. Methods currently used for recognizing objects in multispectral and hyperspectral images are usually based on standard supervised base-classification algorithms of various complexity. The accuracy of these algorithms can differ significantly depending on the classification task considered. In this paper we study the performance of ensemble classification methods for the classification of forest vegetation. Error-correcting output codes and boosting are tested on artificial data and on real hyperspectral images. It is demonstrated that boosting gives a more significant improvement when used with simple base classifiers; the accuracy in this case is comparable to that of an error-correcting output code (ECOC) classifier with a Gaussian-kernel SVM base algorithm, so the benefit of boosting an ECOC classifier built on Gaussian-kernel SVMs is questionable. It is also demonstrated that the selected ensemble classifiers allow forest species to be recognized with accuracy high enough to be compared against ground-based forest inventory data.
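A hedged sketch comparing the two ensemble strategies discussed above on synthetic "hyperspectral" pixels: error-correcting output codes wrapped around an RBF-kernel SVM versus boosting over simple base classifiers (decision stumps). Band count, class count, and parameters are illustrative:

```python
# Sketch: ECOC with an RBF-kernel SVM base vs. AdaBoost over decision stumps, on toy spectra.
import numpy as np
from sklearn.multiclass import OutputCodeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 100))                 # 100 spectral bands per pixel (toy)
y = rng.integers(0, 5, size=300)                # 5 forest-species classes (toy labels)

ecoc_svm = OutputCodeClassifier(SVC(kernel="rbf"), code_size=2.0, random_state=0)
boosted_stumps = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                    n_estimators=100, random_state=0)

for name, clf in [("ECOC + RBF SVM", ecoc_svm), ("AdaBoost + stumps", boosted_stumps)]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```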